Fix Linux WireGuard full-tunnel reconnect: clear stale wgsent0 interface #34

Open
Sentinel-Bluebuilder wants to merge 1 commit into master from fix/linux-wireguard-stale-interface

@Sentinel-Bluebuilder

Owner

Consumer-app context

Found during the x402 AI-agent dVPN end-to-end test (2026-06-22). An agent using the JS `connect({ protocol: 'wireguard' })` full-tunnel path on Linux hit:

wg-quick: 'wgsent0' already exists

on reconnect after a previous run had crashed or been killed. `wg-quick down wgsent0` did not remove the interface, so every subsequent connect stayed wedged until a manual teardown. The tester's workaround was to pass `splitIPs: ["0.0.0.0/0", "::/0"]`. This PR fixes the underlying defect, so that workaround is no longer required just to reconnect.

Root cause

The Linux install pre-down ran only `wg-quick down <name>`. `wg-quick` resolves a bare name against `/etc/wireguard/<name>.conf`, but the SDK writes the WireGuard config to a temp dir (`os.tmpdir()/sentinel-wg`). As a result, `wg-quick down wgsent0` found no matching config and did nothing, and the stale kernel interface from the crashed run survived, blocking the next `wg-quick up`.
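To make the mismatch concrete, here is a minimal sketch of the resolution rule described above. It is a simplified model written for this description, not wg-quick's actual code; the function name `resolveWgQuickConfig` and the exact interface-name pattern are assumptions.

```javascript
const path = require('node:path');

// Assumed pattern for what counts as a bare interface name (1-15 chars).
const IFACE_NAME = /^[a-zA-Z0-9_=+.-]{1,15}$/;

// Simplified model: a bare name maps to /etc/wireguard/<name>.conf,
// anything else is treated as a path to a config file and used as-is.
function resolveWgQuickConfig(arg) {
  if (IFACE_NAME.test(arg)) {
    return path.posix.join('/etc/wireguard', `${arg}.conf`);
  }
  return arg;
}

// The bare name points at /etc/wireguard, where the SDK never wrote a config.
console.log(resolveWgQuickConfig('wgsent0'));
// The explicit path points at the temp-dir config the SDK did write.
console.log(resolveWgQuickConfig('/tmp/sentinel-wg/wgsent0.conf'));
```

Under this model the bare name and the temp-dir path resolve to different files, which is why the old pre-down never reached the config that created the interface.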

Fix (wireguard.js)

New `teardownLinuxInterface(name, confPath)` removes a leftover interface in three escalating, best-effort steps (none of them throw):

  1. `wg-quick down <confPath>`: covers the temp-dir conf case (the actual bug)
  2. `wg-quick down <name>`: covers /etc/wireguard installs
  3. `ip link delete dev <name>`: last resort for a wedged kernel interface that wg-quick can no longer map back to a config (e.g. the temp conf was already deleted)

It is called from both the install pre-down and `emergencyCleanupSync()`, so exit-handler cleanup also clears a wedged interface. The hardcoded /tmp/... path in `emergencyCleanupSync()` is replaced with `os.tmpdir()` to match where configs are actually written.
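For readers without the diff open, the three-step teardown might look roughly like the sketch below. This is an illustration under stated assumptions, not the code in wireguard.js: the `run` parameter, the `tryRun` and `tempConfPath` helpers, the timeout value, and the `<name>.conf` file name are all hypothetical.

```javascript
const { execFileSync } = require('node:child_process');
const os = require('node:os');
const path = require('node:path');

// Assumed config location: <os.tmpdir()>/sentinel-wg/<name>.conf.
// The directory comes from the PR text; the file name is an assumption.
function tempConfPath(name) {
  return path.join(os.tmpdir(), 'sentinel-wg', `${name}.conf`);
}

// Run one command and report whether it succeeded; failures are swallowed
// so that cleanup never throws.
function tryRun(run, cmd, args) {
  try {
    run(cmd, args, { stdio: 'ignore', timeout: 5000 });
    return true;
  } catch {
    return false;
  }
}

// `run` is a hypothetical injectable runner so the sketch can be exercised
// without root or a real WireGuard install.
function teardownLinuxInterface(name, confPath, run = execFileSync) {
  const steps = [];
  if (confPath) steps.push(['wg-quick', ['down', confPath]]); // temp-dir conf
  steps.push(['wg-quick', ['down', name]]);                   // /etc/wireguard
  steps.push(['ip', ['link', 'delete', 'dev', name]]);        // raw interface
  // Every step is attempted; once the interface is gone, later steps fail
  // harmlessly and are reported as false.
  return steps.map(([cmd, args]) => tryRun(run, cmd, args));
}
```

A caller would invoke it as `teardownLinuxInterface('wgsent0', tempConfPath('wgsent0'))`. Running all three steps unconditionally is one possible design; stopping after the first success would be equally valid.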

Doc (same diff — per SDK PR workflow)

ai-path/FAILURES.md gains TUNNEL entry T17 (failure / what happened / root cause / fix / prevention rule). An AI reading only the docs after this merge knows the temp-dir-conf teardown requirement.

Verification

  • node --check wireguard.js passes.
  • Behavior is additive cleanup on a path that previously no-op'd; Windows path unchanged.
  • Live verification of a Linux full-tunnel reconnect after a crash is recommended before tagging a release (not on Fedora, where SELinux blocks the interface regardless).
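For the live check, one way to confirm whether a stale interface is still present is to look for it under /sys/class/net, where Linux lists network interfaces. The helper below is a hypothetical sketch for manual verification, not part of this PR; the `sysNetDir` parameter exists only so it can be pointed at a test directory.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Returns true when an interface with this name has an entry under
// /sys/class/net (Linux only).
function interfaceExists(name, sysNetDir = '/sys/class/net') {
  return fs.existsSync(path.join(sysNetDir, name));
}
```

After killing a connected run and reconnecting, `interfaceExists('wgsent0')` should be false just before the new `wg-quick up` if the teardown worked.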

Consumer-app context (x402 AI-agent dVPN end-to-end test, 2026-06-22):
an agent connecting via the JS WireGuard full-tunnel path on Linux hit
`wg-quick: 'wgsent0' already exists` on reconnect after a crashed run.
`wg-quick down wgsent0` did not remove it, so reconnect was wedged until a
manual teardown. The tester's workaround was to pass
splitIPs: ['0.0.0.0/0','::/0']; this fixes the underlying defect so the
workaround is no longer required for reconnect.

Root cause: the Linux install pre-down ran only `wg-quick down <name>`.
wg-quick resolves a bare name against /etc/wireguard/<name>.conf, but the
SDK writes the WireGuard config to a TEMP dir (os.tmpdir()/sentinel-wg).
So the down found no matching config, did nothing, and the stale kernel
interface from the crashed run survived — blocking the next `wg-quick up`.

Fix: new teardownLinuxInterface(name, confPath) tears a leftover interface
down in three escalating steps —
  1. `wg-quick down <confPath>`  (covers the temp-dir conf case)
  2. `wg-quick down <name>`       (covers /etc/wireguard installs)
  3. `ip link delete dev <name>`  (last resort: raw kernel interface that
     wg-quick can no longer map back to a config)
Each step is best-effort and never throws. Used by both the install
pre-down and emergencyCleanupSync() so exit-handler cleanup also clears a
wedged interface. emergencyCleanupSync's hardcoded /tmp path is replaced
with os.tmpdir() to match where configs are actually written.

Doc (same diff): ai-path/FAILURES.md gains TUNNEL entry T17 documenting the
failure, root cause, fix, and prevention rule.