# macOS VPN · per-app routing

`05 · macos-vpn · Personal tool`

Per-app macOS routing tool with fail-shut pf gate, launchd boot persistence and live traffic monitoring.

**Scope:** Solo · ongoing  
**Role:** macOS systems engineering · CLI tooling

---

## Context

> The kill-switch isn't a watchdog. It's the absence of a route.

This is developer tooling that depends on a stable per-app egress IP: the protected app must always exit through the same proxy node, while every other process on the machine stays direct. Off-the-shelf VPN clients tunnel everything; CIDR- or DNS-based split-tunneling cannot help when the same destination needs the proxy for one process and direct for another. No popular consumer GUI offers per-process granularity in this combination.

For the protected lane, fail-open after a tunnel drop is louder than the drop itself: a single second of direct egress is enough to get flagged on the receiving side. A watchdog kill-switch leaves a race window between sing-box exit and the watchdog reacting; fail-open in that window is the failure mode the protected lane cannot tolerate.

## Facts

| | |
|---|---|
| **Scope** | Solo · daily driver |
| **Surfaces** | macOS CLI · self-installing pf anchor · launchd boot persistence |
| **Routing** | sing-box TUN + process_name dispatch · 3 protected app groups |
| **Protocols** | VLESS+Reality (TCP stealth) · Shadowsocks (TCP+UDP) — chosen per session |
| **Monitoring** | Clash API @ 127.0.0.1:9090 · per-app speed · 3-tier sparklines (4 min · 1 hour · session) |
| **Status** | Personal tool · daily driver |

## Architecture

### Per-app routing through TUN

```text
   Protected processes                    every other process
   (3 app groups · process_name match)        │
        │                                     │
        ▼                                     ▼
   sing-box  TUN-mode  (utun99 · 172.19.0.1/30)
        │  intercepts ALL system traffic
        │
        │  route.rules:
        │    if process_name in VPN_PROCESSES  →  outbound: proxy
        │    else                              →  outbound: direct
        ▼
   ┌─── proxy outbound ──── VPN provider ──── internet
   │
   └─── direct outbound ──── en0 ──── internet
```

**Dispatch by process, not destination.** Routing rules in sing-box (route.rules) match on process_name, not destination IP or domain. Two "network spaces" share one host: protected apps always exit through proxy, everything else exits direct — even when both reach the same destination.

**TUN intercepts everything.** sing-box runs in TUN-mode (auto_route: true) — every packet from every process passes through utun99. Without this, dispatch by process_name is impossible: kernel route tables work on destinations, not process owners.
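A minimal sketch of how the dispatch described above could be generated in Python. The outbound tags `proxy` and `direct` and the `process_name` matcher come from the text; the function name `build_route` and the example process names are hypothetical, not taken from the tool's source.

```python
import json

# Hypothetical process names; the real list lives in config.py.
VPN_PROCESSES = ["ExampleApp", "example-helper"]


def build_route(vpn_processes: list) -> dict:
    """Return a sing-box `route` section that dispatches by process owner."""
    return {
        "rules": [
            # Protected processes always leave through the proxy outbound.
            {"process_name": list(vpn_processes), "outbound": "proxy"},
        ],
        # Traffic that matched no rule exits direct.
        "final": "direct",
    }


if __name__ == "__main__":
    print(json.dumps(build_route(VPN_PROCESSES), indent=2))
```

The destination never appears in the rule, which is the point: two processes reaching the same host can still land on different outbounds.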

**DNS split inside sing-box.** Protected processes get DoH via proxy-dns (1.1.1.1 over HTTPS through the proxy); everything else hits direct-dns (1.1.1.1 plain UDP). A DNS-hijack route rule intercepts system DNS queries before they leave the box. Without per-process DNS, VPN-host resolution would leak via the local resolver.
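The DNS split can be sketched the same way. The resolver tags `proxy-dns` and `direct-dns` and the `hijack-dns` action are from the text; the server entries below assume sing-box's legacy `address` / `detour` server format (newer releases use typed server entries), and `build_dns` is a hypothetical helper name.

```python
def build_dns(vpn_processes: list) -> dict:
    """Return a sing-box `dns` section with a per-process resolver split."""
    return {
        "servers": [
            # DoH to 1.1.1.1, carried through the proxy outbound.
            {
                "tag": "proxy-dns",
                "address": "https://1.1.1.1/dns-query",
                "detour": "proxy",
            },
            # Plain UDP to 1.1.1.1 on the direct path.
            {"tag": "direct-dns", "address": "udp://1.1.1.1"},
        ],
        "rules": [
            # Lookups made by protected processes stay inside the proxy path.
            {"process_name": list(vpn_processes), "server": "proxy-dns"},
        ],
        # Every other lookup falls through to the direct resolver.
        "final": "direct-dns",
    }


# Route rule that pulls system DNS queries into sing-box's own resolution.
DNS_HIJACK_RULE = {"protocol": "dns", "action": "hijack-dns"}
```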

### Negative-space pf gate · state matrix

```text
  state             en0 traffic owner    fires                outcome
  ─────             ─────────────────    ─────                ───────
  VPN running       sing-box (root)      pass quick user 0    OK · proxy + direct work
  VPN crashed       user-app (uid≠0)     block drop default   no internet for user
  boot, no mvpn     user-app (uid≠0)     block drop default   no internet for user
  mvpn  (root)      mvpn (root)          pass quick user 0    subscription / pings work

  rules in /etc/pf.conf via anchor "singbox-killswitch":

      pass quick on lo0 all
      pass quick on utun99 all
      pass quick on { en0 en4 en5 en6 } from any to any user 0
      pass quick on { en0 en4 en5 en6 } proto { tcp udp } to any port 53          # DNS
      pass quick on { en0 en4 en5 en6 } proto udp from any port 68 to any port 67 # DHCP
      pass quick on { en0 en4 en5 en6 } proto udp to 224.0.0.251 port 5353        # mDNS
      block drop on { en0 en4 en5 en6 } proto { tcp udp } all                     # default-deny
```

**Kill-switch is an absence, not an action.** No watchdog process. No "monitor sing-box, then call pfctl block" loop. The block rule is loaded once; it fires whenever a non-root socket tries to write to a physical interface. When sing-box dies, user-apps fail shut by default, so nothing has to react.

**Discovery-level explicit allows.** lo0 + TUN + DNS + DHCP + mDNS pass-through is required for first connect. Without DNS the very first sudo mvpn could not resolve the VPN host; without DHCP a fresh boot could not lease an IP. The block applies only to non-root TCP/UDP on the physical interfaces (en0, en4, en5, en6); discovery primitives stay open.

**Multi-interface match.** Rules apply to en0 + en4 + en5 + en6 — Wi-Fi plus three USB-Ethernet adapter slots. Plugging in a tethered phone or a USB-C hub does not bypass the gate.
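Because the same rules repeat for every interface, the anchor file lends itself to being rendered from an interface list. The sketch below is an illustration, not the tool's actual generator: `render_killswitch_rules` is a hypothetical name, and the DHCP rule is written in full pf syntax (`to any port 67`).

```python
def render_killswitch_rules(interfaces: list, tun: str = "utun99") -> str:
    """Render the fail-shut pf anchor for the given physical interfaces."""
    if not interfaces:
        raise ValueError("at least one physical interface is required")
    ifs = "{ " + " ".join(interfaces) + " }"
    rules = [
        "pass quick on lo0 all",
        f"pass quick on {tun} all",
        # Root-owned sockets (sing-box, mvpn itself) may use the hardware.
        f"pass quick on {ifs} from any to any user 0",
        # Discovery primitives needed before the tunnel exists.
        f"pass quick on {ifs} proto {{ tcp udp }} to any port 53",
        f"pass quick on {ifs} proto udp from any port 68 to any port 67",
        f"pass quick on {ifs} proto udp to 224.0.0.251 port 5353",
        # Default-deny for everything else on the physical interfaces.
        f"block drop on {ifs} proto {{ tcp udp }} all",
    ]
    return "\n".join(rules) + "\n"


if __name__ == "__main__":
    print(render_killswitch_rules(["en0", "en4", "en5", "en6"]), end="")
```

Adding another adapter slot then means changing one list rather than seven rules.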

### Protocol picker · kill-switch active

![Protocol selection screen — VLESS+Reality (TCP-only stealth) vs Outline / Shadowsocks (TCP+UDP). Default: Outline.](https://ilyadev.xyz/private/macos-vpn-start.webp)

*Every start: stealth or full transport*

![Banner shown when sing-box stops — "INTERNET BLOCKED — kill switch active" with recovery commands listed below.](https://ilyadev.xyz/private/macos-vpn-killswitch.webp)

*Fail-shut state surfaced explicitly*

**Default: Shadowsocks.** Outline (Shadowsocks) handles full TCP+UDP, so gaming, voice, and video calls all work. VLESS+Reality is the stealth pick when full transport is not required (see decision 03 below for the trade-off). Choice is per session, never automatic.

**Recovery is three commands.** sudo mvpn re-connects (re-fetches subscription, re-selects server). sudo mvpn kill-apps force-closes the protected processes (SIGTERM → SIGKILL) before opening the gate. sudo mvpn disable removes the pf anchor — internet returns direct, but if a protected app is still running its traffic now exits direct.

**Disable confirms before opening.** sudo mvpn disable checks for live protected processes before tearing down the anchor. If any are running it asks "Kill? [y/N]" — refusing means the user must close them manually before the gate opens. Without the prompt, disable would silently put protected apps onto the direct path.
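The confirmation logic can be sketched as two pure functions: one that finds live protected processes in `ps -eo comm` output, one that decides what disable may do. Both function names and the returned labels are hypothetical; the actual kill step (SIGTERM, then SIGKILL) and the pf teardown are left out.

```python
def running_protected(ps_output: str, protected: set) -> list:
    """Return protected process names present in `ps -eo comm` output."""
    found = set()
    for line in ps_output.splitlines()[1:]:  # first line is the COMM header
        # macOS prints the full executable path; keep only the last component.
        name = line.strip().rsplit("/", 1)[-1]
        if name in protected:
            found.add(name)
    return sorted(found)


def decide_disable(live: list, answer: str) -> str:
    """Map the 'Kill? [y/N]' answer to an action; the default is to refuse."""
    if not live:
        return "open"
    if answer.strip().lower() in {"y", "yes"}:
        return "kill-then-open"
    return "refuse"
```

An empty answer maps to refusal, matching the capital N in the prompt: the gate stays shut unless the user explicitly agrees.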

### Boot lifecycle

```text
   macOS boot
        │
        ▼
   launchd  (RunAtLoad: true)
        │  com.mvpn.killswitch.plist  →  /Library/LaunchDaemons/
        │  ProgramArguments:
        │    pfctl -a singbox-killswitch -f killswitch.pf.conf
        ▼
   pf anchor "singbox-killswitch" loaded
        │  rules active · no internet for user-apps
        ▼
   user runs  sudo mvpn
        │  fetch subscription  (root → pass quick)
        │  parallel TCP ping → pick best server
        │  generate config.json  →  sing-box check  →  start
        ▼
   sing-box up  (PIDFILE written, TUN online)
        │  process_name dispatch live
        │  Clash API on :9090 ready
        ▼
   live-status loop  (urllib → /connections → render)
```
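The last step of the diagram, the live-status loop, could look roughly like this. It is a sketch under assumptions: the payload shape (a `connections` list whose items carry `upload`, `download` and `metadata.processPath`) is what Clash-compatible APIs commonly return and is not confirmed by this write-up, and all function names are hypothetical.

```python
import json
import os
from collections import defaultdict
from urllib.request import urlopen

CLASH_URL = "http://127.0.0.1:9090/connections"


def fetch_snapshot(url: str = CLASH_URL, timeout: float = 2.0) -> dict:
    """Poll the Clash API once and return the parsed JSON payload."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def totals_by_process(snapshot: dict) -> dict:
    """Sum (upload, download) byte counters per process name."""
    totals = defaultdict(lambda: [0, 0])
    for conn in snapshot.get("connections") or []:
        # "processPath" is an assumed field name; fall back when it is absent.
        path = (conn.get("metadata") or {}).get("processPath") or "unknown"
        name = os.path.basename(path)
        totals[name][0] += conn.get("upload", 0)
        totals[name][1] += conn.get("download", 0)
    return {name: (up, down) for name, (up, down) in totals.items()}


def download_speeds(prev: dict, curr: dict, seconds: float) -> dict:
    """Bytes per second per process between two polls, clamped at zero."""
    return {
        name: max(0, down - prev.get(name, (0, 0))[1]) / seconds
        for name, (_, down) in curr.items()
    }
```

The clamp matters because closed connections drop out of the snapshot, so a per-process total can shrink between polls.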

**enable vs mvpn — orthogonal commands.** sudo mvpn enable runs once per machine: copies com.mvpn.killswitch.plist into /Library/LaunchDaemons/, registers the launchd daemon, installs the pf anchor in /etc/pf.conf. sudo mvpn runs every session: subscription → ping → start. After enable, every reboot blocks internet until sudo mvpn is run.
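The "ping" step in the per-session flow is a parallel TCP probe. A minimal sketch, assuming servers are `(host, port)` pairs and that "best" means lowest connect latency; `tcp_latency` and `pick_best` are hypothetical names, and the probe is injectable so the selection logic can be exercised without a network.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def tcp_latency(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return TCP connect time in seconds, or None if the host is unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def pick_best(servers: list, probe: Callable = tcp_latency):
    """Probe all servers concurrently and return the fastest reachable one."""
    if not servers:
        return None
    with ThreadPoolExecutor(max_workers=min(32, len(servers))) as pool:
        latencies = list(pool.map(lambda srv: probe(*srv), servers))
    reachable = [(lat, srv) for lat, srv in zip(latencies, servers) if lat is not None]
    return min(reachable)[1] if reachable else None
```

With the kill-switch active these probes only succeed because mvpn runs as root, which the `user 0` pass rule lets through.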

**Self-installing pf anchor.** pf.py:_ensure_anchor_in_main() reads /etc/pf.conf, appends anchor "singbox-killswitch" if absent, runs pfctl -f /etc/pf.conf to reload. Idempotent — re-running enable is a no-op if the line is already there. Removed by disable via _remove_anchor_from_main().
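The idempotency is easiest to see when the edit is expressed on the file's text rather than on the file. The sketch below is not the code in pf.py; `ensure_anchor` and `remove_anchor` are hypothetical stand-ins that return the new text plus a flag saying whether a pfctl reload is needed.

```python
ANCHOR_LINE = 'anchor "singbox-killswitch"'


def ensure_anchor(conf_text: str):
    """Append the anchor line if absent. Returns (new_text, changed)."""
    lines = conf_text.splitlines()
    if any(line.strip() == ANCHOR_LINE for line in lines):
        return conf_text, False  # already installed: nothing to reload
    lines.append(ANCHOR_LINE)
    return "\n".join(lines) + "\n", True


def remove_anchor(conf_text: str):
    """Drop the anchor line if present. Returns (new_text, changed)."""
    lines = conf_text.splitlines()
    kept = [line for line in lines if line.strip() != ANCHOR_LINE]
    if len(kept) == len(lines):
        return conf_text, False
    return "\n".join(kept) + "\n", True
```

A caller would write the file and run `pfctl -f /etc/pf.conf` only when the flag is true, which is what makes a repeated enable a no-op.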

**Boot-time pf log.** launchd writes pfctl stdout/stderr to pf-launch.log — first thing to check if the anchor failed to load on boot (corrupted config, syntax error after a manual edit).
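For orientation, a launchd job with these properties can be generated with Python's standard `plistlib`. This is a sketch, not the shipped plist: the `Label` value, the `/sbin/pfctl` path, and the `state_dir` parameter are assumptions, while the anchor name, `RunAtLoad`, and the two file names come from the text.

```python
import plistlib
from pathlib import Path


def build_killswitch_plist(state_dir: Path) -> bytes:
    """Return an XML launchd plist that loads the pf anchor at boot."""
    job = {
        "Label": "com.mvpn.killswitch",  # assumed to mirror the plist file name
        "ProgramArguments": [
            "/sbin/pfctl",
            "-a", "singbox-killswitch",
            "-f", str(state_dir / "killswitch.pf.conf"),
        ],
        "RunAtLoad": True,
        # pfctl output lands here; first place to look after a failed boot load.
        "StandardOutPath": str(state_dir / "pf-launch.log"),
        "StandardErrorPath": str(state_dir / "pf-launch.log"),
    }
    return plistlib.dumps(job)


if __name__ == "__main__":
    # Hypothetical per-user directory, for illustration only.
    print(build_killswitch_plist(Path("/Users/example/.mvpn")).decode())
```

Building the plist from a directory argument instead of a fixed string is one way to avoid a hardcoded absolute path.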

## Key engineering decisions

### 01 · Negative-space pf gate as the kill-switch

**Decision.** pf rules: pass quick on en0/en4-6 user 0 (root traffic, including sing-box) + block drop on en0/en4-6 default for everything else, with explicit pass-throughs for lo0, utun99, DNS, DHCP, and mDNS. Kill-switch behaviour is the side-effect of the default-deny rule — there is no watchdog, no liveness check, no reactive block step.

**Why.** A reactive watchdog kill-switch ("monitor sing-box; if dead, call pfctl block") leaves a race window: between sing-box exit and the watchdog reacting, user-apps drop straight onto en0. For a protected lane where any leak is louder than the drop itself, the gate has to work as a property of the system, not as an action triggered by an event. Default-deny + explicit allows compresses safety into the pf ruleset itself: nothing has to "work right" for the block to apply. The absence of a route is the block.

**Cost.** pf rules require root and live in /etc/pf.conf via a self-install path (_ensure_anchor_in_main()); debugging "why no internet" goes through pfctl -s rules, not through application logs. No per-app fail-open — the gate is binary across the protected set. Discovery primitives (DNS, DHCP, mDNS) need explicit pass-throughs or first-connect breaks; the pf rule list is no longer trivially short.

### 02 · process_name dispatch in TUN mode (not CIDR / DNS split-tunnel)

**Decision.** sing-box runs in TUN mode (auto_route: true) — every packet from every process flows through utun99. Two route rules dispatch: process_name in VPN_PROCESSES → outbound: proxy; otherwise final: direct. Both rules see the same destination set; the only differentiator is the process owner.

**Why.** The use case is per-app static IP, not "traffic to host X via VPN". CIDR or DNS-based split-tunneling breaks when the same domain needs the proxy from process A and direct from process B — both rules match, only one wins, the other process either leaks or fails. process_name dispatch is the only mechanism that keeps two distinct egress paths for two distinct local processes hitting the same destination.

**Cost.** TUN intercepts everything, so if sing-box hangs, system networking goes with it. Dispatch is per-packet (overhead is small but measurable on heavy traffic). VPN_PROCESSES is a manual list in config.py: adding an app means running ps -eo comm | grep <name> and editing config.py by hand. No GUI, no auto-discovery yet.

### 03 · Two protocols at session start (VLESS+Reality vs Shadowsocks)

**Decision.** Every sudo mvpn prompts: VLESS+Reality (TCP-only, looks like HTTPS to DPI) or Shadowsocks (TCP+UDP, simpler obfuscation). Default is Shadowsocks. Choice persists for the session; switching protocols means stop + restart.

**Why.** Stealth and UDP are mutually exclusive in this stack. VLESS+Reality with flow=xtls-rprx-vision is TCP-only; Reality cannot proxy UDP at all. In VLESS mode the route rule therefore sends UDP from protected processes to outbound: direct (visible at singbox.py:69-78). For a use case that needs UDP through the proxy (gaming, voice), that is a real-IP leak; for a TCP-only use case, VLESS-mode stealth is worth the UDP-direct fallback. Shadowsocks proxies both. One protocol cannot serve both shapes; auto-switching would hide a security-relevant decision behind heuristics.

**Cost.** Two outbound branches in singbox.py — _make_vless_outbound and _make_shadowsocks_outbound. UX cost: an extra prompt on every session start. The user must know what they picked — mvpn status surfaces the active protocol but the picker itself is the only checkpoint where the choice happens.
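The difference between the two modes comes down to one extra route rule. The sketch below is an illustration of that shape, not the code at singbox.py:69-78; `build_route_rules` and the protocol labels `"vless"` / `"shadowsocks"` are hypothetical.

```python
def build_route_rules(protocol: str, vpn_processes: list) -> list:
    """Return ordered route rules for protected processes in the chosen mode."""
    if protocol not in {"vless", "shadowsocks"}:
        raise ValueError(f"unknown protocol: {protocol!r}")
    procs = list(vpn_processes)
    rules = []
    if protocol == "vless":
        # The TCP-only outbound cannot carry UDP, so protected UDP exits
        # direct. This is the real-IP trade-off the picker makes explicit.
        rules.append({"process_name": procs, "network": "udp", "outbound": "direct"})
    # Rules match in order, so the UDP exception has to come first.
    rules.append({"process_name": procs, "outbound": "proxy"})
    return rules
```

Unknown protocol names raise instead of defaulting, in keeping with the rule that the choice is never made automatically.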

### 04 · launchd-mounted pf anchor with self-install

**Decision.** sudo mvpn enable (once per machine) installs com.mvpn.killswitch.plist into /Library/LaunchDaemons/ and the pf anchor into /etc/pf.conf. After enable, every reboot reloads the pf rules before any user-space app comes up. sudo mvpn is a separate per-session command — fetch subscription, pick server, start sing-box.

**Why.** Post-reboot is the danger window — auto-launch apps can reach networking before the user types sudo mvpn. A run-on-demand kill-switch (load rules at start, drop them at stop) leaves protected apps fail-open until the user remembers to start the VPN. Mounting the anchor at boot inverts the default — internet is blocked by default; turning it on is the conscious step. Splitting enable from mvpn keeps the per-session command short and the per-machine setup explicit.

**Cost.** enable writes to /etc/pf.conf and /Library/LaunchDaemons/. Both require sudo, and the install has to be documented as a known surprise ("no internet after reboot until sudo mvpn"). The plist hardcodes an absolute path to killswitch.pf.conf, so it is not portable to other accounts without templating.

### 05 · Per-process DNS split inside sing-box

**Decision.** sing-box config defines two DNS resolvers — proxy-dns (DoH, 1.1.1.1 over HTTPS, routed through the proxy) and direct-dns (1.1.1.1 over plain UDP, direct). A DNS rule routes queries from VPN_PROCESSES through proxy-dns; everything else falls to direct-dns as the final. A separate route rule (protocol: dns, action: hijack-dns) intercepts every DNS request the system tries to make and redirects it to sing-box internal resolution.

**Why.** A single resolver — even fast and reliable — leaks the resolution path. If protected processes resolve VPN-host names through the local system DNS (router, ISP, public 1.1.1.1 plain UDP), the receiving side sees the lookup come from the host's real IP before any proxy connection happens. DoH-through-the-proxy keeps both the resolution and the connection inside the same egress path — the proxy is the only network surface that sees protected-process activity.

**Cost.** sing-box DNS hijacking can fight with apps that pin their own DoH endpoints (browsers with ECH enabled, for example) — the hijack action redirects the query but the in-app DoH client may not respect the redirect. DoH through the proxy adds round-trip latency on every cold lookup compared to plain UDP. Two resolvers in the config double the surface that has to stay healthy.

## Stack

| | |
|---|---|
| **Runtime** | Python 3.12+ · macOS pf · launchd · sing-box (TUN mode) |
| **Protocols** | VLESS+Reality (TCP stealth) · Shadowsocks (TCP+UDP, Outline format) |
| **Dispatch** | sing-box route.rules · process_name per-app match · DNS hijack |
| **Monitoring** | Clash API @ 127.0.0.1:9090 · /connections poll · custom rendering |
| **Persistence** | launchd plist · self-installing pf anchor in /etc/pf.conf |
| **Scale** | ~2k LOC Python · 9 modules · 32-line pf ruleset · 22-line launchd plist |

## Lessons & status

### Carry forward

- Negative-space gate as a design pattern — security expressed as the absence of a route, not the presence of a watchdog. Default-deny + explicit allows beats reactive watch-and-block on every metric that matters: race window, complexity, surfaces that have to "work right".
- Three independent ring-buffers for sparklines (4 min · 1 hour · session, at 1pt per 3s / 45s / 3600s) — each tier answers a distinct question. One buffer with downsampling either loses granularity at short ranges or compression at long ones; three buffers, each tuned for its own time scale, get the best read at every zoom.
- enable separated from session-start command — boot persistence is an orthogonal property, not a side-effect of the connect command. One install, every session stays trivial.
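The three-buffer idea from the second bullet can be sketched with `collections.deque`. The per-point intervals (3 s, 45 s, 3600 s) are from the text; the capacity of 80 points is an assumption (4 min / 3 s and 1 hour / 45 s both give 80), and averaging the samples within each interval is one possible aggregation, not necessarily the tool's.

```python
from collections import deque

# Seconds per point, per tier (from the text). Capacity is an assumption.
TIER_INTERVALS = {"4min": 3, "1hour": 45, "session": 3600}
POINTS = 80


class Tier:
    """Fixed-size ring buffer that stores one averaged point per interval."""

    def __init__(self, seconds_per_point: float, capacity: int = POINTS):
        self.seconds_per_point = seconds_per_point
        self.points = deque(maxlen=capacity)  # oldest point drops off first
        self._sum = 0.0
        self._count = 0
        self._bucket_start = None

    def add(self, t: float, value: float) -> None:
        if self._bucket_start is None:
            self._bucket_start = t
        # Close the current bucket once its interval has elapsed.
        if t - self._bucket_start >= self.seconds_per_point:
            self.points.append(self._sum / self._count)
            self._sum, self._count, self._bucket_start = 0.0, 0, t
        self._sum += value
        self._count += 1


def make_tiers() -> dict:
    return {name: Tier(sec) for name, sec in TIER_INTERVALS.items()}


def feed(tiers: dict, t: float, value: float) -> None:
    """Push one speed sample into every tier independently."""
    for tier in tiers.values():
        tier.add(t, value)
```

Each tier sees every raw sample, so none of them depends on another tier's downsampling.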

### Would change

- Fetch-once-at-start was the wrong assumption about subscription stability — the server URL is itself part of what the provider rotates, not a constant for the session. Health-check on stalled downloads with re-fetch is the shape I would build first now: subscription-watch, not subscription-fetch.
- APP_GROUPS is a manual dict in config.py. Adding an app means running ps -eo comm | grep <name> and hand-editing the file. Cheap upgrade: an interactive mvpn add-app that lists running processes, lets the user mark which to route through VPN, and rewrites the dict.
- launchd plist hardcodes an absolute path to killswitch.pf.conf. Works for one user. The right shape is enable substituting $HOME into a plist template before copying — costs a few lines, makes the tool portable to any account.

Personal tool · daily driver. Scope is intentionally limited; auto-recovery for subscription rotation and multi-host failover remains a possible next slice. Code walkthrough on request.

---

Source: https://ilyadev.xyz/cases/macos-vpn (HTML) · /cases/macos-vpn.md (this file)
Author: Ilya Kazantsev — https://ilyadev.xyz/index.md
