diff --git a/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md b/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md index f5eb371..c2cbd9a 100644 --- a/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md +++ b/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md @@ -81,7 +81,13 @@ ss -tlnp | grep :22 # verify bound to Tailscale IP ### Affected Hosts -Ubuntu hosts using `configure_tailscale_ssh_only.yml`: majorlinux, dcaprod-hetzner, tttpod-hetzner, majortoot-hetzner. Fedora hosts (majordiscord) use firewall rules for SSH restriction — not affected by this race. +Ubuntu hosts using `configure_tailscale_ssh_only.yml`: majorlinux, dcaprod-hetzner, tttpod-hetzner, majortoot-hetzner. + +> [!danger] The Ubuntu playbook shipped the cycle pattern until 2026-06-07 +> Despite the 2026-06-04 resolution above, `configure_tailscale_ssh_only.yml` in the repo kept deploying the `[Unit] Requires=tailscale-wait-ready.service` gate on **ssh.socket** (the cycle-causer) and never added the ssh.service gate — so re-running it *re-armed* the ordering cycle. Caught 2026-06-07: it clobbered majorlinux's hand-fix, and **majortoot-hetzner was found already armed** with the latent cycle (would have lost SSH on its next reboot). Both restored/defused; playbook corrected in MajorAnsible `e0d35aa` (gate on ssh.service, dependency-free socket). ⚠️ dcaprod-hetzner / tttpod-hetzner lack `tailscale-wait-ready.service` and teelia has no socket override — the Ubuntu SSH-lockdown state is **inconsistent across the fleet and needs a deliberate per-host audit**. + +> [!warning] Fedora hosts are NOT automatically immune (corrected 2026-06-07) +> The firewalld method (`configure_tailscale_ssh_only_fedora.yml`) binds sshd on `0.0.0.0:22` and enforces Tailscale-only via the firewall, so it has no dependency on the Tailscale address — **unless** a host also carries a leftover manual `ListenAddress ` drop-in (`/etc/ssh/sshd_config.d/tailscale-only.conf`) from the pre-firewall lockdown. Then sshd.service hits the same boot bind-race (`Bind to port 22 on failed: Cannot assign requested address`) and flaps every reboot. Hit on **majordiscord 2026-06-07**; fixed by removing the redundant drop-in (firewall stays the enforcing layer). The Fedora playbook now removes it automatically (MajorAnsible `b4a9090`). ---