From 8d4dee5da3816bc35fd92ced807fcc2041e65710 Mon Sep 17 00:00:00 2001
From: MajorLinux
Date: Sun, 7 Jun 2026 05:56:25 -0400
Subject: [PATCH] troubleshooting: correct ssh tailscale-race article (Fedora ListenAddress variant + playbook cycle landmine)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Fedora hosts are NOT automatically immune: a leftover manual `ListenAddress ` drop-in reintroduces the sshd boot bind-race even under firewalld (hit on majordiscord 2026-06-07; fix = remove it).
- The Ubuntu playbook kept shipping the cycle-causing [Unit] gate on ssh.socket despite the 2026-06-04 resolution; re-running it re-armed the ordering cycle (clobbered majorlinux; majortoot-hetzner found armed). Corrected in MajorAnsible e0d35aa. Fleet ssh-lockdown state is inconsistent (dcaprod/tttpod lack wait-ready; teelia no override) — needs a per-host audit.
---
 .../networking/ssh-socket-tailscale-race-condition.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md b/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md
index f5eb371..c2cbd9a 100644
--- a/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md
+++ b/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md
@@ -81,7 +81,13 @@ ss -tlnp | grep :22 # verify bound to Tailscale IP

 ### Affected Hosts

-Ubuntu hosts using `configure_tailscale_ssh_only.yml`: majorlinux, dcaprod-hetzner, tttpod-hetzner, majortoot-hetzner. Fedora hosts (majordiscord) use firewall rules for SSH restriction — not affected by this race.
+Ubuntu hosts using `configure_tailscale_ssh_only.yml`: majorlinux, dcaprod-hetzner, tttpod-hetzner, majortoot-hetzner.
+
+> [!danger] The Ubuntu playbook shipped the cycle pattern until 2026-06-07
+> Despite the 2026-06-04 resolution above, `configure_tailscale_ssh_only.yml` in the repo kept deploying the `[Unit] Requires=tailscale-wait-ready.service` gate on **ssh.socket** (the cycle-causer) and never added the ssh.service gate — so re-running it *re-armed* the ordering cycle. Caught 2026-06-07: it clobbered majorlinux's hand-fix, and **majortoot-hetzner was found already armed** with the latent cycle (would have lost SSH on its next reboot). Both restored/defused; playbook corrected in MajorAnsible `e0d35aa` (gate on ssh.service, dependency-free socket). ⚠️ dcaprod-hetzner / tttpod-hetzner lack `tailscale-wait-ready.service` and teelia has no socket override — the Ubuntu SSH-lockdown state is **inconsistent across the fleet and needs a deliberate per-host audit**.
+
+> [!warning] Fedora hosts are NOT automatically immune (corrected 2026-06-07)
+> The firewalld method (`configure_tailscale_ssh_only_fedora.yml`) binds sshd on `0.0.0.0:22` and enforces Tailscale-only via the firewall, so it has no dependency on the Tailscale address — **unless** a host also carries a leftover manual `ListenAddress ` drop-in (`/etc/ssh/sshd_config.d/tailscale-only.conf`) from the pre-firewall lockdown. Then sshd.service hits the same boot bind-race (`Bind to port 22 on failed: Cannot assign requested address`) and flaps every reboot. Hit on **majordiscord 2026-06-07**; fixed by removing the redundant drop-in (firewall stays the enforcing layer). The Fedora playbook now removes it automatically (MajorAnsible `b4a9090`).

 ---