majorwiki/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md
majorlinux 7dc591d257 wiki: add ssh.socket Tailscale race condition troubleshooting article
Documents the systemd socket activation race where ssh.socket binds
to the Tailscale IP before tailscaled is ready, causing SSH to become
unreachable after a Tailscale reconnect. Includes diagnosis steps and
the After=/BindsTo= fix.
2026-05-19 19:35:16 -04:00

ssh.socket Unreachable After Reboot (Tailscale Race Condition)

Symptom

SSH to a host via its Tailscale IP times out. tailscale ping works and tailscale status shows active; direct, but SSH connections to port 22 never complete. If the root password is unset, the Hetzner console offers no way in either.

Cause

Ubuntu 24.04 uses systemd socket activation for SSH (ssh.socket instead of a persistent ssh.service). When a drop-in override binds the socket to a Tailscale IP, the socket can start before tailscaled.service is ready. The bind may still succeed at first (the Tailscale state file caches the IP), but a later Tailscale reconnect or interface reset silently invalidates the bound address. Nothing rebinds the socket afterwards, so SSH stays unreachable with no recovery path short of console access or a reboot.
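
The ordering problem described above can be checked from systemd's own activation timestamps. The following is a minimal sketch, not part of the original fix: started_before is a hypothetical helper, and the usage lines assume both units were activated during the current boot (a unit that never became active reports 0).

```shell
#!/bin/sh
# started_before A_USEC B_USEC
# Succeeds if timestamp A is strictly earlier than timestamp B.
# Both arguments are monotonic timestamps in microseconds.
started_before() {
    [ "$1" -lt "$2" ]
}

# Usage on the affected host:
#   sock=$(systemctl show -p ActiveEnterTimestampMonotonic --value ssh.socket)
#   ts=$(systemctl show -p ActiveEnterTimestampMonotonic --value tailscaled.service)
#   started_before "$sock" "$ts" && echo "ssh.socket became active before tailscaled.service"
```

If the message prints, the socket was brought up before tailscaled, which matches the race described here.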

Diagnosis

# From another host:
tailscale ping <IP>          # succeeds — host is up
ssh root@<IP>                # times out — sshd not listening

# After gaining console access or rebooting
# (-b -1 reads the previous boot; use -b 0 if the host was not rebooted):
systemctl status ssh.socket  # check Listen: address
journalctl -b -1 -u ssh     # likely empty: sshd never spawned
journalctl -b -1 -u ssh.socket  # socket started before tailscaled
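
Once on the host, one way to confirm a stale bind is to compare the node's current Tailscale IPv4 address with what is actually listening on port 22. This is a minimal sketch: listens_on is a hypothetical helper, and the usage line assumes tailscale and ss (from iproute2) are installed.

```shell
#!/bin/sh
# listens_on IP SS_OUTPUT
# Succeeds if SS_OUTPUT (as printed by `ss -H -tln`) contains a listener
# whose local address is exactly IP:22. Column 4 of that output is the
# local address and port.
listens_on() {
    printf '%s\n' "$2" | awk -v want="$1:22" '
        $4 == want { found = 1 }
        END { exit !found }
    '
}

# Usage on the affected host:
#   listens_on "$(tailscale ip -4)" "$(ss -H -tln)" \
#       && echo "port 22 is bound to the current Tailscale IP" \
#       || echo "no listener on the current Tailscale IP"
```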

Fix

Add a dependency on tailscaled.service to the socket override:

# /etc/systemd/system/ssh.socket.d/override.conf
[Unit]
After=tailscaled.service
BindsTo=tailscaled.service

[Socket]
ListenStream=
ListenStream=<TAILSCALE_IP>:22

Then reload and restart:

systemctl daemon-reload
systemctl restart ssh.socket
systemctl status ssh.socket   # verify Listen: shows correct IP
  • After= orders ssh.socket after tailscaled.service, so the socket is not started until tailscaled has started
  • BindsTo= stops the socket whenever tailscaled stops and restarts it along with tailscaled, preventing stale binds
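
To confirm that systemd actually merged the drop-in, the effective dependencies can be read back with systemctl show. The snippet below is a sketch under that assumption; check_deps is a hypothetical helper that only inspects the text it is given.

```shell
#!/bin/sh
# check_deps
# Reads `systemctl show` output (KEY=value lines) on stdin and succeeds only
# if both After= and BindsTo= list tailscaled.service.
check_deps() {
    awk -F= '
        ($1 == "After" || $1 == "BindsTo") &&
            $2 ~ /(^| )tailscaled\.service( |$)/ { seen[$1] = 1 }
        END { exit !(seen["After"] && seen["BindsTo"]) }
    '
}

# Usage on the affected host:
#   systemctl show ssh.socket -p After -p BindsTo | check_deps \
#       && echo "tailscaled dependencies are in effect"
```

A failure here usually means the override file is in the wrong directory or systemctl daemon-reload has not been run since it was edited.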

Prevention

  • Set root passwords on all Hetzner hosts for emergency console access
  • Ansible playbook configure_tailscale_ssh_only.yml includes both directives as of commit 7ef182b

Affected Hosts

Ubuntu hosts using configure_tailscale_ssh_only.yml: majorlinux, dcaprod-hetzner. Fedora hosts (majordiscord) restrict SSH with firewall rules instead and are not affected.
