majorwiki

Author	SHA1	Message	Date
Marcus Summers	06a794316b	docs: point Ansible references at the new roles (clamav/ssh_hardening/tailscale) Operational/how-to references updated to the role entry playbooks after the ADR-0001 migration. Historical incident narrative (dated callouts, commit refs) preserved. - clamav-fleet-deployment: override + re-run -> clamav.yml; role note - ssh-hardening-ansible-fleet: note this is now the ssh_hardening role - vps-migration-baseline-checklist: table -> clamav.yml / ssh_hardening.yml - ssh-socket-tailscale-race-condition: Affected Hosts + Prevention + References -> tailscale role tasks (network_wait/ssh_only_ubuntu/ssh_only_fedora) - freshclam-logwatch-false-no-updates: codify refs -> clamav role	2026-06-11 11:33:42 -04:00
MajorLinux	c3045e33dd	troubleshooting: ssh-race article — fleet audited & reconciled 2026-06-07 dcaprod-hetzner + tttpod-hetzner were missing tailscale-wait-ready.service (inert ssh.service gate -> latent bind race); corrected playbook applied to both. teelia uses Tailscale SSH (no sshd, immune). All Ubuntu hosts now on the dependency-free-socket + ssh.service-gate pattern.	2026-06-07 06:22:35 -04:00
MajorLinux	8d4dee5da3	troubleshooting: correct ssh tailscale-race article (Fedora ListenAddress variant + playbook cycle landmine) - Fedora hosts are NOT automatically immune: a leftover manual `ListenAddress <tailscale-ip>` drop-in reintroduces the sshd boot bind-race even under firewalld (hit on majordiscord 2026-06-07; fix = remove it). - The Ubuntu playbook kept shipping the cycle-causing [Unit] gate on ssh.socket despite the 2026-06-04 resolution; re-running it re-armed the ordering cycle (clobbered majorlinux; majortoot-hetzner found armed). Corrected in MajorAnsible e0d35aa. Fleet ssh-lockdown state is inconsistent (dcaprod/tttpod lack wait-ready; teelia no override) — needs a per-host audit.	2026-06-07 05:56:25 -04:00
MajorLinux	155651c373	wiki: ssh.socket wait-ready gate + mastodon post-install hardening Two related additions covering the 2026-05-31 cutover-night incidents on majorlinux and majortoot-hetzner. ssh-socket-tailscale-race-condition.md (update Race 1 fix): - After=tailscaled.service Requires=tailscaled.service orders against the service becoming active, not against tailscale0 having an IPv4 — hosts kept losing SSH intermittently after reboots (incident: majorlinux + majortoot-hetzner 2026-05-31, during cutover-night Ansible reboot). - Canonical fix: a oneshot tailscale-wait-ready.service that polls `ip -4 -o addr show tailscale0` until an address is present, with ssh.socket After=/Requires= that service. Document the full evolution (2026-05-19 BindsTo → 2026-05-23 Requires → 2026-05-31 wait-ready) so future readers don't try the half-fixes thinking they're sufficient. - Add majortoot-hetzner to affected hosts. mastodon-post-install-hardening.md (new): Four upstream-install gaps that bit during the majortoot-hetzner cutover: 1. /home/mastodon at 0750 (useradd default) → nginx www-data can't traverse → every static asset 403s → unstyled "purple screen" in the browser while API/HTML still work through the puma proxy. 2. .env.production at 0644 (mastodon-setup default) → DB_PASS, SECRET_KEY_BASE, OTP_SECRET world-readable once gap (1) is fixed. 3. mastodon user shell at /usr/sbin/nologin → `su - mastodon` blocked. 4. rbenv init in .bashrc only → login shells don't source .bashrc; even when chained, Ubuntu's .bashrc returns early for non-interactive shells. Fix: .bash_profile sets up rbenv BEFORE sourcing .profile + .bashrc, so it works for both interactive and non-interactive logins. All four codified in MajorAnsible configure_mastodon_permissions.yml with self-asserting verification steps. 02-selfhosting/index.md + SUMMARY.md: Add a "Services" section to the selfhosting index linking the mastodon-post-install-hardening article (and the other orphaned services/ entries while there). SUMMARY.md gains one new entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-31 11:08:24 -04:00
majorlinux	3b8c8b0597	ssh.socket wiki: correct BindsTo→Requires, add warning BindsTo=tailscaled.service causes a systemd ordering cycle that prevents ssh.socket from starting on reboot. Updated the recommended fix to use Requires= and added a warning admonition explaining why BindsTo must not be used. Added tttpod-hetzner to affected hosts list and linked the 2026-05-23 dcaprod incident.	2026-05-23 02:40:04 -04:00
majorlinux	65b0aa4567	wiki: expand Tailscale race condition article with network-online race Added Race 2: tailscaled starts before network-online.target, causing Tailscale to get stuck with SetNetworkUp(false). Covers both Ubuntu ssh.socket and cross-platform tailscaled ordering issues. Updated references to include majordiscord incident and new Ansible playbook.	2026-05-19 20:39:18 -04:00
majorlinux	7dc591d257	wiki: add ssh.socket Tailscale race condition troubleshooting article Documents the systemd socket activation race where ssh.socket binds to the Tailscale IP before tailscaled is ready, causing SSH to become unreachable after a Tailscale reconnect. Includes diagnosis steps and the After=/BindsTo= fix.	2026-05-19 19:35:16 -04:00

7 commits