From 06a794316b19d63fecb6c3821579febb88c1ae85 Mon Sep 17 00:00:00 2001
From: Marcus Summers
Date: Thu, 11 Jun 2026 11:33:42 -0400
Subject: [PATCH] docs: point Ansible references at the new roles (clamav/ssh_hardening/tailscale)

Operational/how-to references updated to the role entry playbooks after the ADR-0001 migration. Historical incident narrative (dated callouts, commit refs) preserved.

- clamav-fleet-deployment: override + re-run -> clamav.yml; role note
- ssh-hardening-ansible-fleet: note this is now the ssh_hardening role
- vps-migration-baseline-checklist: table -> clamav.yml / ssh_hardening.yml
- ssh-socket-tailscale-race-condition: Affected Hosts + Prevention + References -> tailscale role tasks (network_wait/ssh_only_ubuntu/ssh_only_fedora)
- freshclam-logwatch-false-no-updates: codify refs -> clamav role
---
 .../cloud/vps-migration-baseline-checklist.md | 4 ++--
 02-selfhosting/security/clamav-fleet-deployment.md | 10 +++++++---
 .../security/ssh-hardening-ansible-fleet.md | 3 +++
 .../networking/ssh-socket-tailscale-race-condition.md | 11 ++++++-----
 .../security/freshclam-logwatch-false-no-updates.md | 4 ++--
 5 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/02-selfhosting/cloud/vps-migration-baseline-checklist.md b/02-selfhosting/cloud/vps-migration-baseline-checklist.md
index 3062d3e..d1ac251 100644
--- a/02-selfhosting/cloud/vps-migration-baseline-checklist.md
+++ b/02-selfhosting/cloud/vps-migration-baseline-checklist.md
@@ -57,8 +57,8 @@ Every server in the fleet should have these. Check each one after migration:
 | Firewall | `firewalld` | `ufw` | `configure_firewall_*.yml` | Verify fail2ban banaction matches |
 | Cron | `cronie` | `cron` | — (usually pre-installed) | Required by logwatch |
 | Auto-updates | `dnf-automatic` | `unattended-upgrades` | `ansible-unattended-upgrades-fleet` | Security patches only |
-| Antivirus | `clamav` | `clamav` | `configure_clamav.yml` | Internet-facing hosts only |
-| SSH hardening | `openssh-server` | `openssh-server` | `configure_ssh_hardening.yml` | Key-only, no root password |
+| Antivirus | `clamav` | `clamav` | `clamav.yml` (clamav role) | Internet-facing hosts only |
+| SSH hardening | `openssh-server` | `openssh-server` | `ssh_hardening.yml` (ssh_hardening role) | Key-only, no root password |
 | Timezone | — | — | — | US servers: `America/New_York`; UK: `Europe/London`. Hetzner defaults to UTC. |
 | CA bundle (Fedora) | `ca-certificates` | `ca-certificates` | — | Verify `/etc/pki/tls/certs/ca-bundle.crt` symlink exists — see [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md) |
 | Syslog (Fedora) | `rsyslog` | — (pre-installed) | — | Fedora 44 Hetzner images have journald only. Logwatch needs `/var/log/messages` + `/var/log/secure`. |
diff --git a/02-selfhosting/security/clamav-fleet-deployment.md b/02-selfhosting/security/clamav-fleet-deployment.md
index f4a2888..553d783 100644
--- a/02-selfhosting/security/clamav-fleet-deployment.md
+++ b/02-selfhosting/security/clamav-fleet-deployment.md
@@ -31,6 +31,10 @@ ClamAV is the standard open-source antivirus for Linux servers. For internet-fac
 
 ## Ansible Playbook
 
+> On the MajorsHouse fleet this is packaged as the **`clamav` role** (`roles/clamav/`,
+> tasks split install → service → scan → verify) and run via `clamav.yml` or `site.yml`.
+> The standalone playbook below is the illustrative equivalent.
+
 ```yaml
 - name: Deploy ClamAV to internet-facing hosts
   hosts: internet_facing # dca, majorlinux, teelia, tttpod, majortoot, majormail
@@ -240,16 +244,16 @@ On hosts with ≤2 GB RAM, running `clamd` continuously is often counterproducti
 
 **The fix: `clamav_use_daemon: false` host_var**
 
-`configure_clamav.yml` supports a per-host override. Add to the host's `host_vars//vars.yml`:
+The `clamav` role supports a per-host override. Add to the host's `host_vars//vars.yml`:
 
 ```yaml
 clamav_use_daemon: false
 ```
 
-Then re-run the playbook:
+Then re-run the role:
 
 ```bash
-ansible-playbook configure_clamav.yml --limit
+ansible-playbook clamav.yml --limit
 ```
 
 This will:
diff --git a/02-selfhosting/security/ssh-hardening-ansible-fleet.md b/02-selfhosting/security/ssh-hardening-ansible-fleet.md
index 2f32a71..a3f2801 100644
--- a/02-selfhosting/security/ssh-hardening-ansible-fleet.md
+++ b/02-selfhosting/security/ssh-hardening-ansible-fleet.md
@@ -31,6 +31,9 @@ Rather than editing `/etc/ssh/sshd_config` directly (which may be managed by the
 
 ## Ansible Playbook
 
+> On the MajorsHouse fleet this is packaged as the **`ssh_hardening` role** (`roles/ssh_hardening/`)
+> and run via `ssh_hardening.yml` or `site.yml`. The standalone playbook below is the illustrative equivalent.
+
 ```yaml
 - name: Harden SSH daemon fleet-wide
   hosts: all:!raspbian
diff --git a/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md b/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md
index 72ba1bd..5d16555 100644
--- a/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md
+++ b/05-troubleshooting/networking/ssh-socket-tailscale-race-condition.md
@@ -81,7 +81,7 @@ ss -tlnp | grep :22 # verify bound to Tailscale IP
 
 ### Affected Hosts
 
-Ubuntu hosts using `configure_tailscale_ssh_only.yml`: majorlinux, dcaprod-hetzner, tttpod-hetzner, majortoot-hetzner.
+Ubuntu hosts locked via the `tailscale` role (`ssh_only_ubuntu` task, formerly `configure_tailscale_ssh_only.yml`): majorlinux, dcaprod-hetzner, tttpod-hetzner, majortoot-hetzner.
 
 > [!danger] The Ubuntu playbook shipped the cycle pattern until 2026-06-07
 > Despite the 2026-06-04 resolution above, `configure_tailscale_ssh_only.yml` in the repo kept deploying the `[Unit] Requires=tailscale-wait-ready.service` gate on **ssh.socket** (the cycle-causer) and never added the ssh.service gate — so re-running it *re-armed* the ordering cycle. Caught 2026-06-07: it clobbered majorlinux's hand-fix, and **majortoot-hetzner was found already armed** with the latent cycle (would have lost SSH on its next reboot). Both restored/defused; playbook corrected in MajorAnsible `e0d35aa` (gate on ssh.service, dependency-free socket).
@@ -141,9 +141,10 @@ All hosts where Tailscale is the primary access path. Particularly impactful on
 ## Prevention
 
 - Set root passwords on all VPS hosts for emergency console access
-- Ansible playbooks deploy both fixes automatically:
-  - `configure_tailscale_network_wait.yml` — tailscaled network-online dependency (all hosts)
-  - `configure_tailscale_ssh_only.yml` — ssh.socket Tailscale dependency (Ubuntu only)
+- The `tailscale` role deploys all fixes automatically (run via `tailscale.yml` / `site.yml`):
+  - `network_wait` task — tailscaled network-online dependency (all hosts)
+  - `ssh_only_ubuntu` task — dependency-free ssh.socket bind + ssh.service readiness gate + `tailscale-wait-ready.service` (Ubuntu group)
+  - `ssh_only_fedora` task — firewalld Tailscale-only lockdown; removes any leftover `ListenAddress` drop-in (Fedora group)
 
 ## References
 
@@ -153,4 +154,4 @@ All hosts where Tailscale is the primary access path. Particularly impactful on
 - [[dcaprod#2026-05-23 — SSH unreachable again: BindsTo ordering cycle in ssh.socket override]]
 - [[majorlinux#2026-05-31 — ssh.socket race recurrence post-reboot (Requires= insufficient; added wait-ready gate)]]
 - [[majortoot#2026-05-31 — ssh.socket race post-reboot on majortoot-hetzner (during cutover night)]]
-- Ansible: `configure_tailscale_ssh_only.yml`, `configure_tailscale_network_wait.yml`
+- Ansible: the `tailscale` role (`tailscale.yml`) — `network_wait` + `ssh_only_ubuntu`/`ssh_only_fedora` tasks, consolidated from the former `configure_tailscale_*` playbooks (MajorAnsible `656302e`)
diff --git a/05-troubleshooting/security/freshclam-logwatch-false-no-updates.md b/05-troubleshooting/security/freshclam-logwatch-false-no-updates.md
index 2d3629f..85dba94 100644
--- a/05-troubleshooting/security/freshclam-logwatch-false-no-updates.md
+++ b/05-troubleshooting/security/freshclam-logwatch-false-no-updates.md
@@ -47,7 +47,7 @@ No service restart needed; logwatch picks it up on its next daily run. (The vari
 
 ## Codify (Ansible)
 
-Deploy the drop-in wherever freshclam runs in daemon mode. On the fleet it's a task in `configure_clamav.yml` (group `clamav`), right after freshclam is enabled — MajorAnsible commit `cb27c93`:
+Deploy the drop-in wherever freshclam runs in daemon mode. On the fleet it's a task in the `clamav` role (`roles/clamav/tasks/install.yml`, group `clamav`), right after freshclam is enabled — originally added in MajorAnsible commit `cb27c93`:
 
 ```yaml
 - name: Suppress logwatch clam-update false "no updates" alert (daemon-mode freshclam)
@@ -67,7 +67,7 @@ Deploy the drop-in wherever freshclam runs in daemon mode. On the fleet it's a t
 
 ## Proactive monitoring (don't rely on logwatch for "is it updating?")
 
-Since logwatch's heuristic is suppressed, a **direct daily watchdog** is what actually catches a dead freshclam. `configure_clamav.yml` deploys `/etc/cron.daily/clamav-freshness` (MajorAnsible `9d1a1a9`) to every `clamav`-group host: it emails the admin (via `sendmail`) if `clamav-freshclam` is inactive **or** `daily.cld` is older than `clamav_staleness_threshold_days` (default 3) — and stays silent otherwise. Test without emailing:
+Since logwatch's heuristic is suppressed, a **direct daily watchdog** is what actually catches a dead freshclam. The `clamav` role deploys `/etc/cron.daily/clamav-freshness` (originally MajorAnsible `9d1a1a9`) to every `clamav`-group host: it emails the admin (via `sendmail`) if `clamav-freshclam` is inactive **or** `daily.cld` is older than `clamav_staleness_threshold_days` (default 3) — and stays silent otherwise. Test without emailing:
 
 ```bash
 CLAMAV_STALE_DAYS=0 /etc/cron.daily/clamav-freshness # forces the stale branch