Commit graph

6 commits

Author SHA1 Message Date
06a794316b docs: point Ansible references at the new roles (clamav/ssh_hardening/tailscale)
Operational/how-to references updated to the role entry playbooks after the
ADR-0001 migration. Historical incident narrative (dated callouts, commit
refs) preserved.

- clamav-fleet-deployment: override + re-run -> clamav.yml; role note
- ssh-hardening-ansible-fleet: note this is now the ssh_hardening role
- vps-migration-baseline-checklist: table -> clamav.yml / ssh_hardening.yml
- ssh-socket-tailscale-race-condition: Affected Hosts + Prevention + References
  -> tailscale role tasks (network_wait/ssh_only_ubuntu/ssh_only_fedora)
- freshclam-logwatch-false-no-updates: codify refs -> clamav role
2026-06-11 11:33:42 -04:00
64ac418a36 wiki: add ClamAV daemonless mode section + HEVC VAAPI article link 2026-05-15 09:02:24 -04:00
a852f7b7bd ClamAV fleet caveat: add follow-up on the polite-CPU-on-1vCPU edge case
Same-day correction. The proposed per-droplet relaxed alert (>95%/30m)
turned out to also trip on a 1 vCPU box during low-traffic weekly scans,
because there's literally no real load for nice 19 to yield to —
clamscan opportunistically fills the vCPU and DO sees 100% utilization
regardless of `%nice` vs `%user` split. Documents the three realistic
options (accept page / switch to clamdscan / disable alert) and the
underlying limit (no DO threshold can distinguish polite from impolite
CPU when the box is fully utilized).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 02:32:35 -04:00
af14e36caf ClamAV fleet article: add DigitalOcean monitoring caveat for 1vCPU droplets
DO's hypervisor-level CPU metric doesn't know about nice/ionice — a
"polite" weekly clamscan on a 1 vCPU droplet still reads 100% utilization
and trips a default >85%/5m alert. Adds a new section explaining the
trade-off and providing the DO API recipe (PUT existing alert with
explicit entities, POST a new relaxed alert scoped to the small
droplet) plus when not to bother (2+ vCPU boxes won't trip).

Triggered by the 2026-05-10 teelia incident where the weekly cron fired
the fleet-wide CPU alert despite the cron script already wrapping
clamscan in nice 19 + ionice idle + cgroup memory limits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 02:24:17 -04:00
4126656c05 wiki: update fail2ban digest + netdata docker health + 3 new articles
- fail2ban-digest-mode-fleet: recidive-only email model, sshd now silent,
  defaults-debian.conf gotcha added
- netdata-docker-health-alarm-tuning: 30m/10m config, tuning history table
- New: wp-fail2ban-logpath-debian-ubuntu, lora-adapter-gguf-conversion-fails,
  tailscale-status-json-hostname-localhost-ios
- Various article updates and nav index refreshes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-02 14:58:07 -04:00
b40e484aae Add 5 wiki articles from 2026-04-17/18 work
- ghost-smtp-mailgun-setup: two-system email config (newsletter API + transactional SMTP)
- firewalld-fleet-hardening: Fedora fleet firewall audit-and-harden pattern with Ansible
- clamav-fleet-deployment: fleet deployment with nice/ionice throttling + quarantine
- ansible-check-mode-false-positives: when: not ansible_check_mode guard for verify/assert tasks
- ghost-emailanalytics-lag-warning: submitted status, lag counter, fetchMissing skip explained
2026-04-18 11:13:39 -04:00