DO's hypervisor-level CPU metric doesn't know about nice/ionice — a
"polite" weekly clamscan on a 1 vCPU droplet still reads 100% utilization
and trips a default >85%/5m alert. Adds a new section explaining the
trade-off and providing the DO API recipe (PUT existing alert with
explicit entities, POST a new relaxed alert scoped to the small
droplet) plus when not to bother (2+ vCPU boxes won't trip).
Triggered by the 2026-05-10 teelia incident where the weekly cron fired
the fleet-wide CPU alert despite the cron script already wrapping
clamscan in nice 19 + ionice idle + cgroup memory limits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the long-standing UX regression caused by
`tootctl media remove --prune-profiles` (and `--remove-headers`)
running on a schedule: cached remote avatars are deleted, but
Mastodon does not auto-refetch on profile view, so quiet remote
accounts stay broken indefinitely.
Article covers:
- The mutually-exclusive flag bug (silent skip if combined)
- Mastodon's actual avatar-refresh trigger model (Update activities,
not profile views)
- A `refresh-my-follows.sh` pattern with a defensible WHERE clause
(avatar NULL AND avatar_remote_url present) to avoid infinite
retry on accounts whose origin has no avatar
- Why header_file_name IS NULL is a bad signal (~20% of users
legitimately have no custom header)
- The cron decision: most admins should drop --prune-profiles
- fail2ban-digest-mode-fleet: recidive-only email model, sshd now silent,
defaults-debian.conf gotcha added
- netdata-docker-health-alarm-tuning: 30m/10m config, tuning history table
- New: wp-fail2ban-logpath-debian-ubuntu, lora-adapter-gguf-conversion-fails,
tailscale-status-json-hostname-localhost-ios
- Various article updates and nav index refreshes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers creating groups, assigning clients, scoping allow rules to
specific groups via API and CLI. Includes ghost attribution gotcha
(router DNS proxy + secondary DNS causes FTL cache mis-attribution)
and the fix (Pi-hole as sole DNS, remove secondary).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New article covering the conversion from per-ban email alerts to a
three-tier model (silent default, sshd/recidive immediate, daily digest).
Includes Ansible automation, gotchas with lineinfile regex collisions,
and fq-hostname override for clean subjects.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Articles from prior sessions that were written locally but never shipped:
- 02-selfhosting/cloud/aws-s3-cost-management.md — lifecycle rules, storage class selection, bucket inventory, unexpected-growth investigation
- 02-selfhosting/dns-networking/wake-on-lan-router-ssh.md — WOL magic packets via Asus router SSH + ether-wake, Ansible vault integration
- 02-selfhosting/services/claude-code-remote-control.md — mobile access to a persistent host Claude Code session
Nav updated (index.md + SUMMARY.md):
- Added Cloud subsection under Self-Hosting for aws-s3
- Added wake-on-lan and aws-s3 entries to SUMMARY
- Added claude-code-remote-control to index's Services section
- Added ansible-ssh-host-alias-bypass nav entry (article shipped in 2dbeb22)
- Article count 87 → 89, self-hosting 30 → 32, troubleshooting 33 → 34
Documents the 2026-04-09 scanner incident where 301-redirected PHP probes
bypassed the existing apache-404scan jail, leaving the scanner unbanned
and firing Netdata web_log_1m_redirects alerts. New jail catches 301/302/
403/404 PHP responses while excluding legitimate WordPress endpoints.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a section documenting how missing HTTP/HTTPS rules caused a
site outage on tttpod, and updates the fleet reference table.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Article count 73 → 74. Added to SUMMARY.md, index.md, README.md,
and 02-selfhosting/index.md (which was also missing 5 other security
articles from prior sessions).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Access-log-based filter for wp-login.php brute force detection without
requiring the WP fail2ban plugin. Documents the backend=polling gotcha
on Ubuntu 24.04 and manual banning workflow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fixed 4 broken markdown links (bad relative paths in See Also sections)
- Corrected n8n port binding to 127.0.0.1:5678 (matches actual deployment)
- Updated SnapRAID article with actual majorhome paths (/majorRAID, disk1-3)
- Converted 67 Obsidian wikilinks to relative markdown links or plain text
- Added YAML frontmatter to 35 articles missing it entirely
- Completed frontmatter on 8 articles with missing fields
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New article: selinux-fail2ban-execmem-fix.md — custom policy module
for fail2ban grep execmem denial on Fedora 43.
Also includes previously uncommitted:
- n8n-proxy-trust-x-forwarded-for.md
- fail2ban-apache-404-scanner-jail.md updates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Global backend=systemd in jail.local silently breaks file-based jails.
Added required backend=polling to config, diagnostic command, and warning.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers version checking, pinned-tag update process, SQLite password
reset, and why Arcane may not catch updates when the latest tag lags
behind npm releases.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New guide for custom access-log-based fail2ban jail that catches
rapid-fire 404 vulnerability scanners missed by default error-log jails.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add troubleshooting article for the 2026-03-27 incident where PHP-FPM
hung after the nightly update cycle. Update the Netdata Docker alarm
tuning article with the dedicated Nextcloud alarm split and the new
watchdog cron deployed to majorlab. (54 articles)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>