Runbook for telling broadcast fundraising solicitation from genuine
mentions: signal checklist, SQL to investigate the account and its
origin instance via nodeinfo, BlockService snippet, and a proportionate
escalation ladder (mute -> block -> report -> domain-limit -> domain-block).
Registered in SUMMARY.md and the self-hosting section index.
Covers the WP 6.7 doing_it_wrong notice fired when a theme/plugin
translates before init (e.g. nav-menu labels on after_setup_theme).
Documents source fix (defer to init) and the update-safe mu-plugin
suppression via doing_it_wrong_trigger_error, plus the renamed-theme
domain gotcha. Derived from the majorlinux.com kappa/marstheme triage.
Client-side fix for OG Steam Deck (RTL8822CE/rtw88) flapping ~once a minute on
SteamOS: disable IWD periodic scan + disable Wi-Fi power save via NM dispatcher.
Cross-linked with the 160MHz airtime article; registered in SUMMARY.md nav.
New 05-troubleshooting/networking article covering the per-host nature of
authorized_keys: rotating a workstation SSH key requires backfilling the new
pubkey to every host, or hosts holding only the old key reject it with
Permission denied (publickey). Includes fleet-sweep diagnosis, idempotent
backed-up backfill via a still-trusted transit user, and prevention. Wired
into SUMMARY.md nav.
New troubleshooting article for the recurring 'security wants to access
Claude Code-credentials' prompt that persists even after Always Allow
(ACL invalidation on binary-signature change / token refresh / post-boot
churn). Covers triage, the reset-and-relogin fix, and the file-based
credentials workaround with its plaintext tradeoff. Registered in
SUMMARY + troubleshooting index; cross-linked with the corrupt-credential
login-failure article (distinct symptom).
Covers enabling the [mailer] for password recovery (relay via a tailnet mail
server, no-auth/mynetworks, FORCE_TRUST_SERVER_CERT for IP targets), CLI password
reset + the must-change-password=true gotcha, adding an SSH key via the basic-auth
API when locked out, and ruling out a server-side cause for a 'changing' password.
New troubleshooting runbook for Logwatch reports titled with the Hetzner
provisioning label instead of the real hostname; cross-linked from the
logwatch fleet-setup and VPS migration baseline articles, plus a new
'set system hostname' step in the post-migration checklist.
Documents the Ansible-by-IP known_hosts gap: interactive ssh works (key
stored under hostname) but Ansible connects by inventory IP and fails with
UNREACHABLE/Host key verification failed. Includes tailnet-safe ssh-keyscan
fix and prevention notes. Surfaced by the Hetzner migration IP churn.
Documents why WSL2 hosts fail an Ansible reboot play at privilege
escalation (Timeout waiting for privilege escalation prompt) — WSL2 has
no real reboot semantics + become stalls over the Windows OpenSSH->WSL2
bridge — and the fix: scope reboot.yml to hosts: all:!wsl. Registered
in SUMMARY.md and 05-troubleshooting/index.md.
New troubleshooting/networking article covering the three SSH failure modes
after a fleet migration (stale hardcoded IP, Tailscale 1.98.x cold-path
teardown, rebuilt-box host-key mismatch) and the durable fix (MagicDNS names +
known_hosts purge + ConnectTimeout), with the WSL2 no-resolver caveat.
Cross-links the existing host-key article (adds a 'when pinning the IP is
wrong' callout) and adds the SUMMARY nav entry.
New 05-troubleshooting/networking article covering the case where ssh <alias>
fails host-key verification because no Host block exists and the alias resolves
via Tailscale MagicDNS to a name with no known_hosts entry (key stored under the
IP). Registered in SUMMARY.md and the troubleshooting index.
Document the majormail 2026-06-07 incident: when userdb home == maildir
root, the LDA/Sieve duplicate database (.dovecot.lda-dupes + .locks) lands
inside the mail store and the maildir lister exposes it as phantom
mailboxes ("dovecot.lda-dupes"), logging stat(.../tmp) "Not a directory".
Fix: point home at a non-dotted subdir. Wired into the troubleshooting
index and SUMMARY.
Document the majormail spam-routing failure (2026-06-06): a cleanup
header_checks REDIRECT keyed on the milter-added X-Spam-Flag never fired for
real inbound mail (only locally-injected), so spam kept reaching the inbox.
Fix is to route in Sieve at delivery (after the milter), with a redirect +
loop guard. Includes the 'local-injection tests lie' warning.
logwatch's clam-update counts only 'process started' lines (emitted only at
daemon restart), so daemon-mode freshclam false-alarms on quiet days despite
signatures updating. Fix: $ignore_no_updates=1 drop-in. Includes the
real-vs-false check (a daemonless box with freshclam disabled is a TRUE alert).
New page documenting the majormail (2026-06-05) issue: /etc/localtime
shipped labeled etc_t instead of locale_t on the Hetzner image, so SELinux
denied systemd-timedated and timedatectl/community.general.timezone reported
success while the symlink stayed at UTC. Fix: restorecon before setting TZ.
Indexed in index.md (SELinux) + SUMMARY.md.
- New page: Dovecot IMAP vsz_limit OOM from a bloated/corrupt index.log
(152M index on an empty folder killed IMAP children with error 83).
- fail2ban IMAP self-ban: add permanent ignoreip-whitelist fix + dynamic-IP caveat.
- firewalld mail ports: add 'submission/587 never added' variant + correct
Fedora service name; note Ansible now manages the full mail-service set.
- Index + SUMMARY updated with the new page.
New 02-selfhosting/services article: the full Postfix/Dovecot inbound spam stack
on Fedora — spamass-milter tag-only wiring (the -r footgun), socket permissions
(sa-milt group + UMask), site-wide Bayes DB, Sieve-to-Junk, and sa-learn training
(folders, spam/ham balance, manual-not-cron). From the majormail setup.
Also extends selinux-dovecot-vmail-context with a Permissive-mode variant + a
postfix_cleanup->mysqld_etc companion-denial note. SUMMARY.md nav updated.
New article mastodon-s3-acl-upload-failures.md: a BucketOwnerEnforced S3
bucket plus a stale S3_PERMISSION/S3_ACL in .env.production makes every
Mastodon upload fail with AccessControlListNotSupported, silently. Covers
symptoms (incl. why a missing object returns 403 not 404), diagnosis,
the fix (S3_PERMISSION= empty, public read via bucket policy), recovery,
a synthetic-write health check, and Ansible enforcement.
Extend mastodon-prune-profiles-trap.md: add a "Bulk restore at scale"
procedure (list existing keys, null missing DB refs, enqueue
RedownloadAvatar/HeaderWorker), a "storage-level deletion without DB
de-ref" section, and a stronger recommendation to disable automated
profile pruning (and scheduled accounts refresh --all) entirely.
Link both from SUMMARY.md and the selfhosting index.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two related additions covering the 2026-05-31 cutover-night incidents on
majorlinux and majortoot-hetzner.
ssh-socket-tailscale-race-condition.md (update Race 1 fix):
- After=tailscaled.service Requires=tailscaled.service orders against the
service becoming active, not against tailscale0 having an IPv4 — hosts
kept losing SSH intermittently after reboots (incident: majorlinux +
majortoot-hetzner 2026-05-31, during cutover-night Ansible reboot).
- Canonical fix: a oneshot tailscale-wait-ready.service that polls
`ip -4 -o addr show tailscale0` until an address is present, with
ssh.socket After=/Requires= that service. Document the full evolution
(2026-05-19 BindsTo → 2026-05-23 Requires → 2026-05-31 wait-ready) so
future readers don't try the half-fixes thinking they're sufficient.
- Add majortoot-hetzner to affected hosts.
mastodon-post-install-hardening.md (new):
Four upstream-install gaps that bit during the majortoot-hetzner cutover:
1. /home/mastodon at 0750 (useradd default) → nginx www-data can't
traverse → every static asset 403s → unstyled "purple screen" in the
browser while API/HTML still work through the puma proxy.
2. .env.production at 0644 (mastodon-setup default) → DB_PASS,
SECRET_KEY_BASE, OTP_SECRET world-readable once gap (1) is fixed.
3. mastodon user shell at /usr/sbin/nologin → `su - mastodon` blocked.
4. rbenv init in .bashrc only → login shells don't source .bashrc; even
when chained, Ubuntu's .bashrc returns early for non-interactive
shells. Fix: .bash_profile sets up rbenv BEFORE sourcing .profile +
.bashrc, so it works for both interactive and non-interactive logins.
All four codified in MajorAnsible configure_mastodon_permissions.yml
with self-asserting verification steps.
02-selfhosting/index.md + SUMMARY.md:
Add a "Services" section to the selfhosting index linking the
mastodon-post-install-hardening article (and the other orphaned
services/ entries while there). SUMMARY.md gains one new entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the systemd socket activation race where ssh.socket binds
to the Tailscale IP before tailscaled is ready, causing SSH to become
unreachable after a Tailscale reconnect. Includes diagnosis steps and
the After=/BindsTo= fix.
New article documenting missing /etc/pki/tls/certs/ca-bundle.crt symlink
on Hetzner Fedora images breaking Postfix TLS, curl, and dnf. Updated
VPS migration baseline checklist with timezone, CA bundle, and crond
verification steps. Updated logwatch fleet setup with crond check.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fail2ban-digest-mode-fleet: recidive-only email model, sshd now silent,
defaults-debian.conf gotcha added
- netdata-docker-health-alarm-tuning: 30m/10m config, tuning history table
- New: wp-fail2ban-logpath-debian-ubuntu, lora-adapter-gguf-conversion-fails,
tailscale-status-json-hostname-localhost-ios
- Various article updates and nav index refreshes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two articles surfaced during the v8 deploy + eval on 2026-04-25:
- Ollama: `ollama run` with piped stdin bypasses the chat template and
SYSTEM prompt — output looks like raw base-model completion. Caught
during initial v8 smoke test. Fix: use /api/chat HTTP endpoint.
- rsync over Tailscale can hang in TCP teardown after the data has
fully transferred. Verify with md5sum, then kill the hung pipeline.
Includes a watcher-threshold gotcha (set below true file size, not
above) and prevention tips.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers creating groups, assigning clients, scoping allow rules to
specific groups via API and CLI. Includes ghost attribution gotcha
(router DNS proxy + secondary DNS causes FTL cache mis-attribution)
and the fix (Pi-hole as sole DNS, remove secondary).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New article covering the conversion from per-ban email alerts to a
three-tier model (silent default, sshd/recidive immediate, daily digest).
Includes Ansible automation, gotchas with lineinfile regex collisions,
and fq-hostname override for clean subjects.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Articles from prior sessions that were written locally but never shipped:
- 02-selfhosting/cloud/aws-s3-cost-management.md — lifecycle rules, storage class selection, bucket inventory, unexpected-growth investigation
- 02-selfhosting/dns-networking/wake-on-lan-router-ssh.md — WOL magic packets via Asus router SSH + ether-wake, Ansible vault integration
- 02-selfhosting/services/claude-code-remote-control.md — mobile access to a persistent host Claude Code session
Nav updated (index.md + SUMMARY.md):
- Added Cloud subsection under Self-Hosting for aws-s3
- Added wake-on-lan and aws-s3 entries to SUMMARY
- Added claude-code-remote-control to index's Services section
- Added ansible-ssh-host-alias-bypass nav entry (article shipped in 2dbeb22)
- Article count 87 → 89, self-hosting 30 → 32, troubleshooting 33 → 34
Documents the 2026-04-09 scanner incident where 301-redirected PHP probes
bypassed the existing apache-404scan jail, leaving the scanner unbanned
and firing Netdata web_log_1m_redirects alerts. New jail catches 301/302/
403/404 PHP responses while excluding legitimate WordPress endpoints.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Captures the majorlab incident where the backup watchdog emailed a missing
heartbeat after a kernel-update reboot wiped /var/run, even though the
backup had actually completed cleanly. Documents the tmpfs root cause and
the fix of storing heartbeats under /var/lib instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers shell quoting for URLs containing &, ?, #, and other characters
that Bash interprets as operators. Common gotcha when downloading from
CDNs with token-based URLs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Corrected inflated article count (was 76, actual is 73).
Updated domain breakdown and frontmatter timestamps from Obsidian.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Access-log-based filter for wp-login.php brute force detection without
requiring the WP fail2ban plugin. Documents the backend=polling gotcha
on Ubuntu 24.04 and manual banning workflow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>