New 05-troubleshooting/networking article covering the per-host nature of
authorized_keys: rotating a workstation SSH key requires backfilling the new
pubkey to every host, or hosts holding only the old key reject it with
Permission denied (publickey). Includes fleet-sweep diagnosis, idempotent
backed-up backfill via a still-trusted transit user, and prevention. Wired
into SUMMARY.md nav.
Add a Maintenance subsection covering why 'yt-dlp -U' fails on PyPI
builds and how to update via pip, plus how to detect/remove a duplicate
user+system install (the issue hit on majorhome 2026-06-16).
New troubleshooting article for the recurring 'security wants to access
Claude Code-credentials' prompt that persists even after Always Allow
(ACL invalidation on binary-signature change / token refresh / post-boot
churn). Covers triage, the reset-and-relogin fix, and the file-based
credentials workaround with its plaintext tradeoff. Registered in
SUMMARY + troubleshooting index; cross-linked with the corrupt-credential
login-failure article (distinct symptom).
2026-06-15: mirroring is reproducibly stuck on Connecting again with
Tailscale accept-routes still off, so the 06-14 it-works conclusion was
wrong. _asquic endpoint resolves but the QUIC/AWDL datapath never
completes; awdl0 bounce, full reboot, and phone radio cycle all failed.
Reframed as an intermittent macOS 27.0 beta AWDL bug; QuickTime USB
remains the workaround.
The postfix MX-lookup example hard-coded majormail's real public IP
(stale DO address). Swap in an RFC 5737 documentation IP (203.0.113.10)
so the published wiki doesn't expose a real fleet IP.
majormail (2026-06-14) had the correct system hostname but still mailed
from majormail-hetzner — the old provisioning label was hardcoded in
logwatch.conf MailFrom and fail2ban jail.local sender. Add a variant
section covering the config grep sweep and the templated-vs-static
Ansible regression caveat.
Covers enabling the [mailer] for password recovery (relay via a tailnet mail
server, no-auth/mynetworks, FORCE_TRUST_SERVER_CERT for IP targets), CLI password
reset + the must-change-password=true gotcha, adding an SSH key via the basic-auth
API when locked out, and ruling out a server-side cause for a 'changing' password.
New troubleshooting runbook for Logwatch reports titled with the Hetzner
provisioning label instead of the real hostname; cross-linked from the
logwatch fleet-setup and VPS migration baseline articles, plus a new
'set system hostname' step in the post-migration checklist.
Documents the Ansible-by-IP known_hosts gap: interactive ssh works (key
stored under hostname) but Ansible connects by inventory IP and fails with
UNREACHABLE/Host key verification failed. Includes tailnet-safe ssh-keyscan
fix and prevention notes. Surfaced by the Hetzner migration IP churn.
Documents why WSL2 hosts fail an Ansible reboot play at privilege
escalation (Timeout waiting for privilege escalation prompt) — WSL2 has
no real reboot semantics + become stalls over the Windows OpenSSH->WSL2
bridge — and the fix: scope reboot.yml to hosts: all:!wsl. Registered
in SUMMARY.md and 05-troubleshooting/index.md.
New troubleshooting/networking article covering the three SSH failure modes
after a fleet migration (stale hardcoded IP, Tailscale 1.98.x cold-path
teardown, rebuilt-box host-key mismatch) and the durable fix (MagicDNS names +
known_hosts purge + ConnectTimeout), with the WSL2 no-resolver caveat.
Cross-links the existing host-key article (adds a 'when pinning the IP is
wrong' callout) and adds the SUMMARY nav entry.
Document that VAAPI HEVC on Polaris can't beat already-efficient H.264 (YouTube/
Twitch/stream archives), so output comes out larger and lands in hevc_failed.txt.
Add already_failed() guard so the batch skips known-bad files on queue rebuilds
instead of re-attempting them. Also: MIN_FREE_GB note (start-only check) and a
source-bitrate triage snippet for picking real encode candidates.
New 05-troubleshooting/networking article covering the case where ssh <alias>
fails host-key verification because no Host block exists and the alias resolves
via Tailscale MagicDNS to a name with no known_hosts entry (key stored under the
IP). Registered in SUMMARY.md and the troubleshooting index.
Backing up two unpublished draft articles that existed only in a working-tree
stash. Drafts — NOT in SUMMARY.md nav and NOT merged to main, so not published
to notes.majorshouse.com. Pre-commit nav check bypassed intentionally (--no-verify).
- 05-troubleshooting/claude-code-warp-login-corrupt-keychain-credential.md
- 05-troubleshooting/iphone-mirroring-connecting-hang-awdl-stall-beta.md
Operational/how-to references updated to the role entry playbooks after the
ADR-0001 migration. Historical incident narrative (dated callouts, commit
refs) preserved.
- clamav-fleet-deployment: override + re-run -> clamav.yml; role note
- ssh-hardening-ansible-fleet: note this is now the ssh_hardening role
- vps-migration-baseline-checklist: table -> clamav.yml / ssh_hardening.yml
- ssh-socket-tailscale-race-condition: Affected Hosts + Prevention + References
-> tailscale role tasks (network_wait/ssh_only_ubuntu/ssh_only_fedora)
- freshclam-logwatch-false-no-updates: codify refs -> clamav role
dcaprod-hetzner + tttpod-hetzner were missing tailscale-wait-ready.service
(inert ssh.service gate -> latent bind race); corrected playbook applied to
both. teelia uses Tailscale SSH (no sshd, immune). All Ubuntu hosts now on
the dependency-free-socket + ssh.service-gate pattern.
- Fedora hosts are NOT automatically immune: a leftover manual
`ListenAddress <tailscale-ip>` drop-in reintroduces the sshd boot bind-race
even under firewalld (hit on majordiscord 2026-06-07; fix = remove it).
- The Ubuntu playbook kept shipping the cycle-causing [Unit] gate on
ssh.socket despite the 2026-06-04 resolution; re-running it re-armed the
ordering cycle (clobbered majorlinux; majortoot-hetzner found armed).
Corrected in MajorAnsible e0d35aa. Fleet ssh-lockdown state is inconsistent
(dcaprod/tttpod lack wait-ready; teelia no override) — needs a per-host audit.
Document the majormail 2026-06-07 incident: when userdb home == maildir
root, the LDA/Sieve duplicate database (.dovecot.lda-dupes + .locks) lands
inside the mail store and the maildir lister exposes it as phantom
mailboxes ("dovecot.lda-dupes"), logging stat(.../tmp) "Not a directory".
Fix: point home at a non-dotted subdir. Wired into the troubleshooting
index and SUMMARY.
Document the majormail spam-routing failure (2026-06-06): a cleanup
header_checks REDIRECT keyed on the milter-added X-Spam-Flag never fired for
real inbound mail (only locally-injected), so spam kept reaching the inbox.
Fix is to route in Sieve at delivery (after the milter), with a redirect +
loop guard. Includes the 'local-injection tests lie' warning.
Document the daily /etc/cron.daily/clamav-freshness watchdog as the real
detector for stale signatures, and the key gotcha that 'mail' is absent on
most fleet hosts so alert scripts must use /usr/sbin/sendmail -t.
logwatch's clam-update counts only 'process started' lines (emitted only at
daemon restart), so daemon-mode freshclam false-alarms on quiet days despite
signatures updating. Fix: $ignore_no_updates=1 drop-in. Includes the
real-vs-false check (a daemonless box with freshclam disabled is a TRUE alert).
New page documenting the majormail (2026-06-05) issue: /etc/localtime
shipped labeled etc_t instead of locale_t on the Hetzner image, so SELinux
denied systemd-timedated and timedatectl/community.general.timezone reported
success while the symlink stayed at UTC. Fix: restorecon before setting TZ.
Indexed in index.md (SELinux) + SUMMARY.md.
- New page: Dovecot IMAP vsz_limit OOM from a bloated/corrupt index.log
(152M index on an empty folder killed IMAP children with error 83).
- fail2ban IMAP self-ban: add permanent ignoreip-whitelist fix + dynamic-IP caveat.
- firewalld mail ports: add 'submission/587 never added' variant + correct
Fedora service name; note Ansible now manages the full mail-service set.
- Index + SUMMARY updated with the new page.
Three updates to the inbound spam filtering guide, all driven by the 2026-06-04
majormail-hetzner Phase 6 cutover and follow-up tuning:
1. Section 6 (Dovecot Sieve): warn explicitly that `plugin/sieve_before` was
dropped in Pigeonhole 2.4 and silently does nothing — no startup warning,
spam just keeps landing in INBOX. The 2.4 replacement is a top-level
`sieve_script <name> { type = before; path = …; }` block. Also note the
Fedora-flat-dovecot.conf pitfall (some packagings ship dovecot.conf
without `!include conf.d/*.conf`, so the block has to live in the main
file directly). Added a `sievec` compile step.
2. New §6b: route spam to a separate `junk@` mailbox via Postfix cleanup
`header_checks` REDIRECT. This makes spam invisible to the user's
mailbox entirely — Spark/IDLE-based clients don't push-notify because
the message never reaches the subscribed mailbox at all. Includes the
`regexp:` vs `pcre:` map-type tip (use regexp on stock Fedora to avoid
the postfix-pcre package dependency).
3. New §7a: weekly systemd timer for sa-learn. The §7 warning about
"don't run sa-learn from cron unless folders are clean" is correct as
the safe default — but when you adopt the §6b REDIRECT-to-junk@
pattern, the junk@ mailbox is pure spam by design and a weekly
`--spam`/`--ham`/`--sync`/`--force-expire` chain becomes safe and
useful. Full unit templates included.
Gotchas table gains four entries:
- Pigeonhole 2.4 silent breakage of plugin/sieve_before
- postfix-pcre vs regexp map type confusion
- Why sieve fileinto Junk still pushes a Spark notification
- Why local `sendmail` injection doesn't trigger the REDIRECT (smtpd
milters skip sendmail-injected mail, so X-Spam-Flag isn't added)
All changes match what's now codified in the `majormail` Ansible role
(commit 7a8b9eb in MajorAnsible).