Documents three more patterns surfaced in the 2026-05-10 fleet-mail
investigation, all hitting hosts derived from cloud images or
cross-provider migrations:
- Packer/snapshot-leftover myhostname (postfix EHLO + message-id
identifies the build artifact, not the production hostname; remote
spam scorers hate it)
- Empty relayhost silently routes mail via the public MX instead of
the Tailscale-internal path, exposing it to spamchk that internal
traffic bypasses
- Stale SASL passwd map referencing a missing file from a previous
external-SMTP relay setup, deferring every send with "local data
error"
Each looks benign in isolation. Together they made dcaprod's Logwatch
disappear into spamchk for weeks while showing 250 OK on the source.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents three lessons from the 2026-05-10 fleet outage where the
Fedora half (majorhome, majorlab) had been silently failing to send
notification mail for days:
- Missing /etc/pki/tls/certs/ca-bundle.crt symlink (extracted bundle
exists at /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem but the
consumer-path symlink was lost during a ca-certificates package
event). Diagnosis includes the cross-tool tell — dnf and curl break
with the same path. Fix is a single ln -sfn.
- Methodology: Fedora and majormail log postfix to journald; Debian and
Ubuntu log to /var/log/mail.log. Querying the wrong source returns
false negatives for healthy hosts.
- Bounce-source addresses (Watchtower NOTIFICATION_EMAIL_FROM,
fail2ban sender, root@<host>.localdomain) must resolve to real
mailboxes — otherwise the first failed delivery generates
bounce-of-bounce churn.
Also promoting the article from untracked to committed; it had been
authored on 2026-05-09 and not yet added to the repo.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>