diff --git a/02-selfhosting/cloud/vps-migration-baseline-checklist.md b/02-selfhosting/cloud/vps-migration-baseline-checklist.md
index d1ac251..b8ed1cc 100644
--- a/02-selfhosting/cloud/vps-migration-baseline-checklist.md
+++ b/02-selfhosting/cloud/vps-migration-baseline-checklist.md
@@ -66,14 +66,15 @@ Every server in the fleet should have these. Check each one after migration:
 ### After Migration
 
 1. **Set the timezone** — `timedatectl set-timezone America/New_York` (US) or `Europe/London` (UK). Hetzner images default to UTC.
-2. **Verify CA bundle (Fedora)** — `ls /etc/pki/tls/certs/ca-bundle.crt`. If missing, Postfix TLS, curl, and dnf will all fail silently. See [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md).
-3. **Run `harden.yml` against the new host** — catches most gaps in one pass
-4. **Send a test email** — `echo test | mail -s "test" marcus@majorshouse.com` — if this fails, nothing else can alert you
-5. **Verify crond is running** — `systemctl is-active crond` (Fedora) or `systemctl is-active cron` (Ubuntu). cronie can be `enabled` but not `active` after provisioning.
-6. **Check Netdata Cloud** — verify the new node appears and alerts are flowing
-7. **Compare fail2ban jails** — `fail2ban-client status` on both old and new
-8. **Verify logwatch sends** — `sudo logwatch --output mail --range today`
-9. **Keep the old box powered off but not destroyed** for at least 7 days after remediation
+2. **Set the system hostname** — Hetzner provisions the box as `-hetzner`. Run `hostnamectl set-hostname ` and fix the loopback line: `sed -i "s/127.0.1.1.*/127.0.1.1 /" /etc/hosts`. Skip this and **Logwatch emails arrive titled `Logwatch for -hetzner`** weeks later. Do it alongside the Tailscale node rename and Postfix `myhostname` — all three read from the provisioning label. See [Logwatch wrong hostname after migration](../../05-troubleshooting/logwatch-wrong-hostname-after-migration.md).
+3. **Verify CA bundle (Fedora)** — `ls /etc/pki/tls/certs/ca-bundle.crt`. If missing, Postfix TLS, curl, and dnf will all fail silently. See [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md).
+4. **Run `harden.yml` against the new host** — catches most gaps in one pass
+5. **Send a test email** — `echo test | mail -s "test" marcus@majorshouse.com` — if this fails, nothing else can alert you
+6. **Verify crond is running** — `systemctl is-active crond` (Fedora) or `systemctl is-active cron` (Ubuntu). cronie can be `enabled` but not `active` after provisioning.
+7. **Check Netdata Cloud** — verify the new node appears and alerts are flowing
+8. **Compare fail2ban jails** — `fail2ban-client status` on both old and new
+9. **Verify logwatch sends** — `sudo logwatch --output mail --range today`
+10. **Keep the old box powered off but not destroyed** for at least 7 days after remediation
 
 ### Using doctl to Manage Old Droplets
 
diff --git a/02-selfhosting/monitoring/logwatch-fleet-setup.md b/02-selfhosting/monitoring/logwatch-fleet-setup.md
index a1cd239..15696f1 100644
--- a/02-selfhosting/monitoring/logwatch-fleet-setup.md
+++ b/02-selfhosting/monitoring/logwatch-fleet-setup.md
@@ -235,6 +235,9 @@ sed -i '/^127\.0\.1\.1/d' /etc/hosts && \
 systemctl reload postfix
 ```
 
+> [!tip] Same drift, different symptom: the Logwatch **title**
+> Hetzner provisions boxes with `-hetzner` as the *system* hostname. When that's never corrected, Logwatch (which reads the live hostname at runtime) mails reports titled `Logwatch for -hetzner` — no postfix involvement needed. Same `hostnamectl set-hostname` + `/etc/hosts` fix as above. See [Logwatch wrong hostname after migration](../../05-troubleshooting/logwatch-wrong-hostname-after-migration.md).
+
 ### 2. Empty `relayhost` quietly forces public-MX delivery
 
 If `postconf relayhost` returns an empty value, postfix doesn't fail — it just does an MX lookup for the destination domain and tries to deliver directly. For mail to your own mail server, that means going via the **public MX** (the domain's external MX record, e.g., `mail.majorshouse.com → 165.227.187.191:25`) instead of the **internal/Tailscale relay path** the rest of the fleet uses.
diff --git a/05-troubleshooting/logwatch-wrong-hostname-after-migration.md b/05-troubleshooting/logwatch-wrong-hostname-after-migration.md
new file mode 100644
index 0000000..c63fcdc
--- /dev/null
+++ b/05-troubleshooting/logwatch-wrong-hostname-after-migration.md
@@ -0,0 +1,111 @@
+---
+title: "Logwatch Reports the Wrong Hostname (`-hetzner`) After a Migration"
+domain: troubleshooting
+category: monitoring
+tags: [logwatch, hostname, hetzner, migration, monitoring, provisioning]
+status: published
+created: 2026-06-12
+updated: 2026-06-12
+---
+
+# Logwatch Reports the Wrong Hostname (`-hetzner`) After a Migration
+
+## Symptom
+
+Daily Logwatch emails from a recently migrated server arrive titled with the
+provisioning label instead of the real hostname:
+
+```
+Logwatch for tttpod-hetzner (Linux)
+Logwatch for dcaprod-hetzner (Linux)
+```
+
+Everything else works — the report is generated, mailed, and delivered. Only the
+**name in the title is wrong**, which makes reports harder to scan and breaks any
+filter or rule that keys on the expected hostname.
+
+## Cause
+
+Logwatch titles each report with the box's **live system hostname**
+(`hostnamectl --static` / `/etc/hostname`) read at runtime — it does *not* keep
+its own copy of the name.
+
+Hetzner Cloud servers are provisioned with a temporary node label as the system
+hostname — `-hetzner` (e.g. `tttpod-hetzner`). The migration runbook renames
+the **Tailscale node** back to the bare name and sets Postfix `myhostname`, but the
+**OS hostname** itself is easy to miss because nothing surfaces it day to day. It
+stays `-hetzner` until something reads `hostname` — Logwatch is usually the
+first thing to do so, weeks later.
+
+Confirm the box is actually mislabelled:
+
+```bash
+ssh root@ 'hostnamectl --static; cat /etc/hostname; grep 127.0.1.1 /etc/hosts'
+# static: tttpod-hetzner
+# /etc/hostname: tttpod-hetzner
+# 127.0.1.1 tttpod-hetzner tttpod-hetzner
+```
+
+## Fix
+
+Set the real hostname and fix the matching `/etc/hosts` loopback line:
+
+```bash
+ssh root@ '
+  hostnamectl set-hostname 
+  sed -i "s/127.0.1.1.*/127.0.1.1 /" /etc/hosts
+  hostnamectl --static # verify -> 
+'
+```
+
+That's it. **Logwatch has no hardcoded hostname override** — verify with:
+
+```bash
+grep -ri hostname /etc/logwatch/ /etc/cron.daily/0logwatch /etc/cron.daily/logwatch 2>/dev/null
+cat /etc/mailname 2>/dev/null
+```
+
+If those are empty (the normal case), Logwatch reads the live hostname on its next
+run, so the **next daily report self-corrects** — no service restart, no logwatch
+config change needed.
+
+> [!note] If `grep` *does* find a hostname pinned in `/etc/logwatch/conf/logwatch.conf`
+> (e.g. a `HostLimit`/`MailFrom` line baked in by Ansible), update it there too —
+> the override file wins over the live hostname.
+
+## Sweep the whole fleet
+
+This is a per-box provisioning leftover, so check every migrated host at once —
+more than one is usually affected:
+
+```bash
+for ip in 100.98.223.93 100.95.137.38 100.64.169.62 100.112.127.0 100.73.85.46; do
+  echo -n "$ip -> "
+  ssh -o ConnectTimeout=8 -o BatchMode=yes root@$ip 'hostnamectl --static' 2>/dev/null \
+    || echo '(unreachable)'
+done
+```
+
+Any value ending in `-hetzner` (or your provider's build label) needs the fix above.
+In the 2026-06 sweep, `tttpod` and `dcaprod` were still `*-hetzner`;
+`majortoot`, `majormail`, and `majorlinux` were already correct.
+
+## Prevention
+
+Fold "set the system hostname" into the migration bootstrap so it never drifts:
+
+```bash
+hostnamectl set-hostname 
+sed -i "s/127.0.1.1.*/127.0.1.1 /" /etc/hosts
+```
+
+Do this in the **same step** that renames the Tailscale node and sets Postfix
+`myhostname` — all three read from the provisioning label and all three must be
+corrected together. See the
+[VPS Migration Baseline Checklist](../02-selfhosting/cloud/vps-migration-baseline-checklist.md).
+
+## Related
+
+- [Logwatch Fleet Setup — Surviving Package Upgrades](../02-selfhosting/monitoring/logwatch-fleet-setup.md) — the broader "logwatch went silent / wrong-source" class, including the Packer `myhostname` variant of this same drift
+- [VPS Migration Baseline Checklist](../02-selfhosting/cloud/vps-migration-baseline-checklist.md) — the full post-migration verification list
+- [Ansible UNREACHABLE: Host Key Verification Failed After a Host Rebuild or Migration](networking/ansible-host-key-verification-failed-rebuilt-host.md) — another IP/identity-drift gotcha from the same Hetzner migration
diff --git a/SUMMARY.md b/SUMMARY.md
index bc5d2c8..68af294 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -137,5 +137,6 @@ updated: 2026-05-15T09:00
   * [SSH Alias Falls Through to MagicDNS — Host-Key Verification Failure (No `Host` Block)](05-troubleshooting/networking/ssh-missing-host-block-magicdns-host-key-failure.md)
   * [MagicDNS Names vs Pinned IPs for Tailscale SSH (After a Fleet Migration)](05-troubleshooting/networking/tailscale-ssh-magicdns-vs-pinned-ip-after-migration.md)
   * [Ansible UNREACHABLE: Host Key Verification Failed After a Host Rebuild or Migration](05-troubleshooting/networking/ansible-host-key-verification-failed-rebuilt-host.md)
+  * [Logwatch Reports the Wrong Hostname (`-hetzner`) After a Migration](05-troubleshooting/logwatch-wrong-hostname-after-migration.md)
   * [Ghost EmailAnalytics Lag Warning — What It Means and When to Worry](05-troubleshooting/ghost-emailanalytics-lag-warning.md)
   * [claude-mem: --setting-sources Empty Arg Bug (Claude Code 2.1.x)](05-troubleshooting/claude-mem-setting-sources-empty-arg.md)