majorwiki/02-selfhosting/cloud/vps-migration-baseline-checklist.md
Marcus Summers 0d1697c0d6 wiki: Logwatch wrong hostname (<host>-hetzner) after migration
New troubleshooting runbook for Logwatch reports titled with the Hetzner
provisioning label instead of the real hostname; cross-linked from the
logwatch fleet-setup and VPS migration baseline articles, plus a new
'set system hostname' step in the post-migration checklist.
2026-06-12 10:58:17 -04:00

---
title: VPS Migration Baseline Checklist
description: What to verify after migrating a server to a new provider — the packages, services, and configs that must match the old box
tags:
- migration
- vps
- hetzner
- digitalocean
- ansible
- checklist
status: published
created: 2026-05-09
updated: 2026-05-13T10:35
---
# VPS Migration Baseline Checklist
When migrating a server from one VPS provider to another, it's easy to focus on the application (bots, web services, databases) and forget the infrastructure baseline. This checklist covers the common components that make a server operational beyond just running the app.
## Background
During the Hetzner migration (2026-05), `majordiscord` was migrated with only the application layer (PhantomBot, Red-DiscordBot) and core infrastructure (Netdata, Tailscale, fail2ban). Missing from the new box: Postfix (email relay), logwatch, ClamAV, and dnf-automatic. The gap went unnoticed for a week because all monitoring email depended on the missing Postfix.
## The Checklist
### Before Migration
Power on both old and new boxes. Run this comparison to find gaps:
```bash
# Fedora — list baseline packages on both hosts
ssh root@OLD_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'
ssh root@NEW_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'
# Ubuntu: list baseline packages on both hosts
# (only rows in state "ii" = installed; match on the package name, not the description)
ssh root@OLD_HOST 'dpkg -l | awk "/^ii/ {print \$2}" | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale" | sort'
ssh root@NEW_HOST 'dpkg -l | awk "/^ii/ {print \$2}" | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale" | sort'
```
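The two listings above still have to be compared by eye. A minimal sketch of automating that step with `comm`, assuming each host's package list has been saved to a file first; the helper name `missing_on_new` and the file names `old.txt` / `new.txt` are hypothetical and not part of the fleet tooling.

```bash
# Hypothetical helper: print every package that appears in the OLD list
# but not in the NEW list. Both arguments are files, one package name per line.
missing_on_new() {
    old_sorted=$(mktemp) && new_sorted=$(mktemp) || return 1
    # comm requires sorted input; -u also drops duplicate names.
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # -23 suppresses lines unique to the second file and lines common to both,
    # leaving only what exists on the old box.
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Example usage (OLD_HOST / NEW_HOST are placeholders, as above):
#   ssh root@OLD_HOST 'rpm -qa --qf "%{NAME}\n"' > old.txt
#   ssh root@NEW_HOST 'rpm -qa --qf "%{NAME}\n"' > new.txt
#   missing_on_new old.txt new.txt
```

Empty output means nothing installed on the old box is absent from the new one. Comparing full package lists this way is noisier than the filtered `grep`, but it does not rely on remembering every baseline name in advance.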
Compare enabled services:
```bash
ssh root@HOST 'systemctl list-unit-files --state=enabled --no-pager | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld|sshd"'
```
### Baseline Components
Every server in the fleet should have these. Check each one after migration:
| Component | Package (Fedora) | Package (Ubuntu) | Ansible Playbook | Notes |
|-----------|-----------------|------------------|------------------|-------|
| Monitoring | `netdata` | `netdata` | `netdata.yml` | Claim to Netdata Cloud if applicable |
| VPN | `tailscale` | `tailscale` | — (manual join) | Rename node in Tailscale admin |
| Intrusion prevention | `fail2ban` | `fail2ban` | `harden.yml` | Check `jail.local`; confirm `banaction` matches the host's firewall |
| Email relay | `postfix` | `postfix` | `configure_postfix_relay.yml` | Required by logwatch, Netdata, fail2ban |
| Log summaries | `logwatch` | `logwatch` | `logwatch.yml` | Override file, not defaults — see [logwatch fleet setup](../monitoring/logwatch-fleet-setup.md) |
| Firewall | `firewalld` | `ufw` | `configure_firewall_*.yml` | Verify the fail2ban `banaction` matches this firewall backend |
| Cron | `cronie` | `cron` | — (usually pre-installed) | Required by logwatch |
| Auto-updates | `dnf-automatic` | `unattended-upgrades` | `ansible-unattended-upgrades-fleet` | Security patches only |
| Antivirus | `clamav` | `clamav` | `clamav.yml` (clamav role) | Internet-facing hosts only |
| SSH hardening | `openssh-server` | `openssh-server` | `ssh_hardening.yml` (ssh_hardening role) | Key-only, no root password |
| Timezone | — | — | — | US servers: `America/New_York`; UK: `Europe/London`. Hetzner defaults to UTC. |
| CA bundle (Fedora) | `ca-certificates` | `ca-certificates` | — | Verify `/etc/pki/tls/certs/ca-bundle.crt` symlink exists — see [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md) |
| Syslog (Fedora) | `rsyslog` | — (pre-installed) | — | Fedora 44 Hetzner images have journald only. Logwatch needs `/var/log/messages` + `/var/log/secure`. |
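
The table can also be checked mechanically against a host's installed packages. The following is a rough sketch, not an existing fleet script: the helper name `report_missing` is hypothetical, and the package names in the usage line are the Fedora column above.

```bash
# Hypothetical helper: read installed package names on stdin (one per line)
# and print each required package, given as arguments, that is not among them.
report_missing() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x matches the whole line, -F treats the name as a literal string.
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" \
            || printf 'MISSING: %s\n' "$pkg"
    done
}

# Example usage on a Fedora host (NEW_HOST is a placeholder):
#   ssh root@NEW_HOST 'rpm -qa --qf "%{NAME}\n"' | report_missing \
#       netdata tailscale fail2ban postfix logwatch firewalld cronie \
#       dnf-automatic openssh-server ca-certificates rsyslog
```

`clamav` is left out of the usage line because the table limits it to internet-facing hosts; add it where it applies. An installed package is not the same as a working service, so this complements the checks in the next section rather than replacing them.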
### After Migration
1. **Set the timezone**: `timedatectl set-timezone America/New_York` (US) or `Europe/London` (UK). Hetzner images default to UTC.
2. **Set the system hostname**: Hetzner provisions the box as `<host>-hetzner`. Run `hostnamectl set-hostname <host>` and fix the loopback line: `sed -i "s/127.0.1.1.*/127.0.1.1 <host> <host>/" /etc/hosts`. Skip this and **Logwatch emails arrive titled `Logwatch for <host>-hetzner`** weeks later. Do it alongside the Tailscale node rename and Postfix `myhostname`; all three read from the provisioning label. See [Logwatch wrong hostname after migration](../../05-troubleshooting/logwatch-wrong-hostname-after-migration.md).
3. **Verify CA bundle (Fedora)**: `ls /etc/pki/tls/certs/ca-bundle.crt`. If missing, Postfix TLS, curl, and dnf will all fail silently. See [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md).
4. **Run `harden.yml` against the new host**: catches most gaps in one pass.
5. **Send a test email**: `echo test | mail -s "test" marcus@majorshouse.com`. If this fails, nothing else can alert you.
6. **Verify crond is running**: `systemctl is-active crond` (Fedora) or `systemctl is-active cron` (Ubuntu). cronie can be `enabled` but not `active` after provisioning.
7. **Check Netdata Cloud**: verify the new node appears and alerts are flowing.
8. **Compare fail2ban jails**: `fail2ban-client status` on both old and new.
9. **Verify logwatch sends**: `sudo logwatch --output mail --range today`
10. **Keep the old box powered off but not destroyed** for at least 7 days after remediation.
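
A few of the steps above (hostname, timezone, CA bundle, cron) lend themselves to a scripted check. The sketch below is illustrative only: the function names `has_provisioning_suffix` and `post_migration_checks` are hypothetical, the default timezone is an assumption taken from the US case, and it does not cover email, Netdata, fail2ban, or logwatch.

```bash
# Returns 0 if the given name still carries the Hetzner provisioning suffix.
has_provisioning_suffix() {
    case "$1" in
        *-hetzner) return 0 ;;
        *)         return 1 ;;
    esac
}

# Hypothetical wrapper; run on the new host. Optional first argument is the
# expected timezone (defaults to America/New_York as an assumption).
post_migration_checks() {
    expected_tz="${1:-America/New_York}"
    status=0
    if has_provisioning_suffix "$(hostname)"; then
        echo "FAIL: hostname still ends in -hetzner"; status=1
    fi
    if [ "$(timedatectl show -p Timezone --value)" != "$expected_tz" ]; then
        echo "FAIL: timezone is not $expected_tz"; status=1
    fi
    # Fedora-style layout only: skip on hosts without /etc/pki/tls/certs.
    if [ -d /etc/pki/tls/certs ] && [ ! -e /etc/pki/tls/certs/ca-bundle.crt ]; then
        echo "FAIL: /etc/pki/tls/certs/ca-bundle.crt is missing"; status=1
    fi
    # The unit is crond on Fedora and cron on Ubuntu; either one counts.
    if ! systemctl is-active --quiet crond && ! systemctl is-active --quiet cron; then
        echo "FAIL: cron daemon is not active"; status=1
    fi
    return "$status"
}

# Example usage on the new host:
#   post_migration_checks Europe/London && echo "baseline checks passed"
```

A clean run only means these four items look right; the test email in step 5 remains the check that matters most, since every other alert depends on it.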
### Using doctl to Manage Old Droplets
```bash
# Authenticate (token from Ansible vault)
cd ~/MajorAnsible
ansible-vault view group_vars/all/vault.yml | grep vault_do_oauth_token | awk '{print $2}' | xargs doctl auth init --access-token
# List droplets
doctl compute droplet list --format Name,ID,Status,PublicIPv4
# Power on for comparison
doctl compute droplet-action power-on DROPLET_ID
# Power off when done
doctl compute droplet-action power-off DROPLET_ID
```
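The `grep | awk` pipeline in the authentication step passes through whatever follows the key, so a quoted YAML value would reach `doctl` with its quotes attached. A small sketch of a more forgiving extraction, assuming the vault stores the token as a simple top-level `key: value` line; the helper name `extract_vault_value` is hypothetical.

```bash
# Hypothetical helper: read YAML on stdin and print the value of the
# top-level key named in $1, with any surrounding quotes removed.
# Only handles simple single-line "key: value" entries.
extract_vault_value() {
    sed -n "s/^$1:[[:space:]]*//p" | head -n 1 | tr -d "\"'"
}

# Example usage (same vault path and key as above):
#   token=$(ansible-vault view group_vars/all/vault.yml | extract_vault_value vault_do_oauth_token)
#   doctl auth init --access-token "$token"
```

This is not a YAML parser: nested keys, multi-line values, and trailing comments are out of scope.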
## Lesson Learned
Application migration is not server migration. The app can work perfectly while the monitoring, alerting, and email infrastructure is completely broken. Always compare the full package baseline between old and new boxes before calling a migration complete.