majorwiki/02-selfhosting/cloud/vps-migration-baseline-checklist.md
Marcus Summers 06a794316b docs: point Ansible references at the new roles (clamav/ssh_hardening/tailscale)
Operational/how-to references updated to the role entry playbooks after the
ADR-0001 migration. Historical incident narrative (dated callouts, commit
refs) preserved.

- clamav-fleet-deployment: override + re-run -> clamav.yml; role note
- ssh-hardening-ansible-fleet: note this is now the ssh_hardening role
- vps-migration-baseline-checklist: table -> clamav.yml / ssh_hardening.yml
- ssh-socket-tailscale-race-condition: Affected Hosts + Prevention + References
  -> tailscale role tasks (network_wait/ssh_only_ubuntu/ssh_only_fedora)
- freshclam-logwatch-false-no-updates: codify refs -> clamav role
2026-06-11 11:33:42 -04:00

title: VPS Migration Baseline Checklist
description: What to verify after migrating a server to a new provider: the packages, services, and configs that must match the old box
tags: migration, vps, hetzner, digitalocean, ansible, checklist
status: published
created: 2026-05-09
updated: 2026-05-13T10:35

VPS Migration Baseline Checklist

When migrating a server from one VPS provider to another, it's easy to focus on the application (bots, web services, databases) and forget the infrastructure baseline. This checklist covers the common components that make a server operational beyond just running the app.

Background

During the Hetzner migration (2026-05), majordiscord was migrated with only the application layer (PhantomBot, Red-DiscordBot) and core infrastructure (Netdata, Tailscale, fail2ban). Missing from the new box: Postfix (email relay), logwatch, ClamAV, and dnf-automatic. The gap went unnoticed for a week because all monitoring email depended on the missing Postfix.

The Checklist

Before Migration

Power on both old and new boxes. Run this comparison to find gaps:

# Fedora — list baseline packages on both hosts
ssh root@OLD_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'
ssh root@NEW_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'

# Ubuntu: list baseline packages on both hosts
# (only "ii" rows are fully installed; "rc" rows are removed packages with leftover config)
ssh root@OLD_HOST 'dpkg -l | awk "/^ii/ {print \$2}" | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale|cron|ufw" | sort'
ssh root@NEW_HOST 'dpkg -l | awk "/^ii/ {print \$2}" | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale|cron|ufw" | sort'
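
The two listings above still have to be compared by eye. A minimal sketch of automating that step, assuming each host's output has been saved to a local file with one package name per line. The function name `missing_on_new` and the file names are illustrative and not part of the fleet tooling.

```shell
# Print packages present in the OLD host's list but absent from the NEW host's list.
# $1 and $2 are files with one package name per line (hypothetical inputs).
missing_on_new() {
  old_sorted=$(mktemp)
  new_sorted=$(mktemp)
  # comm requires both inputs to be sorted in the same collation order.
  LC_ALL=C sort -u "$1" > "$old_sorted"
  LC_ALL=C sort -u "$2" > "$new_sorted"
  # -2 drops lines unique to the new list, -3 drops lines common to both.
  LC_ALL=C comm -23 "$old_sorted" "$new_sorted"
  rm -f "$old_sorted" "$new_sorted"
}
```

Redirect each `ssh` command into `old.txt` and `new.txt`, then run `missing_on_new old.txt new.txt`. An empty result only means nothing matching the grep pattern is missing; packages outside the pattern are not checked.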

Compare enabled services:

ssh root@HOST 'systemctl list-unit-files --state=enabled --no-pager | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld|sshd"'
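
To diff the enabled services between hosts, it helps to reduce the `systemctl` output to bare unit names first. The sketch below assumes the default column layout of `systemctl list-unit-files` (unit name first, state second); `enabled_units` is an illustrative helper name.

```shell
# Read `systemctl list-unit-files` output on stdin and print only the names of
# enabled units, one per line, sorted. Header and footer lines are skipped
# because their second column is never the word "enabled".
enabled_units() {
  awk '$2 == "enabled" { print $1 }' | LC_ALL=C sort -u
}
```

For example, pipe `ssh root@OLD_HOST 'systemctl list-unit-files --state=enabled --no-pager'` through `enabled_units` into a file, do the same for the new host, and compare the two files with `diff`.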

Baseline Components

Every server in the fleet should have these. Check each one after migration:

| Component | Package (Fedora) | Package (Ubuntu) | Ansible Playbook | Notes |
| --- | --- | --- | --- | --- |
| Monitoring | netdata | netdata | netdata.yml | Claim to Netdata Cloud if applicable |
| VPN | tailscale | tailscale | n/a (manual join) | Rename node in Tailscale admin |
| Intrusion prevention | fail2ban | fail2ban | harden.yml | Check jail.local, banaction matches firewall |
| Email relay | postfix | postfix | configure_postfix_relay.yml | Required by logwatch, Netdata, fail2ban |
| Log summaries | logwatch | logwatch | logwatch.yml | Override file, not defaults; see logwatch fleet setup |
| Firewall | firewalld | ufw | configure_firewall_*.yml | Verify fail2ban banaction matches |
| Cron | cronie | cron | n/a (usually pre-installed) | Required by logwatch |
| Auto-updates | dnf-automatic | unattended-upgrades | ansible-unattended-upgrades-fleet | Security patches only |
| Antivirus | clamav | clamav | clamav.yml (clamav role) | Internet-facing hosts only |
| SSH hardening | openssh-server | openssh-server | ssh_hardening.yml (ssh_hardening role) | Key-only, no root password |
| Timezone | | | | US servers: America/New_York; UK: Europe/London. Hetzner defaults to UTC. |
| CA bundle (Fedora) | ca-certificates | ca-certificates | | Verify the /etc/pki/tls/certs/ca-bundle.crt symlink exists; see Fedora CA bundle fix |
| Syslog (Fedora) | rsyslog | n/a (pre-installed) | | Fedora 44 Hetzner images have journald only. Logwatch needs /var/log/messages + /var/log/secure. |
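
The "banaction matches firewall" note appears twice in the table, so it may be worth scripting. The sketch below assumes a mapping that the table does not spell out: hosts running firewalld should use one of fail2ban's `firewallcmd-*` actions, and hosts running ufw should use the `ufw` action. The function name and the idea of reading only the first `banaction` line are illustrative simplifications.

```shell
# Succeed when the banaction in a fail2ban jail file fits the host firewall.
# $1: firewall in use ("firewalld" or "ufw"); $2: path to jail.local.
# Assumed mapping: firewalld -> firewallcmd-* actions, ufw -> the ufw action.
banaction_matches_firewall() {
  firewall=$1
  jail_file=$2
  # Take the first "banaction = value" line and strip blanks from the value.
  banaction=$(awk -F= '/^[ \t]*banaction[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$jail_file")
  case "$firewall:$banaction" in
    firewalld:firewallcmd-*) return 0 ;;
    ufw:ufw) return 0 ;;
    *)
      echo "banaction '${banaction:-unset}' does not match firewall '$firewall'" >&2
      return 1
      ;;
  esac
}
```

On a Fedora host this would be called as `banaction_matches_firewall firewalld /etc/fail2ban/jail.local`. Per-jail overrides further down the file are not inspected.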

After Migration

  1. Set the timezone: `timedatectl set-timezone America/New_York` (US) or `Europe/London` (UK). Hetzner images default to UTC.
  2. Verify the CA bundle (Fedora): `ls /etc/pki/tls/certs/ca-bundle.crt`. If it is missing, Postfix TLS, curl, and dnf will all fail silently. See Fedora CA bundle fix.
  3. Run `harden.yml` against the new host. This catches most gaps in one pass.
  4. Send a test email: `echo test | mail -s "test" marcus@majorshouse.com`. If this fails, nothing else can alert you.
  5. Verify crond is running: `systemctl is-active crond` (Fedora) or `systemctl is-active cron` (Ubuntu). cronie can be enabled but not active after provisioning.
  6. Check Netdata Cloud: verify the new node appears and alerts are flowing.
  7. Compare fail2ban jails: run `fail2ban-client status` on both old and new.
  8. Verify logwatch sends: `sudo logwatch --output mail --range today`.
  9. Keep the old box powered off but not destroyed for at least 7 days after remediation.
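
Steps 1, 2, and 5 are local checks that can be bundled into one pass. The following is a minimal sketch, assuming a Fedora host and a systemd recent enough to support `timedatectl show`; the `report` and `post_migration_checks` helpers are hypothetical names, not existing fleet scripts.

```shell
# Run a check command and print a one-line OK/FAIL verdict for it.
# $1 is a label; the remaining arguments are the command to run.
report() {
  label=$1
  shift
  if "$@" > /dev/null 2>&1; then
    echo "OK   $label"
  else
    echo "FAIL $label"
  fi
}

# Fedora-flavoured checks for steps 1, 2, and 5; adjust unit names for Ubuntu.
post_migration_checks() {
  expected_tz=${1:-America/New_York}
  report "timezone is $expected_tz" \
    sh -c '[ "$(timedatectl show --property=Timezone --value)" = "$1" ]' sh "$expected_tz"
  # test -s follows symlinks, so a dangling ca-bundle.crt link also fails.
  report "CA bundle present" test -s /etc/pki/tls/certs/ca-bundle.crt
  report "crond active" systemctl is-active --quiet crond
}
```

Run `post_migration_checks` on the new host (or `post_migration_checks Europe/London` for a UK server). The email, Netdata Cloud, and logwatch steps still need a human to confirm that the message or alert actually arrived.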

Using doctl to Manage Old Droplets

# Authenticate (token from Ansible vault)
cd ~/MajorAnsible
ansible-vault view group_vars/all/vault.yml | grep vault_do_oauth_token | awk '{print $2}' | xargs doctl auth init --access-token

# List droplets
doctl compute droplet list --format Name,ID,Status,PublicIPv4

# Power on for comparison
doctl compute droplet-action power-on DROPLET_ID

# Power off when done
doctl compute droplet-action power-off DROPLET_ID
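
The power-on and power-off commands need a numeric droplet ID. As a small convenience, the sketch below looks the ID up by name from the list output. It assumes the listing is produced with `--format Name,ID --no-header`; `droplet_id_by_name` is an illustrative helper name.

```shell
# Read `doctl compute droplet list --format Name,ID --no-header` output on
# stdin and print the ID of the droplet whose name equals $1.
droplet_id_by_name() {
  awk -v name="$1" '$1 == name { print $2 }'
}
```

For example, `doctl compute droplet list --format Name,ID --no-header | droplet_id_by_name OLD_DROPLET_NAME` prints the ID to pass to `doctl compute droplet-action power-on`. It prints nothing if no droplet has that name.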

Lesson Learned

Application migration is not server migration. The app can work perfectly while the monitoring, alerting, and email infrastructure is completely broken. Always compare the full package baseline between old and new boxes before calling a migration complete.