Merge branch 'code/majorair/logwatch-ca-bundle-docs'
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
commit 3df0979786
4 changed files with 229 additions and 3 deletions
96
02-selfhosting/cloud/vps-migration-baseline-checklist.md
Normal file
@@ -0,0 +1,96 @@
---
title: VPS Migration Baseline Checklist
description: What to verify after migrating a server to a new provider — the packages, services, and configs that must match the old box
tags:
- migration
- vps
- hetzner
- digitalocean
- ansible
- checklist
status: published
created: 2026-05-09
updated: 2026-05-11T07:33
---

# VPS Migration Baseline Checklist

When migrating a server from one VPS provider to another, it's easy to focus on the application (bots, web services, databases) and forget the infrastructure baseline. This checklist covers the common components that make a server operational beyond just running the app.

## Background

During the Hetzner migration (2026-05), `majordiscord` was migrated with only the application layer (PhantomBot, Red-DiscordBot) and core infrastructure (Netdata, Tailscale, fail2ban). Missing from the new box: Postfix (email relay), logwatch, ClamAV, and dnf-automatic. The gap went unnoticed for a week because all monitoring email depended on the missing Postfix.

## The Checklist

### Before Migration

Power on both old and new boxes. Run this comparison to find gaps:

```bash
# Fedora — list baseline packages on both hosts
ssh root@OLD_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'
ssh root@NEW_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'

# Ubuntu — list baseline packages on both hosts
ssh root@OLD_HOST 'dpkg -l | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale" | awk "{print \$2}" | sort'
ssh root@NEW_HOST 'dpkg -l | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale" | awk "{print \$2}" | sort'
```
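
Comparing two grep outputs by eye gets error-prone as the lists grow. As a minimal sketch, assuming each host's package names have been saved to a file with one name per line, `comm` can print only the packages the old host has and the new one lacks. The helper name `missing_packages` and the file names are hypothetical and not part of the original playbooks.

```bash
# missing_packages OLD_LIST NEW_LIST
# Hypothetical helper: prints packages listed in OLD_LIST but absent from NEW_LIST.
# The body runs in a subshell ( ... ) so its variables do not leak into the caller.
missing_packages() (
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    # comm requires sorted input; -u also drops duplicate names
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # -23 suppresses lines unique to NEW_LIST and lines common to both
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
)
```

One way to use it is to redirect the output of the commands above into `old.txt` and `new.txt`, then run `missing_packages old.txt new.txt`. An empty result suggests no baseline package was left behind.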

Compare enabled services:

```bash
# Run once against the old host and once against the new host, then compare the output
ssh root@HOST 'systemctl list-unit-files --state=enabled --no-pager | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld|sshd"'
```

### Baseline Components

Every server in the fleet should have these. Check each one after migration:

| Component | Package (Fedora) | Package (Ubuntu) | Ansible Playbook | Notes |
|-----------|-----------------|------------------|------------------|-------|
| Monitoring | `netdata` | `netdata` | `netdata.yml` | Claim to Netdata Cloud if applicable |
| VPN | `tailscale` | `tailscale` | — (manual join) | Rename node in Tailscale admin |
| Intrusion prevention | `fail2ban` | `fail2ban` | `harden.yml` | Check jail.local, banaction matches firewall |
| Email relay | `postfix` | `postfix` | `configure_postfix_relay.yml` | Required by logwatch, Netdata, fail2ban |
| Log summaries | `logwatch` | `logwatch` | `logwatch.yml` | Override file, not defaults — see [logwatch fleet setup](../monitoring/logwatch-fleet-setup.md) |
| Firewall | `firewalld` | `ufw` | `configure_firewall_*.yml` | Verify fail2ban banaction matches |
| Cron | `cronie` | `cron` | — (usually pre-installed) | Required by logwatch |
| Auto-updates | `dnf-automatic` | `unattended-upgrades` | `ansible-unattended-upgrades-fleet` | Security patches only |
| Antivirus | `clamav` | `clamav` | `configure_clamav.yml` | Internet-facing hosts only |
| SSH hardening | `openssh-server` | `openssh-server` | `configure_ssh_hardening.yml` | Key-only, no root password |
| Timezone | — | — | — | US servers: `America/New_York`; UK: `Europe/London`. Hetzner defaults to UTC. |
| CA bundle (Fedora) | `ca-certificates` | `ca-certificates` | — | Verify `/etc/pki/tls/certs/ca-bundle.crt` symlink exists — see [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md) |

### After Migration

1. **Set the timezone** — `timedatectl set-timezone America/New_York` (US) or `timedatectl set-timezone Europe/London` (UK). Hetzner images default to UTC.
2. **Verify CA bundle (Fedora)** — `ls /etc/pki/tls/certs/ca-bundle.crt`. If it is missing, Postfix TLS relay fails silently (mail is deferred), and curl and dnf fail with CA certificate errors. See [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md).
3. **Run `harden.yml` against the new host** — catches most gaps in one pass
4. **Send a test email** — `echo test | mail -s "test" marcus@majorshouse.com` — if this fails, nothing else can alert you
5. **Verify crond is running** — `systemctl is-active crond` (Fedora) or `systemctl is-active cron` (Ubuntu). cronie can be `enabled` but not `active` after provisioning.
6. **Check Netdata Cloud** — verify the new node appears and alerts are flowing
7. **Compare fail2ban jails** — `fail2ban-client status` on both old and new
8. **Verify logwatch sends** — `sudo logwatch --output mail --range today`
9. **Keep the old box powered off but not destroyed** for at least 7 days after remediation
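
Several of the steps above come down to "run a command and see whether it succeeds", so they can be wrapped in a small reporter. The following is a minimal sketch, not part of the fleet's playbooks: `run_check` is a hypothetical helper, and the example wiring in the comments assumes a Fedora host.

```bash
# run_check LABEL COMMAND [ARGS...]
# Hypothetical helper: runs the command quietly and prints PASS or FAIL with the label.
# Returns non-zero on failure so a calling script can count or abort on problems.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        return 1
    fi
}

# Example wiring for a Fedora host, using commands from the steps above:
# run_check "CA bundle present" test -e /etc/pki/tls/certs/ca-bundle.crt
# run_check "crond active"      systemctl is-active crond
# run_check "fail2ban responds" fail2ban-client status
```

A check like the test email still needs a human to confirm delivery, so this only covers the steps that report their result through an exit status.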

### Using doctl to Manage Old Droplets

```bash
# Authenticate (token from Ansible vault)
cd ~/MajorAnsible
ansible-vault view group_vars/all/vault.yml | grep vault_do_oauth_token | awk '{print $2}' | xargs doctl auth init --access-token

# List droplets
doctl compute droplet list --format Name,ID,Status,PublicIPv4

# Power on for comparison
doctl compute droplet-action power-on DROPLET_ID

# Power off when done
doctl compute droplet-action power-off DROPLET_ID
```

## Lesson Learned

Application migration is not server migration. The app can work perfectly while the monitoring, alerting, and email infrastructure is completely broken. Always compare the full package baseline between old and new boxes before calling a migration complete.

@@ -9,7 +9,7 @@ tags:
- ubuntu
status: published
created: 2026-05-09
updated: 2026-05-10T13:00
updated: 2026-05-11T07:37
---

# Logwatch Fleet Setup — Surviving Package Upgrades

@@ -91,10 +91,22 @@ Include it in `harden.yml` so every new server gets logwatch as part of the base
After deploying, test immediately:

```bash
# Verify crond is actually running — cronie can be "enabled" but not "active"
systemctl is-active crond   # Fedora
systemctl is-active cron    # Ubuntu

# If inactive, start it (the unit is named "cron" on Ubuntu)
sudo systemctl start crond

# Then test logwatch manually
sudo logwatch --output mail --range today
```

Check that the email arrives. If it doesn't, verify Postfix is installed and relaying correctly — logwatch depends on a working local MTA.
Check that the email arrives. If it doesn't, verify:

1. **crond is running** — if `inactive`, cron.daily never fires and logwatch never runs. No errors anywhere.
2. **Postfix is installed and relaying** — logwatch depends on a working local MTA.
3. **CA bundle exists (Fedora)** — missing `/etc/pki/tls/certs/ca-bundle.crt` breaks Postfix TLS relay. See [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md).
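
The cron unit name differs between distributions, which makes it easy to check the wrong service in a mixed fleet. A minimal sketch of picking the name from the `ID` field of `/etc/os-release` follows; `cron_unit_for` is a hypothetical helper, and the distro list is only the subset assumed here.

```bash
# cron_unit_for DISTRO_ID
# Hypothetical helper: maps an os-release ID to the systemd unit that runs cron jobs.
cron_unit_for() {
    case "$1" in
        fedora|rhel|centos) echo crond ;;
        ubuntu|debian)      echo cron ;;
        *) echo "unknown distro: $1" >&2; return 1 ;;
    esac
}

# Usage on a live host (ID is defined by /etc/os-release):
# . /etc/os-release
# unit=$(cron_unit_for "$ID") && systemctl is-active "$unit"
```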

## Diagnosing Silent Failures

116
05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md
Normal file
@@ -0,0 +1,116 @@
---
title: "Fedora CA Bundle Missing Symlink — TLS Breaks Fleet-Wide"
description: Hetzner-provisioned Fedora images may be missing the /etc/pki/tls/certs/ca-bundle.crt symlink, silently breaking Postfix TLS relay, curl, and dnf
tags:
- fedora
- tls
- postfix
- ca-certificates
- hetzner
- troubleshooting
status: published
created: 2026-05-11
updated: 2026-05-11
---

# Fedora CA Bundle Missing Symlink

On Fedora, many TLS clients (Postfix, curl, dnf) look for the CA bundle at `/etc/pki/tls/certs/ca-bundle.crt`. This path is normally a symlink to `/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem`, shipped by the `ca-certificates` package.

On Hetzner Cloud Fedora images (observed on Fedora 44, May 2026), this symlink can be missing despite `ca-certificates` being installed. The extracted bundle exists, but the consumer-facing symlink does not.

## Symptoms

Postfix relay to a TLS-required upstream fails:

```
postfix/smtp: cannot load Certification Authority data,
CAfile="/etc/pki/tls/certs/ca-bundle.crt",
CApath="/etc/pki/tls/certs": disabling TLS support
```

If your relay requires TLS (port 465 with `smtp_tls_wrappermode = yes`, or `smtp_tls_security_level = encrypt`), mail silently queues as deferred: no immediate bounce and no alert.
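
Because the failure is silent, the deferred queue is the most direct place to look. The following is a minimal sketch, assuming Postfix 3.1 or later, where `postqueue -j` prints one JSON object per queued message. `count_deferred` is a hypothetical helper name.

```bash
# count_deferred
# Hypothetical helper: reads `postqueue -j` output on stdin and prints how many
# messages are in the deferred queue. grep -c exits non-zero when the count is 0,
# so "|| true" keeps the function's status clean while still printing "0".
count_deferred() {
    grep -Ec '"queue_name": *"deferred"' || true
}

# Usage on a live host:
# postqueue -j | count_deferred
```

A count that keeps growing after a test email is a strong hint that the relay is refusing or failing TLS. The reason for each deferral is visible in `postqueue -p`.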

Other symptoms on the same box:

```bash
# curl fails
curl https://example.com
# error: Problem with the SSL CA cert (path? access rights?)

# dnf fails on any operation that downloads repository metadata
sudo dnf makecache --refresh
# Curl error (77): Problem with the SSL CA cert
```

## Diagnosis

```bash
# Check the symlink
ls -la /etc/pki/tls/certs/ca-bundle.crt
# Expected: symlink -> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
# Broken: "No such file or directory"

# Verify the extracted bundle exists
ls -la /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
# Should exist (~220 KB, ~140-150 certs)

# Confirm the package is installed
rpm -q ca-certificates
# Should return a version string
```

If the extracted bundle exists but the symlink at `/etc/pki/tls/certs/ca-bundle.crt` is missing, that's the problem.
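
The decision in the previous paragraph can be sketched as a single function that distinguishes the two failure modes. This is an illustration rather than a packaged tool: `ca_bundle_state` and its state names are hypothetical, and the optional root-prefix argument exists only so the logic can be tried in a scratch directory.

```bash
# ca_bundle_state [ROOT]
# Hypothetical helper: prints "ok", "symlink-missing", or "bundle-missing".
# ROOT defaults to empty, which means the live filesystem.
ca_bundle_state() {
    root=${1:-}
    link="$root/etc/pki/tls/certs/ca-bundle.crt"
    bundle="$root/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem"
    if [ ! -s "$bundle" ]; then
        # Extracted bundle absent or empty: a different problem from the one described here
        echo "bundle-missing"
    elif [ ! -e "$link" ]; then
        # -e follows symlinks, so a dangling link is reported here as well
        echo "symlink-missing"
    else
        echo "ok"
    fi
}
```

Only the `symlink-missing` state matches the issue in this article. A `bundle-missing` result points at the `ca-certificates` package or the extraction step instead.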

## Fix

```bash
sudo ln -sf /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem \
  /etc/pki/tls/certs/ca-bundle.crt
sudo systemctl restart postfix
sudo postqueue -f  # flush any deferred mail
```

Verify:

```bash
# Symlink exists
ls -la /etc/pki/tls/certs/ca-bundle.crt

# Postfix can relay
echo "Subject: TLS test" | sendmail -v marcus@majorshouse.com

# curl works
curl -sI https://example.com | head -1
```

## Fleet Audit

If one Hetzner-provisioned Fedora host has this issue, check the others:

```bash
for host in majordiscord majorlab majorhome majormail; do
  echo "$host: $(ssh "root@$host" 'ls /etc/pki/tls/certs/ca-bundle.crt 2>&1' | tail -1)"
done
```

Hosts returning "No such file or directory" are silently broken for every TLS client that relies on this path.

## Why This Happens

`update-ca-trust extract` regenerates the files under `/etc/pki/ca-trust/extracted/` but does not create the legacy consumer-path symlink at `/etc/pki/tls/certs/ca-bundle.crt`. That symlink is shipped by the `ca-certificates` RPM. On cloud images built from minimal installs or snapshot-based provisioning, the symlink can be lost during image creation or a partial upgrade.

## Prevention

Add to your provisioning checklist (see [VPS Migration Baseline Checklist](../../02-selfhosting/cloud/vps-migration-baseline-checklist.md)):

```bash
# Fedora provisioning — verify CA bundle symlink
# [ -e ] prints nothing when the path is absent and also catches a dangling link
[ -e /etc/pki/tls/certs/ca-bundle.crt ] || \
  ln -sf /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem /etc/pki/tls/certs/ca-bundle.crt
```
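
For provisioning scripts that need to report what they did, the same check can be wrapped so it refuses to link to a bundle that is not there. This is a minimal sketch under the same path assumptions as the fix above: `ensure_ca_symlink` is a hypothetical helper, and its optional root-prefix argument is for dry runs in a scratch directory.

```bash
# ensure_ca_symlink [ROOT]
# Hypothetical helper: creates the consumer-path symlink if needed.
# Prints "created" or "unchanged"; fails if the extracted bundle is absent or empty.
ensure_ca_symlink() {
    root=${1:-}
    link="$root/etc/pki/tls/certs/ca-bundle.crt"
    target="$root/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem"
    if [ ! -s "$target" ]; then
        echo "extracted bundle not found: $target" >&2
        return 1
    fi
    if [ -e "$link" ]; then
        echo "unchanged"
    else
        mkdir -p "$(dirname "$link")"
        # -f replaces a dangling link left over from a broken image
        ln -sf "$target" "$link"
        echo "created"
    fi
}
```

Run as root with no argument on a real host. Running it twice should print `created` and then `unchanged`, which makes it safe to call on every provisioning pass.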

## Related

- [Logwatch Fleet Setup](../../02-selfhosting/monitoring/logwatch-fleet-setup.md) — logwatch depends on a working Postfix relay, which depends on TLS, which depends on this symlink
- [VPS Migration Baseline Checklist](../../02-selfhosting/cloud/vps-migration-baseline-checklist.md) — includes CA bundle verification step

@@ -1,6 +1,6 @@
---
created: 2026-04-02T16:03
updated: 2026-05-10T00:10
updated: 2026-05-11T07:35
---
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)

@@ -28,6 +28,7 @@ updated: 2026-05-10T00:10
* [Wake-on-LAN via Router SSH](02-selfhosting/dns-networking/wake-on-lan-router-ssh.md)
* [Pi-hole v6 Group Management — Per-Client DNS Rules](02-selfhosting/dns-networking/pihole-v6-group-management.md)
* [AWS S3 Cost Management](02-selfhosting/cloud/aws-s3-cost-management.md)
* [VPS Migration Baseline Checklist](02-selfhosting/cloud/vps-migration-baseline-checklist.md)
* [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md)
* [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md)
* [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md)

@@ -105,6 +106,7 @@ updated: 2026-05-10T00:10
* [iOS Tailscale Clients Report HostName="localhost" — Breaks /etc/hosts Generators](05-troubleshooting/networking/tailscale-status-json-hostname-localhost-ios.md)
* [macOS: Repeating Alert Tone from Mirrored iPhone Notification](05-troubleshooting/macos-mirrored-notification-alert-loop.md)
* [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)
* [Fedora CA Bundle Missing Symlink — TLS Breaks Fleet-Wide](05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md)
* [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md)
* [Ansible: ansible.cfg Ignored on WSL2 Windows Mounts](05-troubleshooting/ansible-wsl2-world-writable-mount-ignores-cfg.md)
* [Ansible: SSH Timeout During dnf upgrade on Fedora Hosts](05-troubleshooting/ansible-ssh-timeout-dnf-upgrade.md)