Merge branch 'code/majorair/logwatch-ca-bundle-docs'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Marcus Summers 2026-05-11 07:37:48 -04:00
commit 3df0979786
4 changed files with 229 additions and 3 deletions


@ -0,0 +1,96 @@
---
title: VPS Migration Baseline Checklist
description: What to verify after migrating a server to a new provider — the packages, services, and configs that must match the old box
tags:
- migration
- vps
- hetzner
- digitalocean
- ansible
- checklist
status: published
created: 2026-05-09
updated: 2026-05-11T07:33
---
# VPS Migration Baseline Checklist
When migrating a server from one VPS provider to another, it's easy to focus on the application (bots, web services, databases) and forget the infrastructure baseline. This checklist covers the common components that make a server operational beyond just running the app.
## Background
During the Hetzner migration (2026-05), `majordiscord` was migrated with only the application layer (PhantomBot, Red-DiscordBot) and core infrastructure (Netdata, Tailscale, fail2ban). Missing from the new box: Postfix (email relay), logwatch, ClamAV, and dnf-automatic. The gap went unnoticed for a week because all monitoring email depended on the missing Postfix.
## The Checklist
### Before Migration
Power on both old and new boxes. Run this comparison to find gaps:
```bash
# Fedora — list baseline packages on both hosts
ssh root@OLD_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'
ssh root@NEW_HOST 'rpm -qa --qf "%{NAME}\n" | sort | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld"'
# Ubuntu — list baseline packages on both hosts
ssh root@OLD_HOST 'dpkg -l | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale" | awk "{print \$2}" | sort'
ssh root@NEW_HOST 'dpkg -l | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|unattended|tailscale" | awk "{print \$2}" | sort'
```
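The two listings above still have to be compared by eye. A minimal sketch of automating that step, assuming each host's package list has been saved to a local file first. The function name `missing_on_new` and the file names are hypothetical, not part of the existing tooling.

```bash
# Hypothetical helper: print packages that appear in the old host's list
# but not in the new host's list.
#   $1 = file with one package name per line from the old host
#   $2 = same, from the new host
missing_on_new() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
  grep -Fxv -f "$2" "$1" | sort -u
}

# Usage (file names are placeholders):
#   ssh root@OLD_HOST 'rpm -qa --qf "%{NAME}\n"' > old-packages.txt
#   ssh root@NEW_HOST 'rpm -qa --qf "%{NAME}\n"' > new-packages.txt
#   missing_on_new old-packages.txt new-packages.txt
```

Anything this prints is installed on the old box only, which is the kind of gap that hid the missing Postfix.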
Compare enabled services:
```bash
ssh root@HOST 'systemctl list-unit-files --state=enabled --no-pager | grep -iE "fail2ban|logwatch|postfix|netdata|clamav|dnf-auto|tailscale|cronie|firewalld|sshd"'
```
### Baseline Components
Every server in the fleet should have these. Check each one after migration:
| Component | Package (Fedora) | Package (Ubuntu) | Ansible Playbook | Notes |
|-----------|-----------------|------------------|------------------|-------|
| Monitoring | `netdata` | `netdata` | `netdata.yml` | Claim to Netdata Cloud if applicable |
| VPN | `tailscale` | `tailscale` | — (manual join) | Rename node in Tailscale admin |
| Intrusion prevention | `fail2ban` | `fail2ban` | `harden.yml` | Check `jail.local`; `banaction` must match the active firewall |
| Email relay | `postfix` | `postfix` | `configure_postfix_relay.yml` | Required by logwatch, Netdata, fail2ban |
| Log summaries | `logwatch` | `logwatch` | `logwatch.yml` | Override file, not defaults — see [logwatch fleet setup](../monitoring/logwatch-fleet-setup.md) |
| Firewall | `firewalld` | `ufw` | `configure_firewall_*.yml` | Verify fail2ban banaction matches |
| Cron | `cronie` | `cron` | — (usually pre-installed) | Required by logwatch |
| Auto-updates | `dnf-automatic` | `unattended-upgrades` | `ansible-unattended-upgrades-fleet` | Security patches only |
| Antivirus | `clamav` | `clamav` | `configure_clamav.yml` | Internet-facing hosts only |
| SSH hardening | `openssh-server` | `openssh-server` | `configure_ssh_hardening.yml` | Key-only, no root password |
| Timezone | — | — | — | US servers: `America/New_York`; UK: `Europe/London`. Hetzner defaults to UTC. |
| CA bundle (Fedora) | `ca-certificates` | `ca-certificates` | — | Verify `/etc/pki/tls/certs/ca-bundle.crt` symlink exists — see [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md) |
### After Migration
1. **Set the timezone**: `timedatectl set-timezone America/New_York` (US) or `timedatectl set-timezone Europe/London` (UK). Hetzner images default to UTC.
2. **Verify CA bundle (Fedora)**: `ls /etc/pki/tls/certs/ca-bundle.crt`. If it is missing, Postfix TLS relay fails silently (mail queues as deferred), and curl and dnf fail with CA certificate errors. See [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md).
3. **Run `harden.yml` against the new host**: this catches most gaps in one pass.
4. **Send a test email**: `echo test | mail -s "test" marcus@majorshouse.com`. If this fails, nothing else can alert you.
5. **Verify crond is running**: `systemctl is-active crond` (Fedora) or `systemctl is-active cron` (Ubuntu). cronie can be `enabled` but not `active` after provisioning.
6. **Check Netdata Cloud**: verify the new node appears and alerts are flowing.
7. **Compare fail2ban jails**: run `fail2ban-client status` on both old and new.
8. **Verify logwatch sends**: `sudo logwatch --output mail --range today`.
9. **Keep the old box powered off but not destroyed** for at least 7 days after remediation.
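Several of the checks in this list can be scripted so they run the same way on every new host. The following is a sketch only: `run_check` and `post_migration_checks` are hypothetical names, and the example assumes a US-based Fedora host (swap `crond` for `cron` and change the timezone for other hosts).

```bash
# Hypothetical helper: run one command, print PASS or FAIL with a label,
# and count failures in the global variable `failures`.
failures=0
run_check() {
  label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    failures=$((failures + 1))
  fi
}

# Example checks for a Fedora host in the US (assumed values).
post_migration_checks() {
  run_check "timezone is America/New_York" \
    sh -c '[ "$(timedatectl show -p Timezone --value)" = "America/New_York" ]'
  run_check "CA bundle present" test -e /etc/pki/tls/certs/ca-bundle.crt
  run_check "crond active" systemctl is-active --quiet crond
  run_check "postfix active" systemctl is-active --quiet postfix
  echo "$failures check(s) failed"
}
```

Calling `post_migration_checks` on the new host prints one line per check. The test email and the Netdata Cloud check still need a human to confirm the result arrived.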
### Using doctl to Manage Old Droplets
```bash
# Authenticate (token from Ansible vault)
cd ~/MajorAnsible
ansible-vault view group_vars/all/vault.yml | grep vault_do_oauth_token | awk '{print $2}' | xargs doctl auth init --access-token
# List droplets
doctl compute droplet list --format Name,ID,Status,PublicIPv4
# Power on for comparison
doctl compute droplet-action power-on DROPLET_ID
# Power off when done
doctl compute droplet-action power-off DROPLET_ID
```
## Lesson Learned
Application migration is not server migration. The app can work perfectly while the monitoring, alerting, and email infrastructure is completely broken. Always compare the full package baseline between old and new boxes before calling a migration complete.


@ -9,7 +9,7 @@ tags:
- ubuntu
status: published
created: 2026-05-09
updated: 2026-05-10T13:00
updated: 2026-05-11T07:37
---
# Logwatch Fleet Setup — Surviving Package Upgrades
@ -91,10 +91,22 @@ Include it in `harden.yml` so every new server gets logwatch as part of the base
After deploying, test immediately:
```bash
# Verify crond is actually running — cronie can be "enabled" but not "active"
systemctl is-active crond # Fedora
systemctl is-active cron # Ubuntu
# If inactive, start it
sudo systemctl start crond
# Then test logwatch manually
sudo logwatch --output mail --range today
```
Check that the email arrives. If it doesn't, verify Postfix is installed and relaying correctly — logwatch depends on a working local MTA.
Check that the email arrives. If it doesn't, verify:
1. **crond is running** — if `inactive`, cron.daily never fires and logwatch never runs. No errors anywhere.
2. **Postfix is installed and relaying** — logwatch depends on a working local MTA.
3. **CA bundle exists (Fedora)** — missing `/etc/pki/tls/certs/ca-bundle.crt` breaks Postfix TLS relay. See [Fedora CA bundle fix](../../05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md).
## Diagnosing Silent Failures


@ -0,0 +1,116 @@
---
title: "Fedora CA Bundle Missing Symlink — TLS Breaks Fleet-Wide"
description: Hetzner-provisioned Fedora images may be missing the /etc/pki/tls/certs/ca-bundle.crt symlink, silently breaking Postfix TLS relay, curl, and dnf
tags:
- fedora
- tls
- postfix
- ca-certificates
- hetzner
- troubleshooting
status: published
created: 2026-05-11
updated: 2026-05-11
---
# Fedora CA Bundle Missing Symlink
On Fedora, many TLS clients (Postfix, curl, dnf) look for the CA bundle at `/etc/pki/tls/certs/ca-bundle.crt`. This path is normally a symlink to `/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem`, shipped by the `ca-certificates` package.
On Hetzner Cloud Fedora images (observed on Fedora 44, May 2026), this symlink can be missing despite `ca-certificates` being installed. The extracted bundle exists, but the consumer-facing symlink does not.
## Symptoms
Postfix relay to a TLS-required upstream fails:
```
postfix/smtp: cannot load Certification Authority data,
CAfile="/etc/pki/tls/certs/ca-bundle.crt",
CApath="/etc/pki/tls/certs": disabling TLS support
```
If your relay requires TLS (port 465 with `smtp_tls_wrappermode = yes`, or `smtp_tls_security_level = encrypt`), mail silently queues as deferred. No bounce, no alert — just silence.
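Because deferred mail produces no bounce, one way to notice the problem is to watch the queue size. A minimal sketch, assuming `postqueue -p` ends with its usual summary line (`-- N Kbytes in M Requests.`) or prints `Mail queue is empty`; the helper name `queued_count` is hypothetical. It counts every queued message, not only deferred ones.

```bash
# Hypothetical helper: read `postqueue -p` output on stdin and print the
# number of queued messages (0 when no summary line is present, which is
# the case for an empty queue).
queued_count() {
  awk '/^-- .* Requests?\.$/ { n = $5 } END { print n + 0 }'
}

# Usage on the affected host:
#   postqueue -p | queued_count
```

A count that keeps growing on a host that should relay immediately is a reasonable prompt to check the Postfix log for the error shown above.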
Other symptoms on the same box:
```bash
# curl fails
curl https://example.com
# error: Problem with the SSL CA cert (path? access rights?)
# dnf fails
dnf list --installed
# Curl error (77): Problem with the SSL CA cert
```
## Diagnosis
```bash
# Check the symlink
ls -la /etc/pki/tls/certs/ca-bundle.crt
# Expected: symlink -> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
# Broken: "No such file or directory"
# Verify the extracted bundle exists
ls -la /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
# Should exist (~220 KB, ~140-150 certs)
# Confirm the package is installed
rpm -q ca-certificates
# Should return a version string
```
If the extracted bundle exists but the symlink at `/etc/pki/tls/certs/ca-bundle.crt` is missing, that's the problem.
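The decision above can be expressed as a small function, which is convenient when checking many hosts. This is a sketch: `ca_bundle_state` is a hypothetical name, and both paths default to the Fedora locations discussed in this article but can be overridden.

```bash
# Hypothetical helper: classify the CA bundle state on a host.
#   $1 = consumer-facing path (default: the Fedora symlink location)
#   $2 = extracted bundle path (default: the Fedora extracted bundle)
ca_bundle_state() {
  link="${1:-/etc/pki/tls/certs/ca-bundle.crt}"
  target="${2:-/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem}"
  if [ ! -s "$target" ]; then
    echo "bundle-missing"     # extracted bundle absent or empty
  elif [ ! -e "$link" ]; then
    echo "symlink-missing"    # the case described here; also a dangling link
  else
    echo "ok"
  fi
}
```

`[ -e ]` follows symlinks, so a link that points at a nonexistent file is reported as `symlink-missing` rather than `ok`.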
## Fix
```bash
sudo ln -sf /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem \
/etc/pki/tls/certs/ca-bundle.crt
sudo systemctl restart postfix
sudo postqueue -f # flush any deferred mail
```
Verify:
```bash
# Symlink exists
ls -la /etc/pki/tls/certs/ca-bundle.crt
# Postfix can relay
echo "Subject: TLS test" | sendmail -v marcus@majorshouse.com
# curl works
curl -sI https://example.com | head -1
```
## Fleet Audit
If one Hetzner-provisioned Fedora host has this issue, check the others:
```bash
for host in majordiscord majorlab majorhome majormail; do
echo "$host: $(ssh "root@$host" 'ls /etc/pki/tls/certs/ca-bundle.crt 2>&1' | tail -1)"
done
```
Hosts returning "No such file or directory" are broken for every TLS client that reads this path; Postfix in particular fails without any visible error.
## Why This Happens
`update-ca-trust extract` regenerates the files under `/etc/pki/ca-trust/extracted/` but does not create the legacy consumer-path symlink at `/etc/pki/tls/certs/ca-bundle.crt`. That symlink is shipped by the `ca-certificates` RPM. On cloud images built from minimal installs or snapshot-based provisioning, the symlink can be lost during image creation or a partial upgrade.
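If the RPM owns the symlink as described, `rpm -V ca-certificates` should list it as missing, which confirms the cause without guessing. A sketch that filters that output: the helper name `missing_rpm_files` is hypothetical, and it assumes the usual `rpm -V` format, where a missing file appears on a line that starts with `missing` and ends with the path.

```bash
# Hypothetical helper: read `rpm -V <package>` output on stdin and print
# only the paths that rpm reports as missing.
missing_rpm_files() {
  awk '$1 == "missing" { print $NF }'
}

# Usage on the affected host:
#   rpm -V ca-certificates | missing_rpm_files
```

Paths containing spaces would be truncated by this filter, which is not a concern for the files in this package.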
## Prevention
Add to your provisioning checklist (see [VPS Migration Baseline Checklist](../../02-selfhosting/cloud/vps-migration-baseline-checklist.md)):
```bash
# Fedora provisioning: recreate the CA bundle symlink if it is missing.
# [ -e ] follows symlinks, so a dangling link is repaired as well.
[ -e /etc/pki/tls/certs/ca-bundle.crt ] || \
ln -sf /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem /etc/pki/tls/certs/ca-bundle.crt
```
## Related
- [Logwatch Fleet Setup](../../02-selfhosting/monitoring/logwatch-fleet-setup.md) — logwatch depends on a working Postfix relay, which depends on TLS, which depends on this symlink
- [VPS Migration Baseline Checklist](../../02-selfhosting/cloud/vps-migration-baseline-checklist.md) — includes CA bundle verification step


@ -1,6 +1,6 @@
---
created: 2026-04-02T16:03
updated: 2026-05-10T00:10
updated: 2026-05-11T07:35
---
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)
@ -28,6 +28,7 @@ updated: 2026-05-10T00:10
* [Wake-on-LAN via Router SSH](02-selfhosting/dns-networking/wake-on-lan-router-ssh.md)
* [Pi-hole v6 Group Management — Per-Client DNS Rules](02-selfhosting/dns-networking/pihole-v6-group-management.md)
* [AWS S3 Cost Management](02-selfhosting/cloud/aws-s3-cost-management.md)
* [VPS Migration Baseline Checklist](02-selfhosting/cloud/vps-migration-baseline-checklist.md)
* [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md)
* [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md)
* [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md)
@ -105,6 +106,7 @@ updated: 2026-05-10T00:10
* [iOS Tailscale Clients Report HostName="localhost" — Breaks /etc/hosts Generators](05-troubleshooting/networking/tailscale-status-json-hostname-localhost-ios.md)
* [macOS: Repeating Alert Tone from Mirrored iPhone Notification](05-troubleshooting/macos-mirrored-notification-alert-loop.md)
* [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)
* [Fedora CA Bundle Missing Symlink — TLS Breaks Fleet-Wide](05-troubleshooting/security/fedora-ca-bundle-missing-symlink.md)
* [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md)
* [Ansible: ansible.cfg Ignored on WSL2 Windows Mounts](05-troubleshooting/ansible-wsl2-world-writable-mount-ignores-cfg.md)
* [Ansible: SSH Timeout During dnf upgrade on Fedora Hosts](05-troubleshooting/ansible-ssh-timeout-dnf-upgrade.md)