| title | domain | category | tags | status | created | updated |
|---|---|---|---|---|---|---|
| App-Consistent Fleet Backups with restic + Backblaze B2 | selfhosting | storage-backup | | published | 2026-06-19 | 2026-06-19 |
App-Consistent Fleet Backups with restic + Backblaze B2
A repeatable pattern for backing up a mixed fleet (Ubuntu + Fedora, VPS + homelab, bare services + Docker) to Backblaze B2 with restic — encrypted, deduplicated, and app-consistent (databases are dumped before the snapshot, not copied live). Driven by Ansible and a per-host systemd timer.
The Short Answer
Per host, nightly: dump every database to a staging dir → restic backup that staging dir plus the data paths → apply retention → wipe staging. A monthly timer runs restic prune. Anything that fails emails the admin. One B2 bucket holds a separate repo per host at b2:<bucket>:<hostname>.
Retention is --keep-daily 7 --keep-weekly 4 --keep-monthly 6 (~6 months of history).
Why dump databases first
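Copying a live database's files (/var/lib/mysql, a running SQLite file, a Postgres data dir) gives you a crash-consistent copy at best, restorable only if you're lucky. Logical dumps taken with the engine's own tooling are consistent (for MySQL/MariaDB, --single-transaction gives a consistent snapshot of InnoDB tables):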
- MySQL / MariaDB: mysqldump --single-transaction --routines --triggers --databases <db>
- PostgreSQL: pg_dump -Fc <db> (custom format) via the postgres system user (peer auth)
- SQLite: sqlite3 <file> ".backup '<out>'" (uses the online backup API, safe against a running writer)
- Dockerized DBs: docker exec <container> sh -c '<dump cmd>', letting the container's own shell expand its root-password env var
restic then backs up the dump files, which dedupe well: only the changed blocks upload each night. Plain SQL dumps dedupe best; pg_dump -Fc output is compressed by default, so it dedupes less effectively.
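The Dockerized pattern hinges on quoting: the dump command has to reach the container's shell with the password variable still unexpanded. A minimal sketch, assuming a MySQL/MariaDB container whose image ships mysqldump and exposes the root password in an environment variable. The function name docker_mysqldump and its three arguments are hypothetical, not part of the role:

```shell
# Hypothetical helper: dump one database from a MySQL/MariaDB container.
# Usage: docker_mysqldump <container> <db> <password-env-var-name>
docker_mysqldump() {
    local container=$1 db=$2 pw_var=$3

    # Both values are spliced into a command string, so accept only
    # plain identifiers (assumption: database names are [A-Za-z0-9_]).
    case $pw_var in ''|*[!A-Z0-9_]*) echo "bad env var name: $pw_var" >&2; return 2 ;; esac
    case $db in ''|*[!A-Za-z0-9_]*) echo "bad database name: $db" >&2; return 2 ;; esac

    # \$ keeps the dollar sign literal on the host, so the container's
    # own shell expands the password variable, not the host's shell.
    docker exec "$container" sh -c \
        "exec mysqldump -uroot -p\"\$${pw_var}\" --single-transaction --routines --triggers --databases ${db}"
}
```

A call might look like docker_mysqldump castopod-db castopod_db MYSQL_ROOT_PASSWORD > "$STAGING/mysql-castopod_db.sql", where the container name is made up. Newer MariaDB images may only ship mariadb-dump, so check which binary the container actually has.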
Repository layout
- One private B2 bucket (e.g. majorshouse-backups).
- One repo per host: b2:majorshouse-backups:<hostname>.
- The application key needs read + write + delete for the bucket. restic deletes objects during forget/prune, so a pure append-only key will break retention. (True append-only requires splitting forget/prune onto a separate maintenance key, a worthwhile hardening step, but not the default.)
- Credentials live in an EnvironmentFile (/etc/restic/restic-env, mode 0600, root): RESTIC_REPOSITORY, RESTIC_PASSWORD, B2_ACCOUNT_ID, B2_ACCOUNT_KEY.
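Since the EnvironmentFile holds both the repo password and the B2 key, it can be worth checking its mode and contents before a run trusts it. A hedged sketch: check_restic_env is a hypothetical helper, stat -c assumes GNU coreutils (the case on Ubuntu and Fedora), and the file is assumed to use plain KEY=value lines:

```shell
# Hypothetical helper: verify the credentials file is private and complete.
# Usage: check_restic_env /etc/restic/restic-env
check_restic_env() {
    local file=$1 var mode

    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }

    # GNU stat: %a prints the octal permission bits.
    mode=$(stat -c '%a' "$file") || return 1
    [ "$mode" = "600" ] || { echo "$file has mode $mode, expected 600" >&2; return 1; }

    for var in RESTIC_REPOSITORY RESTIC_PASSWORD B2_ACCOUNT_ID B2_ACCOUNT_KEY; do
        # Only the variable name is reported; values are never printed.
        grep -q "^${var}=." "$file" || { echo "$file: $var is not set" >&2; return 1; }
    done
}
```

The sketch does not check ownership; adding a stat -c '%U' comparison against root would cover that.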
The backup script (shape)
#!/usr/bin/env bash
set -uo pipefail   # catch unset variables and failed pipelines
STAGING=/var/backups/restic-staging
# start from an empty, root-only staging dir
rm -rf "$STAGING"; mkdir -p "$STAGING"; chmod 700 "$STAGING"
# per-engine dumps into $STAGING ...
mysqldump --single-transaction --routines --triggers --databases wordpress > "$STAGING/mysql-wordpress.sql"
sudo -u postgres pg_dump -Fc mastodon_production > "$STAGING/pg-mastodon_production.dump"
sqlite3 /opt/phantombot/config/phantombot.db ".backup '$STAGING/sqlite-phantombot.db'"
# snapshot the dumps plus the data paths
restic backup --tag fleet-backup --host "$(hostname -s)" \
  "$STAGING" /var/www /etc/letsencrypt --exclude /path/to/already-offsite/media
# apply retention; space is reclaimed by the monthly prune
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6
# wipe staging
rm -rf "$STAGING"
Wrap each step so a failure mails the admin and aborts (don't silently back up a half-state). On hosts where the mail CLI is absent, pipe a message to /usr/sbin/sendmail -t instead.
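One way to wrap the steps is a small runner that reports the failing step and stops the script. This is a sketch under assumptions rather than the role's actual code: run_step, notify_admin, and ADMIN_EMAIL are hypothetical names, and mail delivery to root is assumed to be configured:

```shell
ADMIN_EMAIL="${ADMIN_EMAIL:-root}"   # assumption: mail for root reaches the admin

# Send a short message with mail(1), or fall back to sendmail -t.
notify_admin() {
    local subject=$1 body=$2
    if command -v mail >/dev/null 2>&1; then
        printf '%s\n' "$body" | mail -s "$subject" "$ADMIN_EMAIL"
    else
        printf 'To: %s\nSubject: %s\n\n%s\n' "$ADMIN_EMAIL" "$subject" "$body" \
            | /usr/sbin/sendmail -t
    fi
}

# Run one named step; on failure, mail the admin and abort the whole script.
run_step() {
    local name=$1 rc=0
    shift
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        notify_admin "restic backup FAILED on $(hostname -s): $name" \
            "Step '$name' exited with status $rc; the run was aborted."
        exit "$rc"
    fi
}
```

Each line of the script then becomes something like run_step "dump wordpress" mysqldump --single-transaction --databases wordpress > "$STAGING/mysql-wordpress.sql" or run_step "restic backup" restic backup --tag fleet-backup "$STAGING" /var/www. Because run_step exits on the first failure, restic never snapshots a half-populated staging dir.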
systemd units
A oneshot service + a timer. Stagger OnCalendar per host to spread B2 load, and always set RESTIC_CACHE_DIR (see Gotchas):
# restic-backup.service
[Service]
Type=oneshot
EnvironmentFile=/etc/restic/restic-env
Environment=RESTIC_CACHE_DIR=/var/cache/restic
ExecStart=/usr/local/sbin/restic-backup.sh
Nice=10
IOSchedulingClass=idle
# restic-backup.timer
[Timer]
OnCalendar=*-*-* 02:30:00
RandomizedDelaySec=20m
Persistent=true
[Install]
WantedBy=timers.target
A second restic-prune.timer runs restic prune monthly (OnCalendar=*-*-01 04:00:00).
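Staggered start times can be derived from a host's position in the inventory instead of being assigned by hand. A minimal sketch: stagger_oncalendar is a hypothetical helper, and the 02:30 start and 10-minute step are assumptions chosen to match the examples in this article:

```shell
# Hypothetical helper: print an OnCalendar value for the Nth host (0-based).
# Usage: stagger_oncalendar <index> [step_minutes] [start_minute_of_day]
stagger_oncalendar() {
    local index=$1 step_min=${2:-10} start_min=${3:-150}   # 150 = 02:30
    # Wrap at 1440 minutes so the result is always a valid time of day.
    local total=$(( (start_min + index * step_min) % 1440 ))
    printf '*-*-* %02d:%02d:00\n' $(( total / 60 )) $(( total % 60 ))
}
```

For example, stagger_oncalendar 1 prints *-*-* 02:40:00. RandomizedDelaySec still adds jitter on top, so the step only needs to be large enough to keep hosts from piling up.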
Restore procedure
The whole point. From the target host (or any host with the repo creds):
# load repo + B2 creds without echoing them
set -a; . /etc/restic/restic-env; set +a
restic snapshots # list; note the snapshot ID or use 'latest'
# restore specific paths to a scratch dir (never restore in place blindly)
restic restore latest --target /tmp/restore \
--include /var/backups/restic-staging \
--include /var/www/html/wp-config.php
# verify before doing anything with it
ls -la /tmp/restore/var/backups/restic-staging/
head -1 /tmp/restore/var/backups/restic-staging/mysql-wordpress.sql # "-- MySQL dump 10.13 ..."
To recover a database, restore the dump then load it: mysql <db> < mysql-<db>.sql, pg_restore -d <db> pg-<db>.dump, or copy the SQLite file back while the app is stopped. Test restores periodically: a backup you've never restored is a hope, not a backup. Restore the highest-stakes data (password manager, mail) first in any drill.
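The "verify before doing anything" step can be scripted for a restore drill by checking each dump's leading bytes. A hedged sketch: verify_dump is a hypothetical helper that relies on the mysql-/pg-/sqlite- file naming used in this article. It only confirms the file looks like the right kind of dump, not that it loads cleanly:

```shell
# Hypothetical helper: sanity-check a restored dump by its header.
# Usage: verify_dump /tmp/restore/var/backups/restic-staging/<file>
verify_dump() {
    local f=$1
    [ -s "$f" ] || { echo "empty or missing: $f" >&2; return 1; }

    case "$(basename "$f")" in
        # mysqldump output has a "-- MySQL dump" or "-- MariaDB dump" comment
        # near the top; some MariaDB versions emit another line before it.
        mysql-*.sql) head -n 5 "$f" | grep -Eq '^-- (MySQL|MariaDB) dump' ;;
        # pg_dump custom-format archives start with the magic string PGDMP.
        pg-*.dump)   [ "$(head -c 5 "$f")" = "PGDMP" ] ;;
        # SQLite database files start with "SQLite format 3" followed by NUL.
        sqlite-*.db) [ "$(head -c 15 "$f")" = "SQLite format 3" ] ;;
        *)           echo "unknown dump type: $f" >&2; return 1 ;;
    esac
}
```

In a drill this could run as a loop over /tmp/restore/var/backups/restic-staging/*, followed by an actual load into a scratch database for the highest-stakes data.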
Adding a host
1. Add it to the backups inventory group.
2. Give it a host_vars scope (which DBs to dump and which paths to back up):

       restic_backup_oncalendar: "*-*-* 02:40:00"   # stagger
       restic_mysql_dbs: [castopod_db]
       restic_paths: [/var/www/html/castopod]
       restic_excludes: [/var/www/html/castopod/public/media]   # already offsite

3. Run the playbook against that host. The role installs restic, deploys the script + units, restic inits the repo if absent, and enables the timers.
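The "init if absent" step can be made idempotent by probing for the repository's config object first. This is a sketch of one common approach, not necessarily how the role does it. ensure_repo is a hypothetical name, and RESTIC_REPOSITORY, RESTIC_PASSWORD, and the B2 variables are assumed to be exported already:

```shell
# Hypothetical helper: initialise the restic repo only when it does not exist.
ensure_repo() {
    # "restic cat config" succeeds only if the repo exists and the password
    # opens it. If it fails for another reason (network, wrong password),
    # the init that follows fails too instead of clobbering anything.
    restic cat config >/dev/null 2>&1 || restic init
}
```

Running it twice is harmless: the second call finds the config and skips init.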
Gotchas & Notes
- RESTIC_CACHE_DIR is mandatory under systemd. systemd services run with no $HOME, so restic can't find its cache and warns "unable to locate cache directory: neither $XDG_CACHE_HOME nor $HOME are defined", and re-reads every file each run (no incremental). Point it at /var/cache/restic in the unit.
- sqlite3 may not be installed. A host that runs a SQLite-backed app (e.g. a bot) often lacks the sqlite3/sqlite CLI. Install it where restic_sqlite_paths is set, or the .backup step fails.
- Docker DB password env-var names vary. Don't assume: the MariaDB image may use MYSQL_ROOT_PASSWORD (not MARIADB_ROOT_PASSWORD), and a Postgres container's superuser is whatever POSTGRES_USER is set to, so reference "$POSTGRES_USER" rather than hardcoding postgres. Check with docker exec <c> sh -c 'env | grep -oE "^(MYSQL|MARIADB|POSTGRES)_[A-Z_]*"' (name only).
- B2 key needs delete capability. Otherwise forget/prune fail. Scope the key to the bucket; reach for per-host namePrefix-restricted keys for blast-radius isolation.
- Exclude data that's already offsite. Media already synced to object storage (S3/B2 via the app or rclone) should be --excluded so you don't pay to store it twice.
- First upload is slow, the rest are fast. The initial snapshot reads and uploads everything; subsequent runs only ship changed blocks. For a large first run, fire it detached and watch from a transient unit that emails you on completion.
- Keep secrets out of git. The repo password and B2 key belong in an Ansible vault (committed encrypted), referenced into the role, never in plaintext vars.
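The first two gotchas can be caught before any dump runs with a preflight check at the top of the backup script. A minimal sketch, assuming the script knows which CLIs it needs; preflight is a hypothetical function name:

```shell
# Hypothetical helper: fail early if restic has no cache location or a
# required CLI is missing.
# Usage: preflight restic sqlite3 mysqldump
preflight() {
    local failed=0 cmd

    # restic needs one of these to locate its cache directory.
    if [ -z "${RESTIC_CACHE_DIR:-}" ] && [ -z "${XDG_CACHE_HOME:-}" ] && [ -z "${HOME:-}" ]; then
        echo "preflight: set RESTIC_CACHE_DIR (no \$HOME under systemd)" >&2
        failed=1
    fi

    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || {
            echo "preflight: missing command: $cmd" >&2
            failed=1
        }
    done

    return "$failed"
}
```

It reports every problem in one pass instead of stopping at the first, which saves a round trip when a new host is missing several pieces.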