---
title: "rsync Backup Patterns"
domain: selfhosting
category: storage-backup
tags: [rsync, backup, linux, storage, automation]
status: published
created: 2026-03-08
updated: 2026-03-08
---

# rsync Backup Patterns

rsync is the tool for moving files on Linux. It is fast, resumable, and bandwidth-efficient: it only transfers what changed. For local and remote backups, it's what I reach for first.

## The Short Answer

```bash
# Sync source to destination (local)
rsync -av /source/ /destination/

# Sync to remote server
rsync -avz /source/ user@server:/destination/

# Dry run first: see what would change
rsync -avnP /source/ /destination/
```

## Core Flags

| Flag | What it does |
|---|---|
| `-a` | Archive mode: recurses and preserves permissions, timestamps, symlinks, owner/group. Use this almost always. |
| `-v` | Verbose output: shows files being transferred. |
| `-z` | Compress during transfer. Useful over slow connections, pure overhead on a fast LAN. |
| `-P` | Progress + partial transfers (keeps partially transferred files so an interrupted transfer can resume). |
| `-n` | Dry run: shows what would happen without doing it. |
| `--delete` | Removes files from destination that no longer exist in source. Makes it a true mirror. |
| `--exclude` | Exclude files or directories matching a pattern. |
| `--bwlimit` | Limit bandwidth (KiB/s unless a unit suffix is given). |

## Local Backup

```bash
# Basic sync
rsync -av /home/major/ /backup/home/

# Mirror: destination matches source exactly (deletes removed files)
rsync -av --delete /home/major/ /backup/home/

# Exclude directories
rsync -av --exclude='.cache' --exclude='Downloads' /home/major/ /backup/home/

# Multiple excludes from a file
rsync -av --exclude-from=exclude.txt /home/major/ /backup/home/
```

**The trailing slash matters:**

- `/source/`: sync the contents of source into destination
- `/source`: sync the source directory itself into destination (creates `/destination/source/`)

You almost always want the trailing slash on the source.

## Remote Backup

```bash
# Local → remote
rsync -avz /home/major/ user@server:/backup/major/

# Remote → local (pull backup)
rsync -avz user@server:/var/data/ /local/backup/data/

# Specify SSH port or key
rsync -avz -e "ssh -p 2222 -i ~/.ssh/id_ed25519" /source/ user@server:/dest/
```

## Incremental Backups with Hard Links

The `--link-dest` pattern creates space-efficient incremental backups. Each backup looks like a full copy but only stores changed files; unchanged files are hard links to the copies in the previous backup. Hard links cannot cross filesystems, so all snapshots must live on the same one.

```bash
#!/usr/bin/env bash
set -euo pipefail

BACKUP_DIR="/backup"
SOURCE="/home/major"
DATE="$(date +%Y-%m-%d)"
LATEST="${BACKUP_DIR}/latest"
DEST="${BACKUP_DIR}/${DATE}"

# Unchanged files are hard-linked against the snapshot 'latest' points to.
# On the first run 'latest' does not exist yet; rsync warns and makes a full copy.
rsync -av --delete \
  --link-dest="$LATEST" \
  "$SOURCE/" \
  "$DEST/"

# Update the 'latest' symlink
rm -f "$LATEST"
ln -s "$DEST" "$LATEST"
```

Each dated directory looks like a complete backup. Storage is only used for changed files. You can delete any dated directory without affecting the others, because a file's data stays on disk until its last hard link is removed.

## Backup Script with Logging

```bash
#!/usr/bin/env bash
set -euo pipefail

SOURCE="/home/major/"
DEST="/backup/home/"
LOG="/var/log/rsync-backup.log"

# Timestamped message to stdout and the log file
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG"; }

log "Backup started"

rsync -av --delete \
  --exclude='.cache' \
  --exclude='Downloads' \
  --log-file="$LOG" \
  "$SOURCE" "$DEST"

log "Backup complete"
```

Because of `set -e`, a failed rsync stops the script before "Backup complete" is logged, so a missing completion line in the log means the run did not finish.

Run via cron:

```bash
# Daily at 2am
0 2 * * * /usr/local/bin/backup.sh
```

Or systemd timer (preferred):

```ini
# /etc/systemd/system/rsync-backup.timer
[Unit]
Description=Daily rsync backup

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

```bash
sudo systemctl enable --now rsync-backup.timer
```

The timer activates the service unit with the same name (`rsync-backup.service`), so that unit must exist and run the backup script. `Persistent=true` makes a run that was missed while the machine was off happen at the next boot.

## Cold Storage: AWS Glacier Deep Archive

rsync handles local and remote backups, but for true offsite cold storage (disaster recovery, archival copies you rarely need to retrieve) S3 Glacier Deep Archive is the cheapest AWS storage class at roughly $1/TB/month.
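
To put that figure in context, here is a rough back-of-the-envelope calculator. It is a sketch only: `estimate_monthly_cost` is a hypothetical helper, the default rate is the approximate $1/TB/month quoted above rather than a live AWS price, and it ignores request, retrieval, and minimum-duration charges.

```bash
#!/usr/bin/env bash
# Rough monthly storage estimate at about $1 per TB-month (approximate figure).
set -euo pipefail

# Hypothetical helper: size in bytes in, estimated USD per month out.
# Optional second argument overrides the assumed rate per decimal TB.
estimate_monthly_cost() {
  local bytes="$1"
  local rate_per_tb="${2:-1.00}"
  awk -v b="$bytes" -v r="$rate_per_tb" 'BEGIN { printf "%.2f\n", b / 1e12 * r }'
}

# Example: a 500 GB archive (5e11 bytes)
estimate_monthly_cost 500000000000
# prints: 0.50
```

On a system with GNU `du`, the size of a real directory can be fed in with `estimate_monthly_cost "$(du -sb /backup/offsite | cut -f1)"`.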

Upload files directly to an S3 bucket with the `DEEP_ARCHIVE` storage class:

```bash
# Single file
aws s3 cp backup.tar.gz s3://your-bucket/ --storage-class DEEP_ARCHIVE

# Entire directory
aws s3 sync /backup/offsite/ s3://your-bucket/offsite/ --storage-class DEEP_ARCHIVE
```

**When to use it:** Long-term backups you'd only need in a disaster scenario: media archives, yearly snapshots, irreplaceable data. Not for anything you'd need to restore quickly.

**Retrieval tradeoffs:**

- **Standard retrieval:** Typically within 12 hours; the faster and more expensive of the two options
- **Bulk retrieval:** Up to 48 hours, lowest restore cost
- **Expedited:** Not available for Deep Archive. If you need faster access, use S3 Glacier Flexible Retrieval (the former "regular" Glacier) or S3 Infrequent Access

**In the MajorsHouse backup strategy**, rsync handles the daily local and cross-host backups. Glacier Deep Archive is the final tier: offsite, durable, cheap, and slow to retrieve by design. A good backup plan has both.

## Gotchas & Notes

- **Test with `--dry-run` first.** Especially when using `--delete`. See what would be removed before actually removing it.
- **`--delete` is destructive.** It removes files from the destination that don't exist in the source. That's the point, but know what you're doing.
- **Large files and slow connections:** Add `-P` for progress and partial transfer resume. An interrupted transfer keeps its partial file, and the next run reuses it instead of starting that file from scratch.
- **For network backups to untrusted locations**, use rsync over SSH and add encryption at rest. SSH only protects the data in transit; encrypting what is stored on the remote side is a separate step.
- **rsync vs Restic:** rsync is fast and simple. Restic gives you deduplication, encryption, and multiple backend support (S3, B2, etc.). For local backups, rsync. For offsite with encryption needs, Restic.

## See Also

- [self-hosting-starter-guide](../docker/self-hosting-starter-guide.md)
- [bash-scripting-patterns](../../01-linux/shell-scripting/bash-scripting-patterns.md)