---
title: "rsync Backup Patterns"
domain: selfhosting
category: storage-backup
tags: [rsync, backup, linux, storage, automation]
status: published
created: 2026-03-08
updated: 2026-03-08
---
# rsync Backup Patterns
rsync is the tool for moving files on Linux. Fast, resumable, bandwidth-efficient — it only transfers what changed. For local and remote backups, it's what I reach for first.
## The Short Answer
```bash
# Sync source to destination (local)
rsync -av /source/ /destination/
# Sync to remote server
rsync -avz /source/ user@server:/destination/
# Dry run first — see what would change
rsync -avnP /source/ /destination/
```
## Core Flags
| Flag | What it does |
|---|---|
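| `-a` | Archive mode (`-rlptgoD`): recurses and preserves permissions, timestamps, symlinks, owner/group, and device files. Does not preserve hard links (`-H`), ACLs (`-A`), or extended attributes (`-X`). Use this almost always. |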
| `-v` | Verbose output — shows files being transferred. |
| `-z` | Compress during transfer. Useful over slow links; on a fast LAN the CPU overhead usually outweighs the savings. |
| `-P` | Shorthand for `--partial --progress`: shows progress and keeps partially transferred files so a rerun can resume them. |
| `-n` | Dry run — shows what would happen without doing it. |
| `--delete` | Removes files from destination that no longer exist in source. Makes it a true mirror. |
| `--exclude` | Skip files matching a pattern. Can be repeated. |
| `--bwlimit` | Limit bandwidth, in KiB/s by default (e.g. `--bwlimit=5000`). |
## Local Backup
```bash
# Basic sync
rsync -av /home/major/ /backup/home/
# Mirror — destination matches source exactly (deletes removed files)
rsync -av --delete /home/major/ /backup/home/
# Exclude directories
rsync -av --exclude='.cache' --exclude='Downloads' /home/major/ /backup/home/
# Multiple excludes from a file
rsync -av --exclude-from=exclude.txt /home/major/ /backup/home/
```
**The trailing slash matters:**
- `/source/` — sync the contents of source into destination
- `/source` — sync the source directory itself into destination (creates `/destination/source/`)

You almost always want the trailing slash on the source.
## Remote Backup
```bash
# Local → remote
rsync -avz /home/major/ user@server:/backup/major/
# Remote → local (pull backup)
rsync -avz user@server:/var/data/ /local/backup/data/
# Specify SSH port or key
rsync -avz -e "ssh -p 2222 -i ~/.ssh/id_ed25519" /source/ user@server:/dest/
```
## Incremental Backups with Hard Links
The `--link-dest` pattern creates space-efficient incremental backups. Each backup looks like a full copy but only stores changed files — unchanged files are hard links to previous versions.
```bash
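#!/usr/bin/env bash
set -euo pipefail

BACKUP_DIR="/backup"
SOURCE="/home/major"
DATE="$(date +%Y-%m-%d)"
LATEST="${BACKUP_DIR}/latest"
DEST="${BACKUP_DIR}/${DATE}"

# Unchanged files become hard links into the previous snapshot.
# On the first run 'latest' does not exist yet; rsync warns and makes a full copy.
rsync -av --delete \
    --link-dest="$LATEST" \
    "$SOURCE/" \
    "$DEST/"

# Point 'latest' at the new snapshot (-n replaces the symlink instead of following it)
ln -sfn "$DEST" "$LATEST"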
```
Each dated directory looks like a complete backup. Storage is only used for changed files. You can delete any dated directory without affecting others.
## Backup Script with Logging
```bash
#!/usr/bin/env bash
set -euo pipefail
SOURCE="/home/major/"
DEST="/backup/home/"
LOG="/var/log/rsync-backup.log"
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG"; }
log "Backup started"
rsync -av --delete \
    --exclude='.cache' \
    --exclude='Downloads' \
    --log-file="$LOG" \
    "$SOURCE" "$DEST"
log "Backup complete"
```
Run via cron:
```bash
# Daily at 2am
0 2 * * * /usr/local/bin/backup.sh
```
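Or a systemd timer (preferred). The timer triggers a service unit of the same name, `rsync-backup.service`, which must exist and run the script: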
```ini
# /etc/systemd/system/rsync-backup.timer
[Unit]
Description=Daily rsync backup
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
```
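
A timer does nothing by itself: by default `rsync-backup.timer` activates `rsync-backup.service`. A minimal sketch of that service unit follows, assuming the script lives at `/usr/local/bin/backup.sh` as in the cron example; the description text is an assumption, so adjust it to your setup.

```ini
# /etc/systemd/system/rsync-backup.service
[Unit]
Description=rsync backup job

[Service]
# oneshot: the unit counts as finished when the script exits
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
```

The service needs no `[Install]` section because the timer is what starts it.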
```bash
sudo systemctl enable --now rsync-backup.timer
```
## Cold Storage — AWS Glacier Deep Archive
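rsync handles local and remote backups, but for true offsite cold storage (disaster recovery, archival copies you rarely need to retrieve), AWS Glacier Deep Archive is the cheapest S3 storage class at roughly $1/TB/month for storage. Retrieval and request fees are extra, and objects are billed for a minimum of 180 days.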
Upload files directly to an S3 bucket with the `DEEP_ARCHIVE` storage class:
```bash
# Single file
aws s3 cp backup.tar.gz s3://your-bucket/ --storage-class DEEP_ARCHIVE
# Entire directory
aws s3 sync /backup/offsite/ s3://your-bucket/offsite/ --storage-class DEEP_ARCHIVE
```
**When to use it:** Long-term backups you'd only need in a disaster scenario — media archives, yearly snapshots, irreplaceable data. Not for anything you'd need to restore quickly.

**Retrieval tradeoffs:**
- **Standard retrieval:** Typically within 12 hours.
- **Bulk retrieval:** Typically within 48 hours, the cheapest restore option.
- **Expedited:** Not available for Deep Archive. If you need faster access, use S3 Glacier Flexible Retrieval (formerly plain "Glacier") or S3 Infrequent Access.
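
Objects in Deep Archive cannot be downloaded directly; you first request a restore and wait for it to finish. The sketch below wraps the `aws s3api restore-object` and `head-object` calls in helper functions. The function names, the bucket, and the key are hypothetical placeholders, and the 7-day / Bulk defaults are arbitrary choices for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Build the JSON body for a restore request.
# $1 = days to keep the restored copy available, $2 = tier (Standard or Bulk)
restore_request() {
    printf '{"Days":%d,"GlacierJobParameters":{"Tier":"%s"}}' "$1" "$2"
}

# Ask S3 to stage an archived object for download (bucket/key are placeholders)
request_restore() {
    local bucket="$1" key="$2" days="${3:-7}" tier="${4:-Bulk}"
    aws s3api restore-object \
        --bucket "$bucket" --key "$key" \
        --restore-request "$(restore_request "$days" "$tier")"
}

# Print the object's Restore field, which reports whether the request is still ongoing
restore_status() {
    aws s3api head-object --bucket "$1" --key "$2" --query Restore --output text
}

# Usage (needs AWS credentials, so it is left commented out):
# request_restore your-bucket offsite/backup.tar.gz 7 Bulk
# restore_status  your-bucket offsite/backup.tar.gz
```

Once the restore completes, the object can be fetched with a normal `aws s3 cp` for the number of days requested.
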

**In the MajorsHouse backup strategy**, rsync handles the daily local and cross-host backups. Glacier Deep Archive is the final tier: offsite, durable, cheap, and slow to retrieve by design. A good backup plan has both.
## Gotchas & Notes
- **Test with `--dry-run` first.** Especially when using `--delete`. See what would be removed before actually removing it.
- **`--delete` is destructive.** It removes files from the destination that don't exist in the source. That's the point, but know what you're doing.
- **Large files and slow connections:** Add `-P` for progress and partial transfer resume. An interrupted rsync picks up where it left off.
- **For network backups to untrusted locations**, consider using rsync over SSH + encryption at rest. rsync over SSH handles transit encryption; storage encryption is separate.
- **rsync vs Restic:** rsync is fast and simple. Restic gives you deduplication, encryption, and multiple backend support (S3, B2, etc.). For local backups, rsync. For offsite with encryption needs, Restic.
## See Also
- [self-hosting-starter-guide](../docker/self-hosting-starter-guide.md)
- [bash-scripting-patterns](../../01-linux/shell-scripting/bash-scripting-patterns.md)