diff --git a/05-troubleshooting/time-machine-apfs-orphaned-previous-blocks-backup.md b/05-troubleshooting/time-machine-apfs-orphaned-previous-blocks-backup.md
new file mode 100644
index 0000000..40d04d3
--- /dev/null
+++ b/05-troubleshooting/time-machine-apfs-orphaned-previous-blocks-backup.md
@@ -0,0 +1,120 @@
+---
+title: "Time Machine: Orphaned APFS .previous Folder Blocks All Backups"
+domain: troubleshooting
+category: general
+tags: [macos, time-machine, apfs, backup, fsck, disk-utility]
+status: published
+created: 2026-06-18
+updated: 2026-06-18
+---
+# Time Machine: Orphaned APFS `.previous` Folder Blocks All Backups
+
+## Overview
+On an APFS Time Machine destination, an interrupted backup can leave behind an orphaned staging folder named `.previous` (plus a matching, uncatalogued APFS snapshot). Every subsequent backup reads that folder during *FindingChanges*, hits a metadata-type mismatch, and aborts — so backups silently stop running. macOS shows only a generic "**Time Machine couldn't complete the backup … An unknown error occurred.**"
+
+The trap: because the orphan is **not in Time Machine's catalog** and the destination is OS-protected, every obvious removal tool (`rm`, `chmod`, `tmutil delete`, `diskutil deleteSnapshot`) refuses it. The clean fix is **First Aid (`fsck_apfs`)**, which has authority over the volume and clears the orphaned snapshot.
+
+## Symptoms
+- "Time Machine couldn't complete the backup to '' — An unknown error occurred."
+- Backups haven't run since around the time of an interrupted/cancelled backup.
+- The destination disk is mounted and has plenty of free space (not full, not disconnected).
+- `tmutil status` cycles through `Starting` / `FindingChanges` and never reaches `Copying`.
+
+## Root Cause
+`backupd` logs the real error on a loop (every ~15 s):
+
+```bash
+log show --predicate 'subsystem == "com.apple.TimeMachine"' --last 10m --style compact \
+  | grep -iE 'previous|error'
+```
+```
+[TMStructure] Expected SnapshotInProgressContainer metadata type but found APFSBackup
+  metadata type at URL '...//2026-06-17-172230.previous/'
+```
+
+An earlier backup was interrupted mid-run. It left two orphans tied to that timestamp, **neither registered in Time Machine's backup catalog**:
+
+1. A staging directory `.previous` on the destination volume.
+2. A matching APFS snapshot `com.apple.TimeMachine..backup`.
+
+Time Machine expects the staging folder to be a `SnapshotInProgressContainer` but finds completed-backup (`APFSBackup`) metadata, so it bails before copying anything.
+
+> **Ignore the surrounding log noise.** `com.apple.backupd.sandbox.xpc: connection invalid`, `Mountpoint '…' is still valid`, and `missingName` on `/System/Volumes/Data/home` are all normal on a healthy backup — flagged `E` but harmless. The only line that matters is the `SnapshotInProgressContainer` mismatch.
+
+## Diagnosis
+
+Confirm the disk is healthy (not the problem) and locate the orphan:
+
+```bash
+tmutil status  # stuck in Starting/FindingChanges, never Copying
+df -h | grep -i ""  # mounted, plenty free
+diskutil apfs listSnapshots  # note the highest/last snapshot timestamp
+```
+
+If `listSnapshots` shows a final snapshot whose timestamp matches the `.previous` folder in the error, that's the orphaned pair.
+
+## Why the Obvious Tools Fail
+
+Do **not** burn time trying to force the folder out — here's what each tool does and why it refuses:
+
+| Command | Result | Reason |
+|---|---|---|
+| `sudo rm -rf …/.previous` | `Operation not permitted` | TM applies a `group:everyone deny delete` ACL that overrides root. |
+| `sudo chmod -RN …/.previous` | runs for minutes, then fails | A `.previous` folder is a **full copy of the entire Mac filesystem**; `-R` walks the whole tree and can't clear ACLs on the SIP-`restricted` system files inside (`/usr/bin/sh`, frameworks, keymaps). `rm` then hits the same wall. |
+| `sudo tmutil delete -p …/.previous` | `Invalid deletion target (error 22)` | Not a registered backup. |
+| `sudo tmutil delete -t ` | `error 2 (No such file)` | No catalog entry for that timestamp. |
+| `sudo diskutil apfs deleteSnapshot -uuid ` | `Not a valid APFS Snapshot UUID` | TM-managed snapshot; diskutil won't remove it directly. |
+
+> **If you started a `chmod -R` and killed it:** the live system is unaffected — `chmod -R` does not follow symlinks out of the backup tree. Verify with `ls -lde ~/Desktop` (normal ACLs = untouched). Stop a runaway with `sudo pkill -f '.previous'`.
+
+## Fix — Run First Aid (`fsck_apfs`)
+
+First Aid runs with full authority over the volume and clears the orphaned snapshot, which defuses the `.previous` folder's metadata mismatch.
+
+```bash
+# 1. Stop the looping backup
+sudo tmutil stopbackup
+
+# 2. Verify the destination volume (live mode is fine; read-only check)
+sudo diskutil verifyVolume
+# or: Disk Utility → View → Show All Devices → select the TM volume → First Aid → Run
+```
+
+`verifyVolume` enumerates and validates every snapshot; the verify/remount cycle purges the orphaned in-progress snapshot. Expected result:
+
+```
+The volume appears to be OK
+File system check exit code is 0
+```
+
+Confirm the orphan snapshot is gone (count drops by one; the matching timestamp no longer appears):
+
+```bash
+diskutil apfs listSnapshots
+```
+
+Then restart and watch it succeed:
+
+```bash
+sudo tmutil startbackup --auto
+tmutil status  # should reach BackupPhase = Copying with no SnapshotInProgressContainer errors
+```
+
+If `verifyVolume` reports problems rather than "appears to be OK", run the repair (it must unmount the volume):
+
+```bash
+sudo diskutil repairVolume
+```
+
+## Notes
+- The first backup after the fix is often a large catch-up (hundreds of GB) because the chain was broken — let it finish; it returns to quick hourly increments afterward.
+- The inert `.previous` **folder** may still sit on the volume after the fix. Time Machine now ignores it, so it's not blocking — but it consumes space. Removing it cleanly requires booting to **Recovery Mode**, `csrutil disable`, `rm -rf` the folder, then `csrutil enable` — only worth it to reclaim the space.
+- Time Machine identifies its destination by `DestinationID` (a UUID), not the volume name, so renaming the disk later is safe.
+- Interrupted backups are more likely on flaky USB-SATA bridge enclosures (e.g. some WD My Passport units) whose slow sleep/wake transitions can drop the drive mid-backup.
+
+## Tags
+`macos` `time-machine` `apfs` `backup` `fsck-apfs` `disk-utility` `snapshot` `first-aid`
+
+## See Also
+- [SnapRAID & MergerFS Storage Setup](../01-linux/storage/snapraid-mergerfs-setup.md)
+- MajorMac Incident Log (2026-06-18) — the originating incident
diff --git a/SUMMARY.md b/SUMMARY.md
index 30e7597..8f5edc8 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -122,6 +122,7 @@ updated: 2026-05-15T09:00
  * [rsync over Tailscale: Hung in TCP Teardown After Transfer Completes](05-troubleshooting/networking/rsync-tailscale-teardown-stall.md)
  * [iOS Tailscale Clients Report HostName="localhost" — Breaks /etc/hosts Generators](05-troubleshooting/networking/tailscale-status-json-hostname-localhost-ios.md)
  * [macOS: Repeating Alert Tone from Mirrored iPhone Notification](05-troubleshooting/macos-mirrored-notification-alert-loop.md)
+ * [Time Machine: Orphaned APFS `.previous` Folder Blocks All Backups](05-troubleshooting/time-machine-apfs-orphaned-previous-blocks-backup.md)
  * [OBS Studio: Stale Script Paths After Windows Profile Rename](05-troubleshooting/obs-stale-script-paths-after-windows-profile-rename.md)
  * [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)
  * [Logwatch Falsely Reports 'No freshclam updates' in ClamAV Daemon Mode](05-troubleshooting/security/freshclam-logwatch-false-no-updates.md)