| title | domain | category | tags | status | created | updated |
|---|---|---|---|---|---|---|
| rmlint — Extreme Duplicate File Scanning | opensource | productivity | | published | 2026-04-02 | 2026-04-02 |
rmlint — Extreme Duplicate File Scanning
Problem
Over time, backups and media collections can accumulate massive amounts of duplicate data. Traditional duplicate finders are often slow and limited in how they handle results. On MajorRAID, I identified ~4.0 TB (113,584 files) of duplicate data across three different storage points.
Solution
rmlint is an extremely fast tool for finding (and optionally removing) duplicates. It is significantly faster than fdupes or rdfind because it uses a multi-stage approach to avoid unnecessary hashing.
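To make the multi-stage idea concrete, here is a minimal Python sketch of the general technique. It is not rmlint's actual implementation: it simply groups files by size, hashes only a short prefix of the files that still collide, and fully hashes only what survives that. The function names and the 4 KiB prefix length are assumptions chosen for illustration.

```python
import hashlib
import os
from collections import defaultdict


def _digest(path, limit=None, chunk_size=1 << 20):
    """SHA-256 of a file; if limit is set, hash only the first `limit` bytes."""
    h = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as f:
        while True:
            n = chunk_size if remaining is None else min(chunk_size, remaining)
            if n == 0:
                break
            block = f.read(n)
            if not block:
                break
            h.update(block)
            if remaining is not None:
                remaining -= len(block)
    return h.hexdigest()


def _regroup(groups, key):
    """Split each candidate group by key(path); keep subgroups of 2+ files."""
    out = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        out.extend(b for b in buckets.values() if len(b) > 1)
    return out


def find_duplicates(roots, prefix_bytes=4096):
    """Return lists of paths with identical content (illustrative only)."""
    paths = []
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                p = os.path.join(dirpath, name)
                # Skip symlinks and empty files to keep the sketch simple.
                if os.path.isfile(p) and not os.path.islink(p) and os.path.getsize(p) > 0:
                    paths.append(p)
    groups = _regroup([paths], os.path.getsize)                           # stage 1: size
    groups = _regroup(groups, lambda p: _digest(p, limit=prefix_bytes))   # stage 2: prefix hash
    groups = _regroup(groups, _digest)                                    # stage 3: full hash
    return [sorted(g) for g in groups]
```

Most files drop out at the size check without being read at all, which is where the bulk of the time saving comes from on large media collections.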
1. Installation (Fedora)
sudo dnf install rmlint
2. Scanning Multiple Directories
To scan for duplicates across multiple mount points and compare them:
rmlint /majorstorage /majorRAID /mnt/usb
This writes a removal script named rmlint.sh and a JSON report named rmlint.json to the current directory, and prints a summary of the findings.
3. Reviewing Results
DO NOT run the generated script without reviewing it first. The JSON report lists every duplicate with its path and size, so you can use it to see which paths contain the most duplicates:
# View the JSON report
jq . rmlint.json
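To see which of the scanned paths holds the most redundant data, the report can be aggregated per mount point. The following is a minimal Python sketch, assuming the report is a JSON array whose duplicate entries carry `type`, `path`, `size`, and `is_original` fields; check this against your own rmlint.json before relying on it. The function names are hypothetical.

```python
import json
from collections import defaultdict


def load_report(path="rmlint.json"):
    """Load the JSON report written by rmlint (assumed to be a JSON array)."""
    with open(path) as f:
        return json.load(f)


def summarize_duplicates(entries, roots):
    """Count redundant copies and their total bytes under each root path."""
    # Longest roots first so nested mount points match the deepest root.
    ordered = sorted(roots, key=len, reverse=True)
    stats = defaultdict(lambda: {"files": 0, "bytes": 0})
    for entry in entries:
        # Skip header/footer objects, other lint types, and the copy
        # that rmlint marked as the original to keep.
        if entry.get("type") != "duplicate_file" or entry.get("is_original"):
            continue
        path = entry["path"]
        root = next(
            (r for r in ordered if path == r or path.startswith(r.rstrip("/") + "/")),
            "(other)",
        )
        stats[root]["files"] += 1
        stats[root]["bytes"] += entry.get("size", 0)
    return dict(stats)
```

Calling `summarize_duplicates(load_report(), ["/majorstorage", "/majorRAID", "/mnt/usb"])` returns a dictionary keyed by root, so the largest `bytes` value points at the location where cleanup would free the most space.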
4. Advanced Usage: Finding Duplicates by Hash Only
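rmlint matches files by content rather than by name, so duplicates with different filenames are already detected. To also include hidden files and explicitly treat hardlinked files as duplicates (the documented default, spelled out here):
rmlint --hidden --hardlinked /path/to/search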
5. Repurposing Storage
After scanning and clearing duplicates, you can reclaim significant space. In my case, this was the first step in repurposing a 12 TB USB drive as a SnapRAID parity drive.
Maintenance
Run a scan monthly or before any major storage consolidation project.
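For recurring scans, it helps to write each run's report to a dated file so earlier results are not overwritten. A minimal Python sketch follows, assuming rmlint's `-o format:file` output option behaves as described in its manual (verify with `rmlint --help` on your version). The function names and the report directory are hypothetical.

```python
import datetime
import os
import subprocess


def build_scan_command(paths, report_dir, today=None):
    """Build an rmlint argv that writes dated reports instead of overwriting."""
    today = today or datetime.date.today()
    stem = os.path.join(report_dir, f"rmlint-{today.isoformat()}")
    return [
        "rmlint",
        "-o", f"json:{stem}.json",  # machine-readable report
        "-o", f"sh:{stem}.sh",      # removal script; review before running
        "-o", "summary:stdout",     # short totals on the terminal
        *paths,
    ]


def run_scan(paths, report_dir):
    """Run the scan only; nothing is deleted until the .sh script is executed."""
    os.makedirs(report_dir, exist_ok=True)
    return subprocess.run(build_scan_command(paths, report_dir), check=True)
```

A monthly cron job or systemd timer could call `run_scan(["/majorstorage", "/majorRAID"], "/var/tmp/rmlint-reports")`. The scan itself is read-only, so scheduling it is safe as long as the generated script is still reviewed by hand.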