---
title: "rmlint — Extreme Duplicate File Scanning"
domain: opensource
category: productivity
tags: [rmlint, duplicates, storage, cleanup, linux]
status: published
created: 2026-04-02
updated: 2026-04-02
---

# rmlint — Extreme Duplicate File Scanning

## Problem

Over time, backups and media collections accumulate massive amounts of duplicate data, and traditional duplicate finders are often slow and offer little control over how results are handled. On MajorRAID, I identified **~4.0 TB (113,584 files)** of duplicate data across three storage locations.

## Solution

`rmlint` is an extremely fast tool for finding (and optionally removing) duplicate files. It is significantly faster than `fdupes` or `rdfind` because it works in stages, ruling out candidates by size and partial checksums before resorting to full hashing.

### 1. Installation (Fedora)

```bash
sudo dnf install rmlint
```

### 2. Scanning Multiple Directories

To scan for duplicates across multiple mount points, pass them all in one invocation:

```bash
rmlint /majorstorage /majorRAID /mnt/usb
```

This generates a removal script, `rmlint.sh`, alongside a machine-readable report, `rmlint.json`, and prints a summary of the findings.

### 3. Reviewing Results

**DO NOT** run the generated script without reviewing it first. You can use the summary to see which paths contain the most duplicates:

```bash
# View the summary
cat rmlint.json | jq .
```

### 4. Advanced Usage: Hidden Files and Hard Links

rmlint compares files by content rather than by name, so duplicates with different filenames are found by default. To widen the scan to hidden files and hard-linked copies:

```bash
rmlint --hidden --hard-links /path/to/search
```
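
Because matching is content-based, renamed copies are still caught. A quick sanity check of that idea (hypothetical filenames):

```python
import hashlib
import os
import tempfile

# Two files with unrelated names but identical bytes (hypothetical example).
d = tempfile.mkdtemp()
for name in ("holiday_2019.jpg", "IMG_0042 (copy).jpg"):
    with open(os.path.join(d, name), "wb") as f:
        f.write(b"fake image bytes")

def sha256(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

digests = {sha256(os.path.join(d, n)) for n in os.listdir(d)}
print(f"distinct digests: {len(digests)}")  # 1 — same content, different names
```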
### 5. Repurposing Storage
After scanning and clearing duplicates, you can reclaim significant space. In my case, this was the first step in repurposing a 12TB USB drive as a **SnapRAID parity drive**.
---
## Maintenance
Run a scan monthly or before any major storage consolidation project.
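
One way to make that routine is a crontab entry that writes a dated report for later review (schedule and paths are illustrative; nothing is deleted automatically):

```
# m h dom mon dow  command   (crontab syntax; % must be escaped in cron)
0 3 1 * * rmlint -o json:/var/log/rmlint/report-$(date +\%F).json /majorstorage /majorRAID
```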
---