Syncing from MajorMaciki expansion (Phase 10): 3 new articles and updated indices

Marcus Summers
2026-03-13 12:00:42 -04:00
parent 70d9657b7f
commit 64df4b8cfb
7 changed files with 238 additions and 21 deletions

@@ -0,0 +1,74 @@
# SnapRAID & MergerFS Storage Setup
## Problem
Managing a collection of mismatched hard drives as a single pool while maintaining data redundancy (parity) without the overhead or risk of a traditional RAID 5/6 array.
## Solution
A combination of **MergerFS** for pooling and **SnapRAID** for parity. This is ideal for "mostly static" media storage (like MajorRAID) where files aren't changing every second.
### 1. Concepts
- **MergerFS:** A FUSE-based union filesystem. It takes multiple drives/folders and presents them as a single mount point. It does NOT provide redundancy.
- **SnapRAID:** A backup/parity tool for disk arrays. It creates parity information on a dedicated drive. It is NOT real-time: parity is only updated when you run `snapraid sync`, so files added or changed since the last sync are not protected.
### 2. Implementation Strategy
1. **Clean the Pool:** Use `rmlint` to clear duplicates and reclaim space.
2. **Identify the Parity Drive:** Choose a drive that is at least as large as your largest data drive to hold the parity information. In my setup, `/mnt/usb` (sdc) was cleared of 4TB of duplicates so it could be repurposed for this.
3. **Configure MergerFS:** Pool the data drives (e.g., `/mnt/disk1`, `/mnt/disk2`) into `/storage`.
4. **Configure SnapRAID:** Point SnapRAID to the data drives and the parity drive.
### 3. MergerFS Config (/etc/fstab)
```fstab
# Example MergerFS pool
/mnt/disk*:/mnt/usb-data /storage fuse.mergerfs defaults,allow_other,cache.files=off,use_ino,category.create=mfs,minfreespace=20G,fsname=mergerfsPool 0 0
```
### 4. SnapRAID Config (/etc/snapraid.conf)
```conf
# Parity file location
parity /mnt/parity/snapraid.parity
# Content files (SnapRAID's file index; keep copies on more than one disk)
content /var/snapraid/snapraid.content
content /mnt/disk1/.snapraid.content
content /mnt/disk2/.snapraid.content
# Data drives
data d1 /mnt/disk1/
data d2 /mnt/disk2/
# Exclusions
exclude /lost+found/
exclude /tmp/
exclude .DS_Store
```
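Before the first sync, it can be worth confirming that every `data` path in the config is actually mounted. The following is a minimal sketch, not part of SnapRAID itself: `list_data_dirs` and `check_data_dirs` are hypothetical helper names, and the parsing assumes `data <name> <path>` lines with no spaces in the path.

```bash
# Sketch: list the data-disk paths declared in a snapraid.conf-style file
# and flag any that are not existing directories.
# Assumption: paths contain no spaces (awk splits on whitespace).

list_data_dirs() {
    # "data <name> <path>" lines: print the third field (the disk path).
    awk '$1 == "data" { print $3 }' "$1"
}

check_data_dirs() {
    local conf="$1"
    local missing=0
    local dir
    for dir in $(list_data_dirs "$conf"); do
        if [ -d "$dir" ]; then
            echo "ok      $dir"
        else
            echo "missing $dir"
            missing=1
        fi
    done
    # Non-zero exit status if any data directory was not found.
    return "$missing"
}
```

Calling `check_data_dirs /etc/snapraid.conf` prints one line per data disk and returns a non-zero status if any directory is absent, which makes it usable as a guard in a maintenance script.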
---
## Maintenance
### SnapRAID Sync
Run this daily (via cron) or after adding large amounts of data:
```bash
snapraid sync
```
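For a daily cron job, one option is to skip the sync when nothing has changed. This is a sketch under one stated assumption: that `snapraid diff` exits with 0 when the array is unchanged and 2 when a sync is required, as the SnapRAID manual describes. `sync_if_needed` is a hypothetical helper name.

```bash
# Sketch: run "snapraid sync" only when "snapraid diff" reports pending changes.
# Assumption: "snapraid diff" exits 0 (no changes) or 2 (sync required);
# any other status is treated as an error.

sync_if_needed() {
    local rc=0
    snapraid diff >/dev/null || rc=$?
    case "$rc" in
        0)
            echo "No changes; skipping sync."
            ;;
        2)
            echo "Changes detected; running sync."
            snapraid sync
            ;;
        *)
            echo "snapraid diff failed (exit $rc)" >&2
            return "$rc"
            ;;
    esac
}
```

Saved in a script that ends by calling `sync_if_needed`, this could be scheduled with a crontab entry such as `0 3 * * * /usr/local/bin/snapraid-sync.sh` (the path and time are placeholders).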
### SnapRAID Scrub
Run this weekly to check for bitrot:
```bash
snapraid scrub
```
---
## Tags
#snapraid #mergerfs #linux #storage #homelab #raid

03-opensource/index.md Normal file
@@ -0,0 +1,15 @@
# 📂 Open Source & Alternatives
A curated collection of my favorite open-source tools and privacy-respecting alternatives to mainstream software.
## 🚀 Productivity
- [rmlint: Duplicate File Scanning](productivity/rmlint-duplicate-scanning.md)
## 🛠️ Development Tools
- *Coming soon*
## 🎨 Media & Creative
- *Coming soon*
## 🔐 Privacy & Security
- *Coming soon*

@@ -0,0 +1,58 @@
# rmlint — Extreme Duplicate File Scanning
## Problem
Over time, backups and media collections can accumulate massive amounts of duplicate data. Traditional duplicate finders are often slow and limited in how they handle results. On MajorRAID, I identified **~4.0 TB (113,584 files)** of duplicate data across three different storage points.
## Solution
`rmlint` is a fast tool for finding (and optionally removing) duplicates. It is typically faster than `fdupes` or `rdfind` because it narrows down candidates in stages (for example, by file size) before doing any full hashing.
### 1. Installation (Fedora)
```bash
sudo dnf install rmlint
```
### 2. Scanning Multiple Directories
To scan for duplicates across multiple mount points and compare them:
```bash
rmlint /majorstorage /majorRAID /mnt/usb
```
This generates a removal script named `rmlint.sh` and a JSON report named `rmlint.json` in the current directory, and prints a summary of the findings.
### 3. Reviewing Results
**DO NOT** run the generated script without reviewing it first. The JSON report shows which paths contain the most duplicates:
```bash
# Pretty-print the JSON report
jq . rmlint.json
```
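Pretty-printing a report with over 100,000 entries is hard to read, so a filtered view may be more useful. The sketch below assumes each duplicate entry in `rmlint.json` carries `type`, `is_original`, `path`, and `size` fields; check a few entries of your own report before relying on it. `dupe_paths` and `dupe_bytes` are hypothetical helper names.

```bash
# Sketch: summarize rmlint.json before touching rmlint.sh.
# Assumption: duplicate entries look like
#   {"type": "duplicate_file", "is_original": false, "path": "...", "size": 123, ...}

dupe_paths() {
    # Paths of duplicates that are NOT the copy rmlint chose to keep.
    jq -r '.[] | select(.type == "duplicate_file" and .is_original == false) | .path' "$1"
}

dupe_bytes() {
    # Total size in bytes of those removable duplicates (0 if there are none).
    jq '[.[] | select(.type == "duplicate_file" and .is_original == false) | .size] | add // 0' "$1"
}
```

For example, `dupe_paths rmlint.json | cut -d/ -f2-3 | sort | uniq -c | sort -rn` gives a rough count of removable files per top-level location.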
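### 4. Advanced Usage: Hidden and Hard-Linked Files
`rmlint` already matches files by content, so duplicates with different filenames are found by a plain scan. To widen the scan to hidden files and to count hard-linked files as duplicates: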
```bash
rmlint --hidden --hardlinked /path/to/search
```
### 5. Repurposing Storage
After scanning and clearing duplicates, you can reclaim significant space. In my case, this was the first step in repurposing a 12TB USB drive as a **SnapRAID parity drive**.
---
## Maintenance
Run a scan monthly or before any major storage consolidation project.
---
## Tags
#rmlint #linux #storage #cleanup #duplicates

@@ -0,0 +1,58 @@
# Qwen2.5-14B OOM on RTX 3080 Ti (12GB)
## Problem
When attempting to run or fine-tune **Qwen2.5-14B** on an NVIDIA RTX 3080 Ti with 12GB of VRAM, the process fails with an Out of Memory (OOM) error:
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate X GiB (GPU 0; 12.00 GiB total capacity; Y GiB already allocated; Z GiB free; ...)
```
The 12GB VRAM limit is hit during the initial model load or immediately upon starting the first training step.
## Root Causes
1. **Model Size:** A 14B parameter model in FP16/BF16 requires ~28GB of VRAM just for the weights.
2. **Context Length:** High context lengths (e.g., 4096+) significantly increase VRAM usage during training.
3. **Training Overhead:** Even with QLoRA (4-bit quantization), the overhead of gradients, optimizer states, and activations can exceed 12GB for a 14B model.
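The weight figure in the first point is plain arithmetic: parameters in billions times bytes per parameter gives gigabytes. A small sketch of that calculation follows; `weights_gb` is a hypothetical helper, and the result covers weights only, not activations, gradients, optimizer states, or the KV cache.

```bash
# Sketch: approximate VRAM needed for model weights alone.
#   $1 = parameter count in billions
#   $2 = bits per parameter (16 for FP16/BF16, 4 for 4-bit quantization)
weights_gb() {
    awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

weights_gb 14 16   # 14B in FP16/BF16 -> 28.0
weights_gb 14 4    # 14B in 4-bit     -> 7.0
weights_gb 7 4     # 7B in 4-bit      -> 3.5
```

Even at 4-bit, the 14B weights take roughly 7GB of the 12GB available before any training overhead is counted, which is why the remaining headroom runs out so quickly.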
---
## Solutions
### 1. Pivot to a 7B Model (Recommended)
For a 12GB GPU, a 7B parameter model (like **Qwen2.5-7B-Instruct**) is the sweet spot. It provides excellent performance while leaving enough VRAM for high context lengths and larger batch sizes.
- **VRAM Usage (7B QLoRA):** ~6-8GB
- **Pros:** Stable, fast, supports long context.
- **Cons:** Slightly lower reasoning capability than 14B.
### 2. Aggressive Quantization
If you MUST run 14B, use 4-bit quantization (GGUF or EXL2) for inference only. Training 14B on 12GB is not reliably possible even with extreme offloading.
```bash
# Example Ollama run (uses 4-bit quantization by default)
ollama run qwen2.5:14b
```
### 3. Training Optimizations (if attempting 14B)
If you have no choice but to try 14B training:
- Set `max_seq_length` to 512 or 1024.
- Use `Unsloth` (it is highly memory-efficient).
- Enable `gradient_checkpointing`.
- Set `per_device_train_batch_size = 1`.
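To see whether these settings actually keep a run under the 12GB limit, it can help to watch VRAM headroom during the first few steps. This is a minimal sketch, assuming `nvidia-smi` is on the PATH; `vram_headroom_mib` is a hypothetical helper that only does the subtraction.

```bash
# Sketch: print free VRAM in MiB per GPU from "used, total" CSV lines.
# Intended input is the output of:
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
vram_headroom_mib() {
    awk -F', *' '{ print $2 - $1 }'
}

# Example with a literal line instead of a live GPU query:
echo "10240, 12288" | vram_headroom_mib   # -> 2048
```

Piping the `nvidia-smi` query above into `vram_headroom_mib` inside a `watch -n 1` loop gives a running view of how close the job is to an OOM.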
---
## Maintenance
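Keep your NVIDIA drivers and CUDA toolkit updated. On Windows (MajorRig), ensure WSL2 has sufficient memory allocation in `.wslconfig`. That setting controls system RAM rather than VRAM, but model loading and any CPU offloading depend on it.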
---
## Tags
#gpu #cuda #oom #qwen #majortwin #llm #fine-tuning

@@ -2,8 +2,17 @@
Practical fixes for common Linux, networking, and application problems.
- [ISP SNI Filtering with Caddy](isp-sni-filtering-caddy.md)
- [Obsidian Cache Hang Recovery](obsidian-cache-hang-recovery.md)
- [yt-dlp Fedora JS Challenge](yt-dlp-fedora-js-challenge.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
## 🖥️ GPU & AI
- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](gpu-display/qwen-14b-oom-3080ti.md)
## 🌐 Networking & Web
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md)
- [ISP SNI Filtering & Caddy](isp-sni-filtering-caddy.md)
- [yt-dlp YouTube JS Challenge Fix](yt-dlp-fedora-js-challenge.md)
## 📦 Docker & Systems
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
## 📝 Application Specific
- [Obsidian Vault Recovery — Loading Cache Hang](obsidian-cache-hang-recovery.md)

@@ -1,11 +1,17 @@
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)
* [Introduction](01-linux/index.md)
* [Storage: SnapRAID & MergerFS Setup](01-linux/storage/snapraid-mergerfs-setup.md)
* [Self-Hosting](02-selfhosting/index.md)
* [Introduction](02-selfhosting/index.md)
* [Open Source & Alternatives](03-opensource/index.md)
* [rmlint: Duplicate File Scanning](03-opensource/productivity/rmlint-duplicate-scanning.md)
* [Streaming](04-streaming/index.md)
* [Introduction](04-streaming/index.md)
* [Troubleshooting](05-troubleshooting/index.md)
* [ISP SNI Filtering & Caddy](05-troubleshooting/isp-sni-filtering-caddy.md)
* [yt-dlp YouTube JS Challenge Fix](05-troubleshooting/yt-dlp-fedora-js-challenge.md)
* [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md)
* [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md)
* [Obsidian Vault Recovery — Loading Cache Hang](05-troubleshooting/obsidian-cache-hang-recovery.md)
* [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md)
* [MajorWiki Setup & Publishing Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md)

@@ -3,17 +3,17 @@
> A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin.
>
> **Last updated:** 2026-03-13
> **Article count:** 23
> **Article count:** 27
## Domains
| Domain | Folder | Articles |
|---|---|---|
| 🐧 Linux & Sysadmin | `01-linux/` | 8 |
| 🐧 Linux & Sysadmin | `01-linux/` | 9 |
| 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 8 |
| 🔓 Open Source Tools | `03-opensource/` | 0 |
| 🔓 Open Source Tools | `03-opensource/` | 1 |
| 🎙️ Streaming & Podcasting | `04-streaming/` | 1 |
| 🔧 General Troubleshooting | `05-troubleshooting/` | 6 |
| 🔧 General Troubleshooting | `05-troubleshooting/` | 8 |
---
@@ -35,6 +35,9 @@
- [Ansible Getting Started](01-linux/shell-scripting/ansible-getting-started.md) — inventory, ad-hoc commands, playbooks, handlers, roles
- [Bash Scripting Patterns](01-linux/shell-scripting/bash-scripting-patterns.md) — set -euo pipefail, logging, error handling, argument parsing, common patterns
### Storage
- [SnapRAID & MergerFS Storage Setup](01-linux/storage/snapraid-mergerfs-setup.md) — Pooling mismatched drives and adding parity on Linux
### Distro-Specific
- [Linux Distro Guide for Beginners](01-linux/distro-specific/linux-distro-guide-beginners.md) — Ubuntu recommendation, distro comparison, desktop environments
- [WSL2 Instance Migration to Fedora 43](01-linux/distro-specific/wsl2-instance-migration-fedora43.md) — moving WSL2 VHDX from C: to another drive
@@ -67,7 +70,8 @@
## 🔓 Open Source Tools
*(Articles coming)*
### Productivity
- [rmlint: Duplicate File Scanning](03-opensource/productivity/rmlint-duplicate-scanning.md) — extremely fast duplicate file finding and storage reclamation
---
@@ -84,6 +88,7 @@
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) — fixing docker.socket, SELinux port blocks, and httpd_can_network_connect after reboot
- [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md) — troubleshooting why wiki.majorshouse.com was blocked by Google Fiber
- [Obsidian Cache Hang Recovery](05-troubleshooting/obsidian-cache-hang-recovery.md) — resolving "Loading cache" hang in Obsidian by cleaning Electron app data and ML artifacts
- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md) — fixes and alternatives when hitting VRAM limits during fine-tuning
- [yt-dlp JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md) — fixing YouTube JS challenge solver errors and missing formats on Fedora
- [MajorWiki Setup & Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md) — setting up MajorWiki and the Obsidian → Gitea → MkDocs publishing pipeline
@@ -93,21 +98,14 @@
| Date | Article | Domain |
|---|---|---|
| 2026-03-13 | [rmlint: Duplicate File Scanning](03-opensource/productivity/rmlint-duplicate-scanning.md) | Open Source |
| 2026-03-13 | [SnapRAID & MergerFS Storage Setup](01-linux/storage/snapraid-mergerfs-setup.md) | Linux |
| 2026-03-13 | [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md) | Troubleshooting |
| 2026-03-13 | [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md) | Troubleshooting |
| 2026-03-12 | [Docker & Caddy Recovery After Reboot](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) | Troubleshooting |
| 2026-03-11 | [MajorWiki Setup & Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md) | Troubleshooting |
| 2026-03-11 | [Obsidian Cache Hang Recovery](05-troubleshooting/obsidian-cache-hang-recovery.md) | Troubleshooting |
| 2026-03-11 | [yt-dlp JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md) | Troubleshooting |
| 2026-03-08 | [OBS Studio Setup & Encoding](04-streaming/obs/obs-studio-setup-encoding.md) | Streaming |
| 2026-03-08 | [Linux File Permissions](01-linux/files-permissions/linux-file-permissions.md) | Linux |
| 2026-03-08 | [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md) | Self-Hosting |
| 2026-03-08 | [Tailscale for Homelab Remote Access](02-selfhosting/dns-networking/tailscale-homelab-remote-access.md) | Self-Hosting |
| 2026-03-08 | [Package Management Reference](01-linux/packages/package-management-reference.md) | Linux |
| 2026-03-08 | [Bash Scripting Patterns](01-linux/shell-scripting/bash-scripting-patterns.md) | Linux |
| 2026-03-08 | [Setting Up Caddy as a Reverse Proxy](02-selfhosting/reverse-proxy/setting-up-caddy-reverse-proxy.md) | Self-Hosting |
| 2026-03-08 | [SSH Config & Key Management](01-linux/networking/ssh-config-key-management.md) | Linux |
| 2026-03-08 | [Ansible Getting Started](01-linux/shell-scripting/ansible-getting-started.md) | Linux |
| 2026-03-08 | [Self-Hosting Starter Guide](02-selfhosting/docker/self-hosting-starter-guide.md) | Self-Hosting |
---
@@ -119,6 +117,5 @@
| Docker Compose networking deep dive | Self-Hosting | High | No |
| Troubleshooting NVIDIA on Linux | Troubleshooting | Medium | No |
| Pi-hole setup and local DNS | Self-Hosting | Medium | No |
| OBS audio routing on Linux (PipeWire) | Streaming | Medium | No |
| Nextcloud setup with Docker | Self-Hosting | Medium | No |
| tmux basics | Linux | Low | No |