From b59f6bb6b1564f14c46ac2bfdf995fc40db88b39 Mon Sep 17 00:00:00 2001
From: Marcus Summers
Date: Fri, 13 Mar 2026 12:00:42 -0400
Subject: [PATCH] WSyncing from MajorMaciki expansion (Phase 10): 3 new articles and updated indices

---
 01-linux/storage/snapraid-mergerfs-setup.md   | 74 +++++++++++++++++++
 03-opensource/index.md                        | 15 ++++
 .../productivity/rmlint-duplicate-scanning.md | 58 +++++++++++++++
 .../gpu-display/qwen-14b-oom-3080ti.md        | 58 +++++++++++++++
 05-troubleshooting/index.md                   | 17 ++++-
 SUMMARY.md                                    |  8 +-
 index.md                                      | 29 ++++----
 7 files changed, 238 insertions(+), 21 deletions(-)
 create mode 100644 01-linux/storage/snapraid-mergerfs-setup.md
 create mode 100644 03-opensource/index.md
 create mode 100644 03-opensource/productivity/rmlint-duplicate-scanning.md
 create mode 100644 05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md

diff --git a/01-linux/storage/snapraid-mergerfs-setup.md b/01-linux/storage/snapraid-mergerfs-setup.md
new file mode 100644
index 0000000..36a1aa9
--- /dev/null
+++ b/01-linux/storage/snapraid-mergerfs-setup.md
@@ -0,0 +1,74 @@
+# SnapRAID & MergerFS Storage Setup
+
+## Problem
+
+Managing a collection of mismatched hard drives as a single pool while maintaining data redundancy (parity) without the overhead or risk of a traditional RAID 5/6 array.
+
+## Solution
+
+A combination of **MergerFS** for pooling and **SnapRAID** for parity. This is ideal for "mostly static" media storage (like MajorRAID) where files aren't changing every second.
+
+### 1. Concepts
+
+- **MergerFS:** A FUSE-based union filesystem. It takes multiple drives/folders and presents them as a single mount point. It does NOT provide redundancy.
+- **SnapRAID:** A backup/parity tool for disk arrays. It creates parity information on a dedicated drive. It is NOT real-time (you must run `snapraid sync`).
+
+### 2. Implementation Strategy
+
+1. **Clean the Pool:** Use `rmlint` to clear duplicates and reclaim space.
+2. **Identify the Parity Drive:** Choose your largest drive (or one equal to the largest data drive) to hold the parity information. In my setup, `/mnt/usb` (sdc) was cleared of 4TB of duplicates to be repurposed for this.
+3. **Configure MergerFS:** Pool the data drives (e.g., `/mnt/disk1`, `/mnt/disk2`) into `/storage`.
+4. **Configure SnapRAID:** Point SnapRAID to the data drives and the parity drive.
+
+### 3. MergerFS Config (/etc/fstab)
+
+```fstab
+# Example MergerFS pool
+/mnt/disk*:/mnt/usb-data /storage fuse.mergerfs defaults,allow_other,cache.files=off,use_ino,category.create=mfs,minfreespace=20G,fsname=mergerfsPool 0 0
+```
+
+### 4. SnapRAID Config (/etc/snapraid.conf)
+
+```conf
+# Parity file location
+parity /mnt/parity/snapraid.parity
+
+# Data drives
+content /var/snapraid/snapraid.content
+content /mnt/disk1/.snapraid.content
+content /mnt/disk2/.snapraid.content
+
+data d1 /mnt/disk1/
+data d2 /mnt/disk2/
+
+# Exclusions
+exclude /lost+found/
+exclude /tmp/
+exclude .DS_Store
+```
+
+---
+
+## Maintenance
+
+### SnapRAID Sync
+
+Run this daily (via cron) or after adding large amounts of data:
+
+```bash
+snapraid sync
+```
+
+### SnapRAID Scrub
+
+Run this weekly to check for bitrot:
+
+```bash
+snapraid scrub
+```
+
+---
+
+## Tags
+
+#snapraid #mergerfs #linux #storage #homelab #raid
diff --git a/03-opensource/index.md b/03-opensource/index.md
new file mode 100644
index 0000000..4c3af23
--- /dev/null
+++ b/03-opensource/index.md
@@ -0,0 +1,15 @@
+# 📂 Open Source & Alternatives
+
+A curated collection of my favorite open-source tools and privacy-respecting alternatives to mainstream software.
+
+## 🚀 Productivity
+- [rmlint: Duplicate File Scanning](productivity/rmlint-duplicate-scanning.md)
+
+## 🛠️ Development Tools
+- *Coming soon*
+
+## 🎨 Media & Creative
+- *Coming soon*
+
+## 🔐 Privacy & Security
+- *Coming soon*
diff --git a/03-opensource/productivity/rmlint-duplicate-scanning.md b/03-opensource/productivity/rmlint-duplicate-scanning.md
new file mode 100644
index 0000000..629c0b5
--- /dev/null
+++ b/03-opensource/productivity/rmlint-duplicate-scanning.md
@@ -0,0 +1,58 @@
+# rmlint — Extreme Duplicate File Scanning
+
+## Problem
+
+Over time, backups and media collections can accumulate massive amounts of duplicate data. Traditional duplicate finders are often slow and limited in how they handle results. On MajorRAID, I identified **~4.0 TB (113,584 files)** of duplicate data across three different storage points.
+
+## Solution
+
+`rmlint` is an extremely fast tool for finding (and optionally removing) duplicates. It is significantly faster than `fdupes` or `rdfind` because it uses a multi-stage approach to avoid unnecessary hashing.
+
+### 1. Installation (Fedora)
+
+```bash
+sudo dnf install rmlint
+```
+
+### 2. Scanning Multiple Directories
+
+To scan for duplicates across multiple mount points and compare them:
+
+```bash
+rmlint /majorstorage /majorRAID /mnt/usb
+```
+
+This will generate a script named `rmlint.sh` and a summary of the findings.
+
+### 3. Reviewing Results
+
+**DO NOT** run the generated script without reviewing it first. You can use the summary to see which paths contain the most duplicates:
+
+```bash
+# View the summary
+cat rmlint.json | jq .
+```
+
+### 4. Advanced Usage: Finding Duplicates by Hash Only
+
+If you suspect duplicates with different filenames:
+
+```bash
+rmlint --hidden --hard-links /path/to/search
+```
+
+### 5. Repurposing Storage
+
+After scanning and clearing duplicates, you can reclaim significant space. In my case, this was the first step in repurposing a 12TB USB drive as a **SnapRAID parity drive**.
+
+---
+
+## Maintenance
+
+Run a scan monthly or before any major storage consolidation project.
+
+---
+
+## Tags
+
+#rmlint #linux #storage #cleanup #duplicates
diff --git a/05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md b/05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md
new file mode 100644
index 0000000..23d812c
--- /dev/null
+++ b/05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md
@@ -0,0 +1,58 @@
+# Qwen2.5-14B OOM on RTX 3080 Ti (12GB)
+
+## Problem
+
+When attempting to run or fine-tune **Qwen2.5-14B** on an NVIDIA RTX 3080 Ti with 12GB of VRAM, the process fails with an Out of Memory (OOM) error:
+
+```
+torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate X GiB (GPU 0; 12.00 GiB total capacity; Y GiB already allocated; Z GiB free; ...)
+```
+
+The 12GB VRAM limit is hit during the initial model load or immediately upon starting the first training step.
+
+## Root Causes
+
+1. **Model Size:** A 14B parameter model in FP16/BF16 requires ~28GB of VRAM just for the weights.
+2. **Context Length:** High context lengths (e.g., 4096+) significantly increase VRAM usage during training.
+3. **Training Overhead:** Even with QLoRA (4-bit quantization), the overhead of gradients, optimizer states, and activations can exceed 12GB for a 14B model.
+
+---
+
+## Solutions
+
+### 1. Pivot to a 7B Model (Recommended)
+
+For a 12GB GPU, a 7B parameter model (like **Qwen2.5-7B-Instruct**) is the sweet spot. It provides excellent performance while leaving enough VRAM for high context lengths and larger batch sizes.
+
+- **VRAM Usage (7B QLoRA):** ~6-8GB
+- **Pros:** Stable, fast, supports long context.
+- **Cons:** Slightly lower reasoning capability than 14B.
+
+### 2. Aggressive Quantization
+
+If you MUST run 14B, use 4-bit quantization (GGUF or EXL2) for inference only. Training 14B on 12GB is not reliably possible even with extreme offloading.
+
+```bash
+# Example Ollama run (uses 4-bit quantization by default)
+ollama run qwen2.5:14b
+```
+
+### 3. Training Optimizations (if attempting 14B)
+
+If you have no choice but to try 14B training:
+- Set `max_seq_length` to 512 or 1024.
+- Use `Unsloth` (it is highly memory-efficient).
+- Enable `gradient_checkpointing`.
+- Set `per_device_train_batch_size = 1`.
+
+---
+
+## Maintenance
+
+Keep your NVIDIA drivers and CUDA toolkit updated. On Windows (MajorRig), ensure WSL2 has sufficient memory allocation in `.wslconfig`.
+
+---
+
+## Tags
+
+#gpu #cuda #oom #qwen #majortwin #llm #fine-tuning
diff --git a/05-troubleshooting/index.md b/05-troubleshooting/index.md
index 9bfc177..2ce9737 100644
--- a/05-troubleshooting/index.md
+++ b/05-troubleshooting/index.md
@@ -2,8 +2,17 @@
 
 Practical fixes for common Linux, networking, and application problems.
 
-- [ISP SNI Filtering with Caddy](isp-sni-filtering-caddy.md)
-- [Obsidian Cache Hang Recovery](obsidian-cache-hang-recovery.md)
-- [yt-dlp Fedora JS Challenge](yt-dlp-fedora-js-challenge.md)
-- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
+## 🖥️ GPU & AI
+- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](gpu-display/qwen-14b-oom-3080ti.md)
+
+## 🌐 Networking & Web
+- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md)
+- [ISP SNI Filtering & Caddy](isp-sni-filtering-caddy.md)
+- [yt-dlp YouTube JS Challenge Fix](yt-dlp-fedora-js-challenge.md)
+
+## 📦 Docker & Systems
 - [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)
+- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
+
+## 📝 Application Specific
+- [Obsidian Vault Recovery — Loading Cache Hang](obsidian-cache-hang-recovery.md)
diff --git a/SUMMARY.md b/SUMMARY.md
index bb74759..477ebe0 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -1,11 +1,17 @@
 * [Home](index.md)
 * [Linux & Sysadmin](01-linux/index.md)
   * [Introduction](01-linux/index.md)
+  * [Storage: SnapRAID & MergerFS Setup](01-linux/storage/snapraid-mergerfs-setup.md)
 * [Self-Hosting](02-selfhosting/index.md)
   * [Introduction](02-selfhosting/index.md)
+* [Open Source & Alternatives](03-opensource/index.md)
+  * [rmlint: Duplicate File Scanning](03-opensource/productivity/rmlint-duplicate-scanning.md)
 * [Streaming](04-streaming/index.md)
   * [Introduction](04-streaming/index.md)
 * [Troubleshooting](05-troubleshooting/index.md)
   * [ISP SNI Filtering & Caddy](05-troubleshooting/isp-sni-filtering-caddy.md)
+  * [yt-dlp YouTube JS Challenge Fix](05-troubleshooting/yt-dlp-fedora-js-challenge.md)
   * [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md)
   * [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md)
+  * [Obsidian Vault Recovery — Loading Cache Hang](05-troubleshooting/obsidian-cache-hang-recovery.md)
+  * [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md)
+  * [MajorWiki Setup & Publishing Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md)
diff --git a/index.md b/index.md
index 054cb62..de9c3ce 100644
--- a/index.md
+++ b/index.md
@@ -3,17 +3,17 @@
 > A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin.
 >
 > **Last updated:** 2026-03-13
-> **Article count:** 23
+> **Article count:** 27
 
 ## Domains
 
 | Domain | Folder | Articles |
 |---|---|---|
-| 🐧 Linux & Sysadmin | `01-linux/` | 8 |
+| 🐧 Linux & Sysadmin | `01-linux/` | 9 |
 | 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 8 |
-| 🔓 Open Source Tools | `03-opensource/` | 0 |
+| 🔓 Open Source Tools | `03-opensource/` | 1 |
 | 🎙️ Streaming & Podcasting | `04-streaming/` | 1 |
-| 🔧 General Troubleshooting | `05-troubleshooting/` | 6 |
+| 🔧 General Troubleshooting | `05-troubleshooting/` | 8 |
 
 ---
 
@@ -35,6 +35,9 @@
 - [Ansible Getting Started](01-linux/shell-scripting/ansible-getting-started.md) — inventory, ad-hoc commands, playbooks, handlers, roles
 - [Bash Scripting Patterns](01-linux/shell-scripting/bash-scripting-patterns.md) — set -euo pipefail, logging, error handling, argument parsing, common patterns
 
+### Storage
+- [SnapRAID & MergerFS Storage Setup](01-linux/storage/snapraid-mergerfs-setup.md) — Pooling mismatched drives and adding parity on Linux
+
 ### Distro-Specific
 - [Linux Distro Guide for Beginners](01-linux/distro-specific/linux-distro-guide-beginners.md) — Ubuntu recommendation, distro comparison, desktop environments
 - [WSL2 Instance Migration to Fedora 43](01-linux/distro-specific/wsl2-instance-migration-fedora43.md) — moving WSL2 VHDX from C: to another drive
@@ -67,7 +70,8 @@
 
 ## 🔓 Open Source Tools
 
-*(Articles coming)*
+### Productivity
+- [rmlint: Duplicate File Scanning](03-opensource/productivity/rmlint-duplicate-scanning.md) — extremely fast duplicate file finding and storage reclamation
 
 ---
 
@@ -84,6 +88,7 @@
 - [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) — fixing docker.socket, SELinux port blocks, and httpd_can_network_connect after reboot
 - [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md) — troubleshooting why wiki.majorshouse.com was blocked by Google Fiber
 - [Obsidian Cache Hang Recovery](05-troubleshooting/obsidian-cache-hang-recovery.md) — resolving "Loading cache" hang in Obsidian by cleaning Electron app data and ML artifacts
+- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md) — fixes and alternatives when hitting VRAM limits during fine-tuning
 - [yt-dlp JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md) — fixing YouTube JS challenge solver errors and missing formats on Fedora
 - [MajorWiki Setup & Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md) — setting up MajorWiki and the Obsidian → Gitea → MkDocs publishing pipeline
 
@@ -93,21 +98,14 @@
 
 | Date | Article | Domain |
 |---|---|---|
+| 2026-03-13 | [rmlint: Duplicate File Scanning](03-opensource/productivity/rmlint-duplicate-scanning.md) | Open Source |
+| 2026-03-13 | [SnapRAID & MergerFS Storage Setup](01-linux/storage/snapraid-mergerfs-setup.md) | Linux |
+| 2026-03-13 | [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md) | Troubleshooting |
 | 2026-03-13 | [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md) | Troubleshooting |
 | 2026-03-12 | [Docker & Caddy Recovery After Reboot](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) | Troubleshooting |
 | 2026-03-11 | [MajorWiki Setup & Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md) | Troubleshooting |
 | 2026-03-11 | [Obsidian Cache Hang Recovery](05-troubleshooting/obsidian-cache-hang-recovery.md) | Troubleshooting |
 | 2026-03-11 | [yt-dlp JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md) | Troubleshooting |
-| 2026-03-08 | [OBS Studio Setup & Encoding](04-streaming/obs/obs-studio-setup-encoding.md) | Streaming |
-| 2026-03-08 | [Linux File Permissions](01-linux/files-permissions/linux-file-permissions.md) | Linux |
-| 2026-03-08 | [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md) | Self-Hosting |
-| 2026-03-08 | [Tailscale for Homelab Remote Access](02-selfhosting/dns-networking/tailscale-homelab-remote-access.md) | Self-Hosting |
-| 2026-03-08 | [Package Management Reference](01-linux/packages/package-management-reference.md) | Linux |
-| 2026-03-08 | [Bash Scripting Patterns](01-linux/shell-scripting/bash-scripting-patterns.md) | Linux |
-| 2026-03-08 | [Setting Up Caddy as a Reverse Proxy](02-selfhosting/reverse-proxy/setting-up-caddy-reverse-proxy.md) | Self-Hosting |
-| 2026-03-08 | [SSH Config & Key Management](01-linux/networking/ssh-config-key-management.md) | Linux |
-| 2026-03-08 | [Ansible Getting Started](01-linux/shell-scripting/ansible-getting-started.md) | Linux |
-| 2026-03-08 | [Self-Hosting Starter Guide](02-selfhosting/docker/self-hosting-starter-guide.md) | Self-Hosting |
 
 ---
 
@@ -119,6 +117,5 @@
 | Docker Compose networking deep dive | Self-Hosting | High | No |
 | Troubleshooting NVIDIA on Linux | Troubleshooting | Medium | No |
 | Pi-hole setup and local DNS | Self-Hosting | Medium | No |
-| OBS audio routing on Linux (PipeWire) | Streaming | Medium | No |
 | Nextcloud setup with Docker | Self-Hosting | Medium | No |
 | tmux basics | Linux | Low | No |