Compare commits


51 Commits

Author SHA1 Message Date
56f1014f73 Add troubleshooting article: wget/curl URLs with special characters
Covers shell quoting for URLs containing &, ?, #, and other characters
that Bash interprets as operators. Common gotcha when downloading from
CDNs with token-based URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:18:34 -04:00
5af934a6c6 wiki: update SSH docs with bash.exe default shell fix and Windows admin key auth
- ssh-config-key-management: add Windows OpenSSH admin user key auth section
  (administrators_authorized_keys, BOM-free writing, ACL requirements)
- windows-openssh-wsl-default-shell: add bash.exe as recommended fix (Option 1),
  demote PowerShell to Option 2, add shell-not-found diagnostic tip
- windows-sshd-stops-after-reboot: fix stale wsl.exe reference to bash.exe
- index/README: update Recently Updated table and article descriptions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:01:36 -04:00
84a1893e80 wiki: fix article count to 73, update frontmatter timestamps
Corrected inflated article count (was 76, actual is 73).
Updated domain breakdown and frontmatter timestamps from Obsidian.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:51:23 -04:00
daa771760b wiki: add WSL OpenSSH default shell + Ansible world-writable mount articles
Two new troubleshooting articles from today's MajorRig/MajorMac Ansible setup:
- Windows OpenSSH WSL default shell breaks remote SSH commands
- Ansible silently ignores ansible.cfg on WSL2 world-writable mounts

Article count: 76

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:23:02 -04:00
c66d3a6fd0 Update UFW article: add web server ports lesson from tttpod outage
Adds a section documenting how missing HTTP/HTTPS rules caused a
site outage on tttpod, and updates the fleet reference table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 03:57:27 -04:00
1a00fef199 Update wiki indexes for WordPress login jail article
Article count 73 → 74. Added to SUMMARY.md, index.md, README.md,
and 02-selfhosting/index.md (which was also missing 5 other security
articles from prior sessions).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:07:08 -04:00
9a7e43e67d Add wiki article: Fail2ban WordPress login brute force jail
Access-log-based filter for wp-login.php brute force detection without
requiring the WP fail2ban plugin. Documents the backend=polling gotcha
on Ubuntu 24.04 and manual banning workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:04:13 -04:00
6592eb4fea wiki: audit fixes — broken links, wikilinks, frontmatter, stale content (66 files)
- Fixed 4 broken markdown links (bad relative paths in See Also sections)
- Corrected n8n port binding to 127.0.0.1:5678 (matches actual deployment)
- Updated SnapRAID article with actual majorhome paths (/majorRAID, disk1-3)
- Converted 67 Obsidian wikilinks to relative markdown links or plain text
- Added YAML frontmatter to 35 articles missing it entirely
- Completed frontmatter on 8 articles with missing fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 11:16:29 -04:00
6da77c2db7 wiki: remove Obsidian-style hashtag tags from 12 articles
These #hashtag tag lines render as plain text on MkDocs. All articles
already have tags in YAML frontmatter, so the inline tags were redundant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 11:03:28 -04:00
6f53b7c6db wiki: fix broken wikilinks in index and README related sections
Removed Obsidian [[wikilinks]] pointing to vault-only docs (01-Phases, majorlab)
that don't resolve on the MkDocs site. Kept deploy status as a proper markdown link.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 10:59:17 -04:00
6d81e7f020 wiki: add 4 new articles from archive, merge 8 archive notes into existing articles (73 articles)
New: mdadm RAID rebuild, Mastodon instance tuning, Ventoy, Fedora networking/kernel recovery.
Merged: Glacier Deep Archive into rsync, SpamAssassin into hardening checklist,
OBS captions/VLC capture into OBS setup, yt-dlp subtitles/temp fix into yt-dlp.
Updated index.md, README.md, SUMMARY.md with 21 previously missing articles.
Fixed merge conflict in index.md Recently Updated table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 10:55:53 -04:00
2045c090c0 wiki: add UFW firewall management article and pending articles (63 articles)
New articles: UFW firewall management, Fail2ban Apache 404 scanner jail,
SELinux Fail2ban execmem fix, updating n8n Docker, Ansible SSH timeout
during dnf upgrade, n8n proxy X-Forwarded-For fix, macOS mirrored
notification alert loop. Updated dca→dcaprod reference in network overview.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 09:49:48 -04:00
ca7ddb67f2 wiki: add SELinux fail2ban execmem fix + pending articles
New article: selinux-fail2ban-execmem-fix.md — custom policy module
for fail2ban grep execmem denial on Fedora 43.

Also includes previously uncommitted:
- n8n-proxy-trust-x-forwarded-for.md
- fail2ban-apache-404-scanner-jail.md updates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 09:51:33 -04:00
6e131637a1 wiki: add backend=polling gotcha to apache-404scan jail article
Global backend=systemd in jail.local silently breaks file-based jails.
Added required backend=polling to config, diagnostic command, and warning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:14:36 -04:00
0df5ace1a2 wiki: add n8n reverse proxy X-Forwarded-For trust fix article
Documents the N8N_PROXY_HOPS env var needed for n8n behind Caddy/Nginx
when N8N_TRUST_PROXY alone is insufficient in newer versions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:48:01 -04:00
6dccc43d15 Add n8n Docker update guide
Covers version checking, pinned-tag update process, SQLite password
reset, and why Arcane may not catch updates when the latest tag lags
behind npm releases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 15:08:30 -04:00
MajorLinux ed810ebdf9 Add: macOS repeating alert tone from mirrored iPhone notification 2026-03-30 07:15:09 -04:00
1bb872ef75 Add Ansible SSH timeout troubleshooting article
Documents the SSH keepalive fix for dnf upgrade timeouts on Fedora hosts,
plus the do-agent task guard fix. Also adds Ansible & Fleet Management
section to the troubleshooting index.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:22:48 -04:00
23a35e021b wiki: add fail2ban apache 404 scanner jail article
New guide for custom access-log-based fail2ban jail that catches
rapid-fire 404 vulnerability scanners missed by default error-log jails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:22:19 -04:00
9acd083577 wiki: add fail2ban UFW rule bloat and Apache dirscan jail articles (56 articles)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 00:54:06 -04:00
cfaee5cf43 wiki: document Nextcloud AIO 20h unhealthy incident and watchdog cron fix
Add troubleshooting article for the 2026-03-27 incident where PHP-FPM
hung after the nightly update cycle. Update the Netdata Docker alarm
tuning article with the dedicated Nextcloud alarm split and the new
watchdog cron deployed to majorlab. (54 articles)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 00:52:49 -04:00
d37bd60a24 wiki: add systemd session scope failure troubleshooting article
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 11:22:44 -04:00
8c22ee708d merge: resolve conflicts, add SELinux AVC chart article; update indexes to 53
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 03:36:49 -04:00
fb2e3f6168 wiki: add SELinux AVC chart, enriched alerts, new server setup, and pending articles; update indexes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 03:34:33 -04:00
0e640a3fff wiki: add ClamAV safe scheduling article; update Netdata new server setup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:36:49 -04:00
d1e9571761 wiki: update Netdata Docker alarm tuning — add docker_container_down suppression
Nextcloud AIO borgbackup and watchtower exit normally after nightly update/backup
cycles. Added docker_container_down override with chart labels to exclude them,
preventing false alerts. Documents chart labels pattern syntax.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:17:31 -04:00
9e205f60e4 wiki: add Netdata n8n enriched alert pipeline article (51 articles) 2026-03-21 04:25:56 -04:00
c4d3f8e974 wiki: add Tailscale SSH reauth article; update Netdata Docker alarm tuning (50 articles)
- New: Tailscale SSH unexpected re-authentication prompt — diagnosis and fix
- Updated: netdata-docker-health-alarm-tuning — add delay: up 3m to suppress
  Nextcloud AIO PHP-FPM ~90s startup false alerts; update settings table and notes
- Updated: 05-troubleshooting/index.md and SUMMARY.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 00:12:52 -04:00
4d59856c1e wiki: add Netdata new server deployment guide (49 articles) 2026-03-18 11:00:41 -04:00
38fe720e63 wiki: add Netdata Docker health alarm tuning article; update indexes to 48
- 02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md — new
- lookup extended to 5m average, delay: down 5m to prevent Nextcloud AIO update flapping
- SUMMARY.md, index.md, README.md, deploy status updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 00:10:36 -04:00
59a5cc530e wiki: add Windows sshd and Ollama/Tailscale sleep articles; update indexes to 47
- 05-troubleshooting/networking/windows-sshd-stops-after-reboot.md
- 05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md
- SUMMARY.md, index.md, README.md: count 45 → 47, add 5 missing articles (3 from 2026-03-16 + 2 today)
- MajorWiki-Deploy-Status.md: session update 2026-03-17

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 21:20:15 -04:00
e8598cfac8 wiki: add WSL2 backup, Fedora43 training env, Ansible upgrades, firewalld mail ports articles; update indexes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 16:47:02 -04:00
6a4681dc4b merge: resolve conflicts, keep firewalld article and count 42
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 16:15:50 -04:00
279c094afc wiki: add firewalld mail ports reset article + session updates
- New article: firewalld mail ports wiped after reload (IMAP + webmail outage)
- New article: Plex 4K codec compatibility (Apple TV)
- New article: mdadm RAID recovery after USB hub disconnect
- Updated yt-dlp article
- Updated all index files: SUMMARY.md, index.md, README.md, category indexes
- Article count: 41 → 42

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 16:15:02 -04:00
7fb739d3a2 wiki: add Plex 4K codec guide and mdadm USB recovery; update yt-dlp, indexes
New articles:
- 04-streaming/plex/plex-4k-codec-compatibility.md
- 05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md

Updated:
- yt-dlp.md: Plex section and config reflect new HEVC auto-convert workflow
- SUMMARY.md, index.md, README.md, section indexes: 39 → 41 articles
- MajorWiki-Deploy-Status.md: count + date

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 07:12:09 -04:00
0bcc2c822a wiki: add SELinux vmail and gitea-runner articles; update indexes
- New: SELinux Fixing Dovecot Mail Spool Context (/var/vmail)
  Corrected fix — mail_spool_t only, no dovecot_tmp_t on tmp/ dirs.
  Includes warning and recovery steps for the Postfix delivery outage.
- New: Gitea Actions Runner Boot Race Condition Fix
  network-online.target dependency, RestartSec=10, /etc/hosts workaround.
- Updated SUMMARY.md, index.md, README.md, 05-troubleshooting/index.md
- Article count: 37 → 39; MajorWiki-Deploy-Status updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 22:49:01 -04:00
3159bbfb48 merge: resolve conflicts, keep new IMAP self-ban article
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 22:03:16 -04:00
deb32ce756 wiki: expand SUMMARY.md to include all articles across all sections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 11:01:21 -04:00
b81c8feda0 wiki: add alternatives section with SearXNG, FreshRSS, and Gitea
Add three new articles to 03-opensource/alternatives/:
- SearXNG: private metasearch, Open WebUI integration
- FreshRSS: self-hosted RSS, mobile app sync, OPML portability
- Gitea: lightweight GitHub alternative, webhook pipeline

Article count: 33 → 36. Open source section: 6 → 9.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 00:37:42 -04:00
31d0a9806d wiki: add yt-dlp article to media-creative section
Cover installation, Plex-optimized format selection, playlist
downloading, config file, and background session usage. Cross-reference
existing JS challenge troubleshooting article.

Article count: 32 → 33. Open source section: 5 → 6.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 00:33:58 -04:00
6e0ceb0972 wiki: add Vaultwarden article to privacy-security section
Add 03-opensource/privacy-security/vaultwarden.md covering deployment
with Docker Compose, Caddy reverse proxy, client setup, access model
via Tailscale, and SQLite backup. Remove KeePassXC from backlog.

Article count: 31 → 32. Open source section: 4 → 5.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 23:48:40 -04:00
4f3e5877ae wiki: add dev-tools section with tmux, screen, and rsync articles
Add three new articles to 03-opensource/dev-tools/:
- tmux: persistent terminal sessions, background jobs, capture-pane
- screen: lightweight alternative, comparison table
- rsync: flags reference, resumable transfers, SSH usage

Update all indexes (SUMMARY, section index, main index, README).
Article count: 28 → 31. Remove tmux from writing backlog.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 23:33:38 -04:00
2e5512ed97 wiki: document maintenance protocol 2026-03-13 22:50:59 -04:00
4bfb99efa6 wiki: update main index and readme with new articles 2026-03-13 22:49:56 -04:00
697269f574 merge: resolve conflicts in summary and troubleshooting index 2026-03-13 22:46:42 -04:00
2861cade55 wiki: add manual update guide for Gemini CLI 2026-03-13 22:45:52 -04:00
b59f6bb6b1 Wiki expansion (Phase 10): 3 new articles and updated indices (synced from MajorMac) 2026-03-13 12:02:11 -04:00
70d9657b7f vault backup: 2026-03-13 01:39:29 2026-03-13 01:39:29 -04:00
c4673f70e0 Add .gitattributes to enforce LF line endings 2026-03-13 01:37:45 -04:00
9d537dec5f Resolve merge conflicts in SUMMARY.md and index.md; add fail2ban self-ban article 2026-03-13 01:35:11 -04:00
639b23f861 vault backup: 2026-03-13 01:31:25 2026-03-13 01:31:25 -04:00
86 changed files with 6144 additions and 179 deletions

.gitattributes (new file, 18 lines)

@@ -0,0 +1,18 @@
# Normalize line endings to LF for all text files
* text=auto eol=lf
# Explicitly handle markdown
*.md text eol=lf
# Explicitly handle config files
*.yml text eol=lf
*.yaml text eol=lf
*.json text eol=lf
*.toml text eol=lf
# Binary files — don't touch
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.pdf binary
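One follow-up the commit message implies but doesn't show: files already committed with CRLF keep their stored line endings until renormalized. A sketch in a throwaway repo (assuming Git >= 2.16 for `--renormalize`; names and messages are illustrative):

```shell
# Demo in a throwaway repo: commit a CRLF file, add the attributes, renormalize
cd "$(mktemp -d)"
git init -q && git config user.email wiki@example.com && git config user.name wiki
printf 'a\r\nb\r\n' > f.txt
git add f.txt && git commit -qm "initial (CRLF)"
printf '* text=auto eol=lf\n' > .gitattributes
git add .gitattributes
git add --renormalize .          # re-apply eol rules to already-tracked files
git commit -qm "Normalize line endings to LF"
git show HEAD:f.txt | od -c      # stored blob is now LF-only
```

In an existing repo, only the last four commands matter; the renormalize commit is what actually converts blobs the attributes file alone doesn't touch.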


@@ -86,5 +86,5 @@ Be specific when asking for help. Include your distro and version, what you trie
 ## See Also
-- [[wsl2-instance-migration-fedora43]]
-- [[managing-linux-services-systemd-ansible]]
+- [wsl2-instance-migration-fedora43](wsl2-instance-migration-fedora43.md)
+- [managing-linux-services-systemd-ansible](../process-management/managing-linux-services-systemd-ansible.md)


@@ -0,0 +1,86 @@
---
title: WSL2 Backup via PowerShell Scheduled Task
domain: linux
category: distro-specific
tags:
- wsl2
- windows
- backup
- powershell
- majorrig
status: published
created: '2026-03-16'
updated: '2026-03-16'
---
# WSL2 Backup via PowerShell Scheduled Task
WSL2 distributions are stored as a VHDX file on disk. Unlike traditional VMs, there's no built-in snapshot or backup mechanism. This article covers a simple weekly backup strategy using `wsl --export` and a PowerShell scheduled task.
## The Short Answer
Save this as `C:\Users\majli\Scripts\backup-wsl.ps1` and register it as a weekly scheduled task.
## Backup Script
```powershell
$BackupDir = "D:\WSL\Backups"
$Date = Get-Date -Format "yyyy-MM-dd"
$BackupFile = "$BackupDir\FedoraLinux-43-$Date.tar"
$MaxBackups = 3
New-Item -ItemType Directory -Force -Path $BackupDir | Out-Null
# Must shut down WSL first — export fails if VHDX is locked
Write-Host "Shutting down WSL2..."
wsl --shutdown
Start-Sleep -Seconds 5
Write-Host "Backing up FedoraLinux-43 to $BackupFile..."
wsl --export FedoraLinux-43 $BackupFile
if ($LASTEXITCODE -eq 0) {
Write-Host "Backup complete: $BackupFile"
Get-ChildItem "$BackupDir\FedoraLinux-43-*.tar" |
Sort-Object LastWriteTime -Descending |
Select-Object -Skip $MaxBackups |
Remove-Item -Force
Write-Host "Cleanup done. Keeping last $MaxBackups backups."
} else {
Write-Host "ERROR: Backup failed!"
exit 1
}
```
## Register the Scheduled Task
Run in PowerShell as Administrator:
```powershell
$Action = New-ScheduledTaskAction -Execute "PowerShell.exe" `
-Argument "-NonInteractive -File C:\Users\majli\Scripts\backup-wsl.ps1"
$Trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Sunday -At 2am
$Settings = New-ScheduledTaskSettingsSet -StartWhenAvailable -RunOnlyIfNetworkAvailable:$false
Register-ScheduledTask -TaskName "WSL2 Backup - FedoraLinux43" `
-Action $Action -Trigger $Trigger -Settings $Settings `
-RunLevel Highest -Force
```
## Restore from Backup
```powershell
wsl --unregister FedoraLinux-43
wsl --import FedoraLinux-43 D:\WSL\Fedora43 D:\WSL\Backups\FedoraLinux-43-YYYY-MM-DD.tar
```
Then fix the default user — after import WSL resets to root. See [WSL2 Instance Migration](wsl2-instance-migration-fedora43.md) for the `/etc/wsl.conf` fix.
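For reference, the relevant `/etc/wsl.conf` is a two-section file; a sketch, assuming the `majorlinux` user from this setup:

```ini
[boot]
systemd=true

[user]
default=majorlinux
```

Run `wsl --shutdown` from Windows after writing it so the default user takes effect on the next launch.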
## Gotchas
- **`wsl --export` fails with `ERROR_SHARING_VIOLATION` if WSL is running.** The script includes `wsl --shutdown` before export to handle this. Any active WSL sessions will be terminated — schedule the task for a time when WSL is idle (2am works well).
- **Backblaze picks up D:\WSL\Backups\ automatically** if D: drive is in scope — provides offsite backup without extra config.
- **Each backup tar is ~500MB–1GB** depending on what's installed. Keep MaxBackups at 3 to balance retention vs disk usage.
## See Also
- [WSL2 Instance Migration](wsl2-instance-migration-fedora43.md)
- [WSL2 Training Environment Rebuild](wsl2-rebuild-fedora43-training-env.md)


@@ -97,5 +97,5 @@ alias clean='sudo dnf clean all'
 ## See Also
-- [[Managing disk space on MajorRig]]
-- [[Unsloth QLoRA fine-tuning setup]]
+- Managing disk space on MajorRig
+- Unsloth QLoRA fine-tuning setup


@@ -0,0 +1,203 @@
---
title: WSL2 Fedora 43 Training Environment Rebuild
domain: linux
category: distro-specific
tags:
- wsl2
- fedora
- unsloth
- pytorch
- cuda
- majorrig
- majortwin
status: published
created: '2026-03-16'
updated: '2026-03-16'
---
# WSL2 Fedora 43 Training Environment Rebuild
How to rebuild the MajorTwin training environment from scratch on MajorRig after a WSL2 loss. Covers Fedora 43 install, Python 3.11 via pyenv, PyTorch with CUDA, Unsloth, and llama.cpp for GGUF conversion.
## The Short Answer
```bash
# 1. Install Fedora 43 and move to D:
wsl --install -d FedoraLinux-43 --no-launch
wsl --export FedoraLinux-43 D:\WSL\fedora43.tar
wsl --unregister FedoraLinux-43
wsl --import FedoraLinux-43 D:\WSL\Fedora43 D:\WSL\fedora43.tar
# 2. Set default user
echo -e "[boot]\nsystemd=true\n[user]\ndefault=majorlinux" | sudo tee /etc/wsl.conf
useradd -m -G wheel majorlinux && passwd majorlinux
echo "%wheel ALL=(ALL) ALL" | sudo tee /etc/sudoers.d/wheel
# 3. Install Python 3.11 via pyenv, PyTorch, Unsloth
# See full steps below
```
## Step 1 — System Packages
```bash
sudo dnf update -y
sudo dnf install -y git curl wget tmux screen htop rsync unzip \
python3 python3-pip python3-devel gcc gcc-c++ make cmake \
ninja-build pkg-config openssl-devel libffi-devel \
gawk patch readline-devel sqlite-devel
```
## Step 2 — Python 3.11 via pyenv
Fedora 43 ships Python 3.13. Unsloth requires 3.11. Use pyenv:
```bash
curl https://pyenv.run | bash
# Add to ~/.bashrc
export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init - bash)"
source ~/.bashrc
pyenv install 3.11.9
pyenv global 3.11.9
```
The tkinter warning during install is harmless — it's not needed for training.
## Step 3 — Training Virtualenv + PyTorch
```bash
mkdir -p ~/majortwin/{staging,datasets,outputs,scripts}
python -m venv ~/majortwin/venv
source ~/majortwin/venv/bin/activate
pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
# Verify GPU
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```
Expected output: `True NVIDIA GeForce RTX 3080 Ti`
## Step 4 — Unsloth + Training Stack
```bash
source ~/majortwin/venv/bin/activate
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install transformers datasets accelerate peft trl bitsandbytes \
sentencepiece protobuf scipy einops
# Pin transformers for unsloth-zoo compatibility
pip install "transformers<=5.2.0"
# Verify
python -c "import unsloth; print('Unsloth OK')"
```
> [!warning] Never run `pip install -r requirements.txt` from inside llama.cpp while the training venv is active. It installs CPU-only PyTorch and downgrades transformers, breaking the CUDA setup.
## Step 5 — llama.cpp (CPU-only for GGUF conversion)
CUDA 12.8 is incompatible with Fedora 43's glibc for compiling llama.cpp (math function conflicts in `/usr/include/bits/mathcalls.h`). Build CPU-only — it's sufficient for GGUF conversion, which doesn't need GPU:
```bash
# Install GCC 14 (CUDA 12.8 doesn't support GCC 15 which Fedora 43 ships)
sudo dnf install -y gcc14 gcc14-c++
cd ~/majortwin
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build \
-DGGML_CUDA=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER=/usr/bin/gcc-14 \
-DCMAKE_CXX_COMPILER=/usr/bin/g++-14
cmake --build build --config Release -j$(nproc) 2>&1 | tee /tmp/llama_build.log &
tail -f /tmp/llama_build.log
```
Verify:
```bash
ls ~/majortwin/llama.cpp/build/bin/llama-quantize && echo "OK"
ls ~/majortwin/llama.cpp/build/bin/llama-cli && echo "OK"
```
## Step 6 — Shell Environment
```bash
cat >> ~/.bashrc << 'EOF'
# MajorInfrastructure Paths
export VAULT="/mnt/c/Users/majli/Documents/MajorVault"
export MAJORANSIBLE="/mnt/d/MajorAnsible"
export MAJORTWIN_D="/mnt/d/MajorTwin"
export MAJORTWIN_WSL="$HOME/majortwin"
export LLAMA_CPP="$HOME/majortwin/llama.cpp"
# Venv
alias mtwin='source $MAJORTWIN_WSL/venv/bin/activate && cd $MAJORTWIN_WSL'
alias vault='cd $VAULT'
alias ll='ls -lah --color=auto'
# SSH Fleet Aliases
alias majorhome='ssh majorlinux@100.120.209.106'
alias dca='ssh root@100.104.11.146'
alias majortoot='ssh root@100.110.197.17'
alias majorlinuxvm='ssh root@100.87.200.5'
alias majordiscord='ssh root@100.122.240.83'
alias majorlab='ssh root@100.86.14.126'
alias majormail='ssh root@100.84.165.52'
alias teelia='ssh root@100.120.32.69'
alias tttpod='ssh root@100.84.42.102'
alias majorrig='ssh majorlinux@100.98.47.29' # port 2222 retired 2026-03-25, fleet uses port 22
# DNF5
alias update='sudo dnf upgrade --refresh'
alias install='sudo dnf install'
alias clean='sudo dnf clean all'
# MajorTwin helpers
stage_dataset() {
cp "$VAULT/20-Projects/MajorTwin/03-Datasets/$1" "$MAJORTWIN_WSL/datasets/"
echo "Staged: $1"
}
export_gguf() {
cp "$MAJORTWIN_WSL/outputs/$1" "$MAJORTWIN_D/models/"
echo "Exported: $1 → $MAJORTWIN_D/models/"
}
EOF
source ~/.bashrc
```
## Key Rules
- **Always activate venv before pip installs:** `source ~/majortwin/venv/bin/activate`
- **Never train from /mnt/c or /mnt/d** — stage files in `~/majortwin/staging/` first
- **Never put ML artifacts inside MajorVault** — models, venvs, artifacts go on D: drive
- **Max viable training model:** 7B at QLoRA 4-bit (RTX 3080 Ti, 12GB VRAM)
- **Current base model:** Qwen2.5-7B-Instruct (ChatML format — stop token: `<|im_end|>` only)
- **Transformers must be pinned:** `pip install "transformers<=5.2.0"` for unsloth-zoo compatibility
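The stop-token rule is easiest to see with a literal example. A sketch of one ChatML exchange in the layout Qwen2.5-Instruct expects (the content strings are illustrative, not from the training data):

```shell
# One ChatML exchange: each turn is <|im_start|>role ... <|im_end|>,
# and generation stops at <|im_end|> only
printf '<|im_start|>system\nYou are MajorTwin.<|im_end|>\n<|im_start|>user\nStatus?<|im_end|>\n<|im_start|>assistant\n'
```

The trailing `<|im_start|>assistant\n` with no closing token is the generation prompt; the model's reply ends when it emits `<|im_end|>`.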
## D: Drive Layout
```
D:\MajorTwin\
models\ ← finished GGUFs
datasets\ ← dataset archives
artifacts\ ← training run artifacts
training-runs\ ← logs, checkpoints
D:\WSL\
Fedora43\ ← WSL2 VHDX
Backups\ ← weekly WSL2 backup tars
```
## See Also
- [WSL2 Instance Migration](wsl2-instance-migration-fedora43.md)
- [WSL2 Backup via PowerShell](wsl2-backup-powershell.md)


@@ -152,6 +152,6 @@ find /var/www/html -type d -exec chmod 755 {} \;
 ## See Also
-- [[linux-server-hardening-checklist]]
-- [[ssh-config-key-management]]
-- [[bash-scripting-patterns]]
+- [linux-server-hardening-checklist](../../02-selfhosting/security/linux-server-hardening-checklist.md)
+- [ssh-config-key-management](../networking/ssh-config-key-management.md)
+- [bash-scripting-patterns](../shell-scripting/bash-scripting-patterns.md)


@@ -1,11 +1,16 @@
 ---
-title: "SSH Config and Key Management"
+title: SSH Config and Key Management
 domain: linux
 category: networking
-tags: [ssh, keys, security, linux, remote-access]
+tags:
+- ssh
+- keys
+- security
+- linux
+- remote-access
 status: published
 created: 2026-03-08
-updated: 2026-03-08
+updated: 2026-04-07T21:55
 ---
# SSH Config and Key Management
@@ -129,7 +134,51 @@ If key auth isn't working and the config looks right, permissions are the first
- **`ServerAliveInterval` in your config** keeps connections from timing out on idle sessions. Saves you from the annoyance of reconnecting after stepping away.
- **Never put private keys in cloud storage, Git repos, or Docker images.** It happens more than you'd think.
## Windows OpenSSH: Admin User Key Auth
Windows OpenSSH has a separate key file for users in the `Administrators` group. Regular `~/.ssh/authorized_keys` is **ignored** for admin users unless the `Match Group administrators` block in `sshd_config` is disabled.
### Where keys go
| User type | Key file |
|---|---|
| Regular user | `C:\Users\<user>\.ssh\authorized_keys` |
| Admin user | `C:\ProgramData\ssh\administrators_authorized_keys` |
### Setup (elevated PowerShell)
1. **Enable the Match block** in `C:\ProgramData\ssh\sshd_config` — both lines must be uncommented:
```
Match Group administrators
AuthorizedKeysFile __PROGRAMDATA__/ssh/administrators_authorized_keys
```
2. **Write the key file without BOM** — PowerShell 5 defaults to UTF-16LE or UTF-8 with BOM, both of which OpenSSH silently rejects:
```powershell
[System.IO.File]::WriteAllText(
"C:\ProgramData\ssh\administrators_authorized_keys",
"ssh-ed25519 AAAA... user@hostname`n",
[System.Text.UTF8Encoding]::new($false)
)
```
3. **Lock down permissions** — OpenSSH requires strict ACLs:
```powershell
icacls "C:\ProgramData\ssh\administrators_authorized_keys" /inheritance:r /grant "SYSTEM:(F)" /grant "Administrators:(F)"
```
4. **Restart sshd:**
```powershell
Restart-Service sshd
```
### Troubleshooting
- If key auth silently fails, check `Get-WinEvent -LogName OpenSSH/Operational -MaxEvents 10`
- Common cause: BOM in the key file or `sshd_config` — PowerShell file-writing commands are the usual culprit
- If the log says `User not allowed because shell does not exist`, the `DefaultShell` registry path is wrong — see [WSL default shell troubleshooting](../../05-troubleshooting/networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md)
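Since a BOM is invisible in most editors, a byte-level check confirms the culprit quickly. A sketch runnable from WSL or Git Bash (the helper name and the WSL-style path are illustrative):

```shell
# Print the first three bytes in hex; "ef bb bf" at the start means a UTF-8 BOM,
# "ff fe" or "fe ff" means UTF-16 — OpenSSH rejects all of these silently
check_bom() { head -c 3 "$1" | od -An -tx1; }
```

Usage would look like `check_bom /mnt/c/ProgramData/ssh/administrators_authorized_keys`; if a BOM shows up, rewrite the file with the BOM-free `WriteAllText` approach from step 2.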
 ## See Also
-- [[linux-server-hardening-checklist]]
-- [[managing-linux-services-systemd-ansible]]
+- [linux-server-hardening-checklist](../../02-selfhosting/security/linux-server-hardening-checklist.md)
+- [managing-linux-services-systemd-ansible](../process-management/managing-linux-services-systemd-ansible.md)


@@ -168,5 +168,5 @@ Flatpak is what I prefer — better sandboxing story, Flathub has most things yo
 ## See Also
-- [[linux-distro-guide-beginners]]
-- [[linux-server-hardening-checklist]]
+- [linux-distro-guide-beginners](../distro-specific/linux-distro-guide-beginners.md)
+- [linux-server-hardening-checklist](../../02-selfhosting/security/linux-server-hardening-checklist.md)


@@ -146,5 +146,5 @@ ansible-playbook -i inventory.ini manage-services.yml
 ## See Also
-- [[wsl2-instance-migration-fedora43]]
-- [[tuning-netdata-web-log-alerts]]
+- [wsl2-instance-migration-fedora43](../distro-specific/wsl2-instance-migration-fedora43.md)
+- [tuning-netdata-web-log-alerts](../../02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md)


@@ -204,5 +204,5 @@ Roles keep things organized and reusable across projects.
 ## See Also
-- [[managing-linux-services-systemd-ansible]]
-- [[linux-server-hardening-checklist]]
+- [managing-linux-services-systemd-ansible](../process-management/managing-linux-services-systemd-ansible.md)
+- [linux-server-hardening-checklist](../../02-selfhosting/security/linux-server-hardening-checklist.md)


@@ -211,5 +211,5 @@ retry 3 10 curl -f https://example.com/health
 ## See Also
-- [[ansible-getting-started]]
-- [[managing-linux-services-systemd-ansible]]
+- [ansible-getting-started](ansible-getting-started.md)
+- [managing-linux-services-systemd-ansible](../process-management/managing-linux-services-systemd-ansible.md)


@@ -0,0 +1,113 @@
---
title: "mdadm — Rebuilding a RAID Array After Reinstall"
domain: linux
category: storage
tags: [mdadm, raid, linux, storage, recovery, homelab]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# mdadm — Rebuilding a RAID Array After Reinstall
If you reinstall the OS on a machine that has an existing mdadm RAID array, the array metadata is still on the disks — you just need to reassemble it. The data isn't gone unless you've overwritten the member disks.
## The Short Answer
```bash
# Scan for existing arrays
sudo mdadm --assemble --scan
# Check what was found
cat /proc/mdstat
```
If that works, your array is back. If not, you'll need to manually identify the member disks and reassemble.
## Step-by-Step Recovery
### 1. Identify the RAID member disks
```bash
# Show mdadm superblock info on each disk/partition
sudo mdadm --examine /dev/sda1
sudo mdadm --examine /dev/sdb1
# Or scan all devices at once
sudo mdadm --examine --scan
```
Look for matching `UUID` fields — disks with the same array UUID belong to the same array.
### 2. Reassemble the array
```bash
# Assemble from specific devices
sudo mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1
# Or let mdadm figure it out from superblocks
sudo mdadm --assemble --scan
```
### 3. Verify the array state
```bash
cat /proc/mdstat
sudo mdadm --detail /dev/md0
```
You want to see `State : active` (or `active, degraded` if a disk is missing). If degraded, the array is still usable but should be rebuilt.
### 4. Update mdadm.conf so it persists across reboots
```bash
# Generate the config (reading superblocks needs root; Debian/Ubuntu uses /etc/mdadm/mdadm.conf)
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
# Fedora/RHEL — rebuild initramfs so the array is found at boot
sudo dracut --force
# Debian/Ubuntu — update initramfs
sudo update-initramfs -u
```
### 5. Mount the filesystem
```bash
# Check the filesystem
sudo fsck /dev/md0
# Mount
sudo mount /dev/md0 /mnt/raid
# Add to fstab for auto-mount
echo '/dev/md0 /mnt/raid ext4 defaults 0 2' | sudo tee -a /etc/fstab
```
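One caveat with the fstab line above: md device names are not guaranteed stable across reboots, and a reassembled array often comes up as `/dev/md127`. Mounting by filesystem UUID sidesteps this; a sketch with a placeholder UUID (get the real one from `sudo blkid /dev/md0`):

```
# /etc/fstab — mount by filesystem UUID instead of md device name
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /mnt/raid ext4 defaults 0 2
```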
## Rebuilding a Degraded Array
If a disk failed or was replaced:
```bash
# Add the new disk to the existing array
sudo mdadm --manage /dev/md0 --add /dev/sdc1
# Watch the rebuild progress
watch cat /proc/mdstat
```
Rebuild time depends on array size and disk speed. The array is usable during rebuild but with degraded performance.
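To grab just the progress figure (for logging or a cron-driven notification), a hedged one-liner; the `recovery =` wording is the usual `/proc/mdstat` format:

```bash
# Print only the rebuild/resync percentage; prints nothing when the array is idle.
grep -oE '(recovery|resync|check) = *[0-9.]+%' /proc/mdstat 2>/dev/null || true
```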
## Gotchas & Notes
- **Don't `--create` when you mean `--assemble`.** `--create` initializes a new array and will overwrite existing superblocks. `--assemble` brings an existing array back online.
- **Superblock versions matter.** Modern mdadm uses 1.2 superblocks by default. If the array was created with an older version, specify `--metadata=0.90` during assembly.
- **RAID is not a backup.** mdadm protects against disk failure, not against accidental deletion, ransomware, or filesystem corruption. Pair it with rsync or Restic for actual backups.
- **Check SMART status on all member disks** after a reinstall. If you're reassembling because a disk failed, make sure the remaining disks are healthy.
- **Arrays may come back as `/dev/md127`.** If `/etc/mdadm.conf` has no entry for the array (or the initramfs wasn't rebuilt), auto-assembly uses a fallback name, which breaks an fstab entry that hard-codes `/dev/md0`. Mounting by filesystem UUID (see `blkid`) sidesteps this.
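The SMART check can be scripted per disk; a minimal sketch assuming `smartmontools` is installed (device names are examples, adjust to your disks):

```bash
# Quick pass/fail SMART check on each member disk (requires smartmontools).
for d in /dev/sda /dev/sdb; do
  [ -e "$d" ] || continue
  sudo smartctl -H "$d" 2>/dev/null | grep -iE 'overall-health|result' || true
done
```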
Reference: [mdadm — How to rebuild RAID array after fresh install (Unix & Linux Stack Exchange)](https://unix.stackexchange.com/questions/593836/mdadm-how-to-rebuild-raid-array-after-fresh-install)
## See Also
- [snapraid-mergerfs-setup](snapraid-mergerfs-setup.md)
- [rsync-backup-patterns](../../02-selfhosting/storage-backup/rsync-backup-patterns.md)


@@ -0,0 +1,86 @@
---
title: "SnapRAID & MergerFS Storage Setup"
domain: linux
category: storage
tags: [snapraid, mergerfs, storage, parity, raid, majorraid]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# SnapRAID & MergerFS Storage Setup
## Problem
Managing a collection of mismatched hard drives as a single pool while maintaining data redundancy (parity) without the overhead or risk of a traditional RAID 5/6 array.
## Solution
A combination of **MergerFS** for pooling and **SnapRAID** for parity. This is ideal for "mostly static" media storage (like MajorRAID) where files aren't changing every second.
### 1. Concepts
- **MergerFS:** A FUSE-based union filesystem. It takes multiple drives/folders and presents them as a single mount point. It does NOT provide redundancy.
- **SnapRAID:** A backup/parity tool for disk arrays. It creates parity information on a dedicated drive. It is NOT real-time (you must run `snapraid sync`).
### 2. Implementation Strategy
1. **Clean the Pool:** Use `rmlint` to clear duplicates and reclaim space.
2. **Identify the Parity Drive:** Choose your largest drive (or one equal to the largest data drive) to hold the parity information.
3. **Configure MergerFS:** Pool the data drives into a single mount point.
4. **Configure SnapRAID:** Point SnapRAID to the data drives and the parity drive.
### 3. MergerFS Config (/etc/fstab)
On majorhome, the pool mounts three ext4 drives to `/majorRAID`:
```fstab
/mnt/disk1:/mnt/disk2:/mnt/disk3 /majorRAID fuse.mergerfs defaults,allow_other,cache.files=off,use_ino,category.create=mfs,minfreespace=20G,fsname=mergerfsPool 0 0
```
Adjust the source paths and mount point to match your setup. Each `/mnt/diskN` is an individual ext4 drive mounted separately — MergerFS unions them into the single `/majorRAID` path.
### 4. SnapRAID Config (/etc/snapraid.conf)
> **Note:** SnapRAID is not yet active on majorhome — a 12TB parity drive purchase is deferred. The config below is the planned setup.
```conf
# Parity file location
parity /mnt/parity/snapraid.parity
# Content files (array metadata and checksums)
content /var/snapraid/snapraid.content
content /mnt/disk1/.snapraid.content
content /mnt/disk2/.snapraid.content
content /mnt/disk3/.snapraid.content
# Data drives
data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/
# Exclusions
exclude /lost+found/
exclude /tmp/
exclude .DS_Store
```
---
## Maintenance
### SnapRAID Sync
Run this daily (via cron) or after adding large amounts of data:
```bash
snapraid sync
```
### SnapRAID Scrub
Run this weekly to check for bitrot:
```bash
snapraid scrub
```
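Both jobs can be wired into cron. A sketch under the assumption that logging to `/var/log/snapraid.log` is acceptable; the run times and scrub percentages are arbitrary starting points:

```bash
# /etc/cron.d/snapraid -- nightly sync, weekly scrub of 10% of blocks older than 30 days
0 3 * * *  root  /usr/bin/snapraid sync  >> /var/log/snapraid.log 2>&1
0 5 * * 0  root  /usr/bin/snapraid scrub -p 10 -o 30 >> /var/log/snapraid.log 2>&1
```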
---


@@ -0,0 +1,38 @@
---
title: "Network Overview"
domain: selfhosting
category: dns-networking
tags: [tailscale, networking, infrastructure, dns, vpn]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# 🌐 Network Overview
The **MajorsHouse** infrastructure is connected via a private **Tailscale** mesh network. This allows secure, peer-to-peer communication between devices across different geographic locations (US and UK) without exposing services to the public internet.
## 🏛️ Infrastructure Summary
- **Address Space:** 100.x.x.x (Tailscale CGNAT)
- **Management:** Centralized via **Ansible** (`MajorAnsible` repo)
- **Host Groupings:** Functional (web, mail, homelab, bots), OS (Fedora, Ubuntu), and Location (US, UK).
## 🌍 Geographic Nodes
| Host | Location | IP | OS |
|---|---|---|---|
| `dcaprod` | 🇺🇸 US | 100.104.11.146 | Ubuntu 24.04 |
| `majortoot` | 🇺🇸 US | 100.110.197.17 | Ubuntu 24.04 |
| `majorhome` | 🇺🇸 US | 100.120.209.106 | Fedora 43 |
| `teelia` | 🇬🇧 UK | 100.120.32.69 | Ubuntu 24.04 |
## 🔗 Tailscale Setup
Tailscale is configured as a persistent service on all nodes. Key features used include:
- **Tailscale SSH:** Enabled for secure management via Ansible.
- **MagicDNS:** Used for internal hostname resolution (e.g., `majorlab.<tailnet-name>.ts.net`).
- **ACLs:** Managed via the Tailscale admin console to restrict cross-group communication where necessary.
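A quick reachability sweep over the mesh can confirm the nodes above are up; a hedged sketch using `tailscale ping` (hostnames from the table, ping count arbitrary):

```bash
# Ping each node once over the tailnet; report which are unreachable.
for h in dcaprod majortoot majorhome teelia; do
  if tailscale ping -c 1 "$h" >/dev/null 2>&1; then
    echo "$h ok"
  else
    echo "$h unreachable"
  fi
done
```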
---
*Last updated: 2026-03-04*


@@ -140,6 +140,6 @@ Now any device on your home LAN is reachable from anywhere on the tailnet, even
## See Also
- [self-hosting-starter-guide](../docker/self-hosting-starter-guide.md)
- [linux-server-hardening-checklist](../security/linux-server-hardening-checklist.md)
- [setting-up-caddy-reverse-proxy](../reverse-proxy/setting-up-caddy-reverse-proxy.md)


@@ -164,5 +164,5 @@ Don't jump straight to the nuclear option. Only use `-v` if you want a completel
## See Also
- [docker-vs-vms-homelab](docker-vs-vms-homelab.md)
- [tuning-netdata-web-log-alerts](../monitoring/tuning-netdata-web-log-alerts.md)


@@ -0,0 +1,157 @@
---
title: "Docker Healthchecks"
domain: selfhosting
category: docker
tags: [docker, healthcheck, monitoring, uptime-kuma, compose]
status: published
created: 2026-03-23
updated: 2026-03-23
---
# Docker Healthchecks
A Docker healthcheck tells the daemon (and any monitoring tool) whether a container is actually working — not just running. Without one, a container shows as `Up` even if the app inside is crashed, deadlocked, or waiting on a dependency.
## Why It Matters
Tools like Uptime Kuma report containers without healthchecks as:
> Container has not reported health and is currently running. As it is running, it is considered UP. Consider adding a health check for better service visibility.
A healthcheck upgrades that to a real `(healthy)` or `(unhealthy)` status, making monitoring meaningful.
## Basic Syntax (docker-compose)
```yaml
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
| Field | Description |
|---|---|
| `test` | Command to run. Exit 0 = healthy, 1 = unhealthy (exit 2 is reserved by Docker). |
| `interval` | How often to run the check. |
| `timeout` | How long to wait before marking as failed. |
| `retries` | Failures before marking `unhealthy`. |
| `start_period` | Grace period on startup before failures count. |
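These fields interact: after `start_period`, a dead container must fail `retries` consecutive checks spaced `interval` apart before flipping to `unhealthy`. With the values above, the worst case is roughly:

```bash
# Rough worst-case seconds from container start to "unhealthy" status:
# start_period + retries * interval (a simplified model of the field semantics)
echo $(( 30 + 3 * 30 ))   # prints 120
```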
## Common Patterns
### HTTP service (wget — available in Alpine)
```yaml
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:2368/"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
### HTTP service (curl)
```yaml
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
### MySQL / MariaDB
```yaml
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-psecret"]
interval: 10s
timeout: 5s
retries: 3
start_period: 20s
```
### PostgreSQL
```yaml
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
timeout: 5s
retries: 5
```
### Redis
```yaml
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 3
```
### TCP port check (no curl/wget available)
```yaml
healthcheck:
test: ["CMD-SHELL", "nc -z localhost 8080 || exit 1"]
interval: 30s
timeout: 5s
retries: 3
```
## Using Healthchecks with `depends_on`
Healthchecks enable proper startup ordering. Instead of a fixed sleep, a dependent container waits until its dependency is actually ready:
```yaml
services:
app:
depends_on:
db:
condition: service_healthy
db:
image: mysql:8.0
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
interval: 10s
timeout: 5s
retries: 3
start_period: 20s
```
This prevents the classic race condition where the app starts before the database is ready to accept connections.
## Checking Health Status
```bash
# See health status in container list
docker ps
# Get detailed health info including last check output
docker inspect --format='{{json .State.Health}}' <container> | jq
```
## Ghost Example
Ghost (Alpine-based) uses `wget` rather than `curl`:
```yaml
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:2368/ghost/api/v4/admin/site/"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
## Gotchas & Notes
- **Alpine images** don't have `curl` by default — use `wget` or install curl in the image.
- **`start_period`** is critical for slow-starting apps (databases, JVM services). Failures during this window don't count toward `retries`.
- **`CMD` vs `CMD-SHELL`** — use `CMD` for direct exec (no shell needed), `CMD-SHELL` when you need pipes, `&&`, or shell builtins.
- **Uptime Kuma** will pick up Docker healthcheck status automatically when monitoring via the Docker socket — no extra config needed.
## See Also
- [debugging-broken-docker-containers](debugging-broken-docker-containers.md)
- [netdata-docker-health-alarm-tuning](../monitoring/netdata-docker-health-alarm-tuning.md)


@@ -91,5 +91,5 @@ The two coexist fine on the same host. Docker handles the service layer, KVM han
## See Also
- [managing-linux-services-systemd-ansible](../../01-linux/process-management/managing-linux-services-systemd-ansible.md)
- [tuning-netdata-web-log-alerts](../monitoring/tuning-netdata-web-log-alerts.md)


@@ -110,6 +110,6 @@ Tailscale is the easiest and safest starting point for personal use.
## See Also
- [docker-vs-vms-homelab](docker-vs-vms-homelab.md)
- [debugging-broken-docker-containers](debugging-broken-docker-containers.md)
- [linux-server-hardening-checklist](../security/linux-server-hardening-checklist.md)


@@ -23,7 +23,14 @@ Guides for running your own services at home, including Docker, reverse proxies,
## Monitoring
- [Tuning Netdata Web Log Alerts](monitoring/tuning-netdata-web-log-alerts.md)
- [Tuning Netdata Docker Health Alarms](monitoring/netdata-docker-health-alarm-tuning.md)
- [Deploying Netdata to a New Server](monitoring/netdata-new-server-setup.md)
## Security
- [Linux Server Hardening Checklist](security/linux-server-hardening-checklist.md)
- [Standardizing unattended-upgrades with Ansible](security/ansible-unattended-upgrades-fleet.md)
- [Fail2ban Custom Jail: Apache 404 Scanner Detection](security/fail2ban-apache-404-scanner-jail.md)
- [Fail2ban Custom Jail: WordPress Login Brute Force](security/fail2ban-wordpress-login-jail.md)
- [SELinux: Fixing Fail2ban grep execmem Denial](security/selinux-fail2ban-execmem-fix.md)
- [UFW Firewall Management](security/ufw-firewall-management.md)


@@ -0,0 +1,157 @@
---
title: "Tuning Netdata Docker Health Alarms to Prevent Update Flapping"
domain: selfhosting
category: monitoring
tags: [netdata, docker, nextcloud, alarms, health, monitoring]
status: published
created: 2026-03-18
updated: 2026-03-28
---
# Tuning Netdata Docker Health Alarms to Prevent Update Flapping
Netdata's default `docker_container_unhealthy` alarm fires on a 10-second average with no delay. When Nextcloud AIO (or any stack with a watchtower/auto-update setup) does its nightly update cycle, containers restart in sequence and briefly show as unhealthy — generating a flood of false alerts.
## The Default Alarm
```ini
template: docker_container_unhealthy
on: docker.container_health_status
every: 10s
lookup: average -10s of unhealthy
warn: $this > 0
```
A single container being unhealthy for 10 seconds triggers it. No grace period, no delay.
## The Fix
Create a custom override at `/etc/netdata/health.d/docker.conf` (maps to the Netdata config volume if running in Docker). This file takes precedence over the stock config in `/usr/lib/netdata/conf.d/health.d/docker.conf`.
### General Container Alarm
This alarm covers all containers **except** `nextcloud-aio-nextcloud`, which gets its own dedicated alarm (see below).
```ini
# Custom override — reduces flapping during nightly container updates.
# General container unhealthy alarm — all containers except nextcloud-aio-nextcloud
template: docker_container_unhealthy
on: docker.container_health_status
class: Errors
type: Containers
component: Docker
units: status
every: 30s
lookup: average -5m of unhealthy
chart labels: container_name=!nextcloud-aio-nextcloud *
warn: $this > 0
delay: up 3m down 5m multiplier 1.5 max 30m
summary: Docker container ${label:container_name} health
info: ${label:container_name} docker container health status is unhealthy
to: sysadmin
```
| Setting | Default | Tuned | Effect |
|---|---|---|---|
| `every` | 10s | 30s | Check less frequently |
| `lookup` | average -10s | average -5m | Smooths transient unhealthy samples over 5 minutes |
| `delay: up 3m` | none | 3m | Won't fire until unhealthy condition persists for 3 continuous minutes |
| `delay: down 5m` | none | 5m (max 30m) | Grace period after recovery before clearing |
### Dedicated Nextcloud AIO Alarm
Added 2026-03-23, updated 2026-03-28. The `nextcloud-aio-nextcloud` container needs a more lenient window than other containers. Its healthcheck (`/healthcheck.sh`) verifies PostgreSQL connectivity (port 5432) and PHP-FPM (port 9000). PHP-FPM takes ~90 seconds to warm up after a normal restart — but during nightly AIO update cycles, the full startup (occ upgrade, app updates, migrations) can take 5+ minutes. On 2026-03-27, a startup hung and left the container unhealthy for 20 hours until the next nightly cycle replaced it.
The dedicated alarm uses a 10-minute lookup window and 10-minute delay to absorb normal startup, while still catching sustained failures:
```ini
# Dedicated alarm for nextcloud-aio-nextcloud — lenient window to absorb nightly update cycle
# PHP-FPM can take 5+ minutes to warm up; only alert on sustained failure
template: docker_nextcloud_unhealthy
on: docker.container_health_status
class: Errors
type: Containers
component: Docker
units: status
every: 30s
lookup: average -10m of unhealthy
chart labels: container_name=nextcloud-aio-nextcloud
warn: $this > 0
delay: up 10m down 5m multiplier 1.5 max 30m
summary: Nextcloud container health sustained
info: nextcloud-aio-nextcloud has been unhealthy for a sustained period — not a transient update blip
to: sysadmin
```
## Watchdog Cron: Auto-Restart on Sustained Unhealthy
If the Nextcloud container stays unhealthy for more than 1 hour (well past any normal startup window), a cron watchdog on majorlab auto-restarts it and logs the event. This was added 2026-03-28 after an incident where the container sat unhealthy for 20 hours until the next nightly backup cycle replaced it.
**File:** `/etc/cron.d/nextcloud-health-watchdog`
```bash
# Restart nextcloud-aio-nextcloud if unhealthy for >1 hour
*/15 * * * * root docker inspect --format={{.State.Health.Status}} nextcloud-aio-nextcloud 2>/dev/null | grep -q unhealthy && [ "$(docker inspect --format={{.State.StartedAt}} nextcloud-aio-nextcloud | xargs -I{} date -d {} +\%s)" -lt "$(date -d "1 hour ago" +\%s)" ] && docker restart nextcloud-aio-nextcloud && logger -t nextcloud-watchdog "Restarted unhealthy nextcloud-aio-nextcloud"
```
- Runs every 15 minutes as root
- Only restarts if the container has been running for >1 hour (avoids interfering with normal startup)
- Logs to syslog as `nextcloud-watchdog` — check with `journalctl -t nextcloud-watchdog`
- Netdata will still fire the `docker_nextcloud_unhealthy` alert during the unhealthy window, but the outage is capped at ~1 hour instead of persisting until the next nightly cycle
## Also: Suppress `docker_container_down` for Normally-Exiting Containers
Nextcloud AIO runs `borgbackup` (scheduled backups) and `watchtower` (auto-updates) as containers that exit with code 0 after completing their work. The stock `docker_container_down` alarm fires on any exited container, generating false alerts after every nightly cycle.
Add a second override to the same file using `chart labels` to exclude them:
```ini
# Suppress docker_container_down for Nextcloud AIO containers that exit normally
# (borgbackup runs on schedule then exits; watchtower does updates then exits)
template: docker_container_down
on: docker.container_running_state
class: Errors
type: Containers
component: Docker
units: status
every: 30s
lookup: average -5m of down
chart labels: container_name=!nextcloud-aio-borgbackup !nextcloud-aio-watchtower *
warn: $this > 0
delay: up 3m down 5m multiplier 1.5 max 30m
summary: Docker container ${label:container_name} down
info: ${label:container_name} docker container is down
to: sysadmin
```
The `chart labels` line uses Netdata's simple pattern syntax — `!` prefix excludes a container, `*` matches everything else. All other exited containers still alert normally.
## Applying the Config
```bash
# If Netdata runs in Docker, write to the config volume
sudo tee /var/lib/docker/volumes/netdata_netdataconfig/_data/health.d/docker.conf > /dev/null << 'EOF'
# paste config here
EOF
# Reload health alarms without restarting the container
sudo docker exec netdata netdatacli reload-health
```
No container restart needed — `reload-health` picks up the new config immediately.
## Verify
In the Netdata UI, navigate to **Alerts → Manage Alerts** and search for `docker_container_unhealthy`. The lookup and delay values should reflect the new config.
## Notes
- Both `docker_container_unhealthy` and `docker_container_down` are overridden in this config. Any container not explicitly excluded in the `chart labels` filter will still alert normally.
- If you want per-container silencing instead of a blanket delay, use the `host labels` or `chart labels` filter to scope the alarm to specific containers.
- Config volume path on majorlab: `/var/lib/docker/volumes/netdata_netdataconfig/_data/`
## See Also
- [Tuning Netdata Web Log Alerts](tuning-netdata-web-log-alerts.md) — similar tuning for web_log redirect alerts


@@ -0,0 +1,162 @@
---
title: "Netdata n8n Enriched Alert Emails"
domain: selfhosting
category: monitoring
tags: [netdata, n8n, alerts, email, monitoring, automation]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Netdata → n8n Enriched Alert Emails
**Status:** Live across all MajorsHouse fleet servers as of 2026-03-21
Replaces Netdata's plain-text alert emails with rich HTML emails that include a plain-English explanation, a suggested remediation command, and a direct link to the relevant MajorWiki article.
---
## How It Works
```
Netdata alarm fires
→ custom_sender() in health_alarm_notify.conf
→ POST JSON payload to n8n webhook
→ Code node enriches with suggestion + wiki link
→ Send Email node sends HTML email via SMTP
→ Respond node returns 200 OK
```
---
## n8n Workflow
**Name:** Netdata Enriched Alerts
**URL:** https://n8n.majorshouse.com
**Webhook endpoint:** `POST https://n8n.majorshouse.com/webhook/netdata-alert`
**Workflow ID:** `a1b2c3d4-aaaa-bbbb-cccc-000000000001`
### Nodes
1. **Netdata Webhook** — receives POST from Netdata's `custom_sender()`
2. **Enrich Alert** — Code node; matches alarm/chart/family to enrichment table, builds HTML email body in `$json.emailBody`
3. **Send Enriched Email** — sends via SMTP port 465 (SMTP account 2), from `netdata@majorshouse.com` to `marcus@majorshouse.com`
4. **Respond OK** — returns `ok` with HTTP 200 to Netdata
### Enrichment Keys
The Code node matches on `alarm`, `chart`, or `family` field (case-insensitive substring):
| Key | Title | Wiki Article |
|-----|-------|-------------|
| `disk_space` | Disk Space Alert | snapraid-mergerfs-setup |
| `ram` | Memory Alert | managing-linux-services-systemd-ansible |
| `cpu` | CPU Alert | managing-linux-services-systemd-ansible |
| `load` | Load Average Alert | managing-linux-services-systemd-ansible |
| `net` | Network Alert | tailscale-homelab-remote-access |
| `docker` | Docker Container Alert | debugging-broken-docker-containers |
| `web_log` | Web Log Alert | tuning-netdata-web-log-alerts |
| `health` | Docker Health Alarm | netdata-docker-health-alarm-tuning |
| `mdstat` | RAID Array Alert | mdadm-usb-hub-disconnect-recovery |
| `systemd` | Systemd Service Alert | docker-caddy-selinux-post-reboot-recovery |
| _(no match)_ | Server Alert | netdata-new-server-setup |
---
## Netdata Configuration
### Config File Locations
| Server | Path |
|--------|------|
| majorhome, majormail, majordiscord, tttpod, teelia | `/etc/netdata/health_alarm_notify.conf` |
| majorlinux, majortoot, dca | `/usr/lib/netdata/conf.d/health_alarm_notify.conf` |
### Required Settings
```bash
DEFAULT_RECIPIENT_CUSTOM="n8n"
role_recipients_custom[sysadmin]="${DEFAULT_RECIPIENT_CUSTOM}"
```
### custom_sender() Function
```bash
custom_sender() {
local to="${1}"
local payload
payload=$(jq -n \
--arg hostname "${host}" \
--arg alarm "${name}" \
--arg chart "${chart}" \
--arg family "${family}" \
--arg status "${status}" \
--arg old_status "${old_status}" \
--arg value "${value_string}" \
--arg units "${units}" \
--arg info "${info}" \
--arg alert_url "${goto_url}" \
--arg severity "${severity}" \
--arg raised_for "${raised_for}" \
--arg total_warnings "${total_warnings}" \
--arg total_critical "${total_critical}" \
'{hostname:$hostname,alarm:$alarm,chart:$chart,family:$family,status:$status,old_status:$old_status,value:$value,units:$units,info:$info,alert_url:$alert_url,severity:$severity,raised_for:$raised_for,total_warnings:$total_warnings,total_critical:$total_critical}')
local httpcode
httpcode=$(docurl -s -o /dev/null -w "%{http_code}" \
-X POST \
-H "Content-Type: application/json" \
-d "${payload}" \
"https://n8n.majorshouse.com/webhook/netdata-alert")
if [ "${httpcode}" = "200" ]; then
info "sent enriched notification to n8n for ${status} of ${host}.${name}"
sent=$((sent + 1))
else
error "failed to send notification to n8n, HTTP code: ${httpcode}"
fi
}
```
!!! note "jq required"
The `custom_sender()` function requires `jq` to be installed. Verify with `which jq` on each server.
---
## Deploying to a New Server
```bash
# 1. Find the config file
find /etc/netdata /usr/lib/netdata -name health_alarm_notify.conf 2>/dev/null
# 2. Edit it — add the two lines and the custom_sender() function above
# 3. Test connectivity from the server
curl -s -o /dev/null -w "%{http_code}" \
-X POST https://n8n.majorshouse.com/webhook/netdata-alert \
-H "Content-Type: application/json" \
-d '{"hostname":"test","alarm":"disk_space._","status":"WARNING"}'
# Expected: 200
# 4. Restart Netdata
systemctl restart netdata
# 5. Send a test alarm
/usr/libexec/netdata/plugins.d/alarm-notify.sh test custom
```
---
## Troubleshooting
**Emails not arriving — check n8n execution log:**
Go to https://n8n.majorshouse.com → open "Netdata Enriched Alerts" → Executions tab. Look for `error` status entries.
**Email body empty:**
The Send Email node's HTML field must be `={{ $json.emailBody }}`. Shell variable expansion can silently strip `$json` if the workflow is patched via inline SSH commands — always use a Python script file.
**`000` curl response from a server:**
Usually a timeout, not a DNS or connection failure. Re-test with `--max-time 30`.
**`custom_sender()` syntax error in Netdata logs:**
Bash heredocs don't work inside sourced config files. Use `jq -n --arg ...` as shown above — no heredocs.
**n8n `N8N_TRUST_PROXY` must be set:**
Without `N8N_TRUST_PROXY=true` in the Docker environment, Caddy's `X-Forwarded-For` header causes n8n's rate limiter to abort requests before parsing the body. Set in `/opt/n8n/compose.yml`.


@@ -0,0 +1,161 @@
---
title: "Deploying Netdata to a New Server"
domain: selfhosting
category: monitoring
tags: [netdata, monitoring, email, notifications, netdata-cloud, ubuntu, debian, n8n]
status: published
created: 2026-03-18
updated: 2026-03-22
---
# Deploying Netdata to a New Server
This covers the full Netdata setup for a new server in the fleet: install, email notification config, n8n webhook integration, and Netdata Cloud claim. Applies to Ubuntu/Debian servers.
## 1. Install Prerequisites
Install `jq` before anything else. It is required by the `custom_sender()` function in `health_alarm_notify.conf` to build the JSON payload sent to the n8n webhook. **If `jq` is missing, the webhook will fire with an empty body and n8n alert emails will have no information in them.**
```bash
apt install -y jq
```
Verify:
```bash
jq --version
```
## 2. Install Netdata
Use the official kickstart script:
```bash
wget -O /tmp/netdata-install.sh https://get.netdata.cloud/kickstart.sh
sh /tmp/netdata-install.sh --non-interactive --stable-channel --disable-telemetry
```
Verify it's running:
```bash
systemctl is-active netdata
curl -s http://localhost:19999/api/v1/info | python3 -c "import sys,json; d=json.load(sys.stdin); print('Netdata', d['version'])"
```
## 3. Configure Email Notifications
Copy the default config and set the three required values:
```bash
cp /usr/lib/netdata/conf.d/health_alarm_notify.conf /etc/netdata/health_alarm_notify.conf
```
Edit `/etc/netdata/health_alarm_notify.conf`:
```ini
EMAIL_SENDER="netdata@majorshouse.com"
SEND_EMAIL="YES"
DEFAULT_RECIPIENT_EMAIL="marcus@majorshouse.com"
```
Or apply with `sed` in one shot:
```bash
sed -i 's/^#\?EMAIL_SENDER=.*/EMAIL_SENDER="netdata@majorshouse.com"/' /etc/netdata/health_alarm_notify.conf
sed -i 's/^#\?SEND_EMAIL=.*/SEND_EMAIL="YES"/' /etc/netdata/health_alarm_notify.conf
sed -i 's/^#\?DEFAULT_RECIPIENT_EMAIL=.*/DEFAULT_RECIPIENT_EMAIL="marcus@majorshouse.com"/' /etc/netdata/health_alarm_notify.conf
```
Restart and test:
```bash
systemctl restart netdata
/usr/libexec/netdata/plugins.d/alarm-notify.sh test 2>&1 | grep -E '(OK|FAILED|email)'
```
You should see three `# OK` lines (WARNING → CRITICAL → CLEAR test cycle) and confirmation that email was sent to `marcus@majorshouse.com`.
> [!note] Delivery via local Postfix
> Email is relayed through the server's local Postfix instance. Ensure Postfix is installed and `/usr/sbin/sendmail` resolves.
## 4. Configure n8n Webhook Notifications
Copy the `health_alarm_notify.conf` from an existing server (e.g. majormail) which contains the `custom_sender()` function. This sends enriched JSON payloads to the n8n webhook at `https://n8n.majorshouse.com/webhook/netdata-alert`.
> [!warning] jq required
> The `custom_sender()` function uses `jq` to build the JSON payload. If `jq` is not installed, `payload` will be empty, curl will send `Content-Length: 0`, and n8n will produce alert emails with `Host: unknown`, blank alert/value fields, and `Status: UNKNOWN`. Always install `jq` first (Step 1).
After deploying the config, run a test to confirm the webhook fires correctly:
```bash
systemctl restart netdata
/usr/libexec/netdata/plugins.d/alarm-notify.sh test 2>&1 | grep -E '(custom|n8n|OK|FAILED)'
```
Verify in n8n that the latest execution shows a non-empty body with `hostname`, `alarm`, and `status` fields populated.
## 5. Claim to Netdata Cloud
Get the claim command from **Netdata Cloud → Space Settings → Nodes → Add Nodes**. It will look like:
```bash
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh
sh /tmp/netdata-kickstart.sh --stable-channel \
--claim-token <token> \
--claim-rooms <room-id> \
--claim-url https://app.netdata.cloud
```
Verify the claim was accepted:
```bash
cat /var/lib/netdata/cloud.d/claimed_id
```
A UUID will be present if claimed successfully. The node should appear in Netdata Cloud within ~60 seconds.
## 6. Verify Alerts
Check that no unexpected alerts are active after setup:
```bash
curl -s 'http://localhost:19999/api/v1/alarms?active' | python3 -c "
import sys, json
d = json.load(sys.stdin)
active = [v for v in d.get('alarms', {}).values() if v.get('status') not in ('CLEAR', 'UNINITIALIZED', 'UNDEFINED')]
print(f'{len(active)} active alert(s)')
for v in active:
print(f' [{v[\"status\"]}] {v[\"name\"]} on {v[\"chart\"]}')
"
```
## Fleet-wide Alert Check
To audit all servers at once (requires Tailscale SSH access):
```bash
for host in majorlab majorhome majormail majordiscord majortoot majorlinux tttpod dca teelia; do
echo "=== $host ==="
ssh root@$host "curl -s 'http://localhost:19999/api/v1/alarms?active' | python3 -c \
\"import sys,json; d=json.load(sys.stdin); active=[v for v in d.get('alarms',{}).values() if v.get('status') not in ('CLEAR','UNINITIALIZED','UNDEFINED')]; print(str(len(active))+' active')\""
done
```
## Fleet-wide jq Audit
To check that all servers with `custom_sender` have `jq` installed:
```bash
for host in majorlab majorhome majormail majordiscord majortoot majorlinux tttpod dca teelia; do
echo -n "=== $host: "
ssh -o ConnectTimeout=5 root@$host \
'has_cs=$(grep -l "custom_sender\|n8n.majorshouse.com" /etc/netdata/health_alarm_notify.conf 2>/dev/null | wc -l); has_jq=$(command -v jq >/dev/null 2>&1 && echo yes || echo NO); echo "custom_sender=$has_cs jq=$has_jq"'
done
```
Any server showing `custom_sender=1 jq=NO` needs `apt install -y jq` immediately.
## Related
- [Tuning Netdata Web Log Alerts](tuning-netdata-web-log-alerts.md)
- [Tuning Netdata Docker Health Alarms](netdata-docker-health-alarm-tuning.md)


@@ -0,0 +1,137 @@
---
title: "Netdata SELinux AVC Denial Monitoring"
domain: selfhosting
category: monitoring
tags: [netdata, selinux, fedora, monitoring, ausearch, charts.d]
status: published
created: 2026-03-27
updated: 2026-03-27
---
# Netdata SELinux AVC Denial Monitoring
A custom `charts.d` plugin that tracks SELinux AVC denials over time via Netdata. Deployed on all Fedora boxes in the fleet where SELinux is Enforcing.
## What It Does
The plugin runs `ausearch -m avc` every 60 seconds and reports the count of AVC denial events from the last 10 minutes. This gives a real-time chart in Netdata Cloud showing SELinux denial spikes — useful for catching misconfigurations after service changes or package updates.
## Where It's Deployed
| Host | OS | SELinux | Chart Installed |
|------|----|---------|-----------------|
| majorhome | Fedora 43 | Enforcing | Yes |
| majorlab | Fedora 43 | Enforcing | Yes |
| majormail | Fedora 43 | Enforcing | Yes |
| majordiscord | Fedora 43 | Enforcing | Yes |
Ubuntu hosts (dca, teelia, tttpod, majortoot, majorlinux) do not run SELinux and do not have this chart.
## Installation
### 1. Create the Chart Plugin
Create `/etc/netdata/charts.d/selinux.chart.sh`:
```bash
cat > /etc/netdata/charts.d/selinux.chart.sh << 'EOF'
# SELinux AVC denial counter for Netdata charts.d
selinux_update_every=60
selinux_priority=90000
selinux_check() {
which ausearch >/dev/null 2>&1 || return 1
return 0
}
selinux_create() {
cat <<CHART
CHART selinux.avc_denials '' 'SELinux AVC Denials (last 10 min)' 'denials' selinux '' line 90000 $selinux_update_every ''
DIMENSION denials '' absolute 1 1
CHART
return 0
}
selinux_update() {
local count
count=$(sudo /usr/bin/ausearch -m avc -if /var/log/audit/audit.log -ts recent 2>/dev/null | grep -c "type=AVC")
echo "BEGIN selinux.avc_denials $1"
echo "SET denials = ${count}"
echo "END"
return 0
}
EOF
```
### 2. Grant Netdata Sudo Access to ausearch
`ausearch` requires root to read the audit log. Add a sudoers entry for the `netdata` user:
```bash
echo 'netdata ALL=(root) NOPASSWD: /usr/bin/ausearch -m avc -if /var/log/audit/audit.log -ts recent' > /etc/sudoers.d/netdata-selinux
chmod 440 /etc/sudoers.d/netdata-selinux
visudo -c
```
The `visudo -c` validates syntax. If it reports errors, fix the file before proceeding — a broken sudoers file can lock out sudo entirely.
### 3. Restart Netdata
```bash
systemctl restart netdata
```
### 4. Verify
Check that the chart is collecting data:
```bash
curl -s 'http://localhost:19999/api/v1/chart?chart=selinux.avc_denials' | python3 -c "
import sys, json
d = json.load(sys.stdin)
print(f'Chart: {d[\"id\"]}')
print(f'Update every: {d[\"update_every\"]}s')
print(f'Type: {d[\"chart_type\"]}')
"
```
If the chart doesn't appear, check that `charts.d` is enabled in `/etc/netdata/netdata.conf` and that the plugin file is readable by the `netdata` user.
## Known Side Effect: pam_systemd Log Noise
Because the `netdata` user calls `sudo ausearch` every 60 seconds, `pam_systemd` logs a warning each time:
```
pam_systemd(sudo:session): Failed to check if /run/user/0/bus exists, ignoring: Permission denied
```
This is cosmetic. The `sudo` command succeeds — `pam_systemd` just can't find a D-Bus user session for the `netdata` service account, which is expected. The message volume scales with the collection interval (1,440/day at 60-second intervals).
**No PAM changes are needed.** Fedora's `system-auth` config already marks `pam_systemd.so` as `-session optional` (the `-` prefix means "don't fail if the module errors"), so the messages are informational log noise, not actual failures.
If the log volume is a concern for log analysis or monitoring, filter it at the rsyslog level:
```ini
# /etc/rsyslog.d/suppress-pam-systemd.conf
:msg, contains, "pam_systemd(sudo:session): Failed to check" stop
```
Or in Netdata's log alert config, exclude the pattern from any log-based alerts.
## Fleet Audit
To verify the chart is deployed and functioning on all Fedora hosts:
```bash
for host in majorhome majorlab majormail majordiscord; do
echo -n "=== $host: "
ssh root@$host "curl -s 'http://localhost:19999/api/v1/chart?chart=selinux.avc_denials' 2>/dev/null | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d[\"id\"], \"every\", str(d[\"update_every\"])+\"s\")' 2>/dev/null || echo 'NOT FOUND'"
done
```
## Related
- [Deploying Netdata to a New Server](netdata-new-server-setup.md)
- [Tuning Netdata Web Log Alerts](tuning-netdata-web-log-alerts.md)
- [Tuning Netdata Docker Health Alarms](netdata-docker-health-alarm-tuning.md)
- [SELinux: Fixing Dovecot Mail Spool Context](../../05-troubleshooting/selinux-dovecot-vmail-context.md)

View File

@@ -85,4 +85,4 @@ curl -s http://localhost:19999/api/v1/alarms?all | grep -A 15 "web_log_1m_redire
## See Also
- [[Netdata service monitoring]]
- Netdata service monitoring

View File

@@ -135,6 +135,6 @@ yourdomain.com {
## See Also
- [[self-hosting-starter-guide]]
- [[linux-server-hardening-checklist]]
- [[debugging-broken-docker-containers]]
- [self-hosting-starter-guide](../docker/self-hosting-starter-guide.md)
- [linux-server-hardening-checklist](../security/linux-server-hardening-checklist.md)
- [debugging-broken-docker-containers](../docker/debugging-broken-docker-containers.md)

View File

@@ -0,0 +1,94 @@
---
title: Standardizing unattended-upgrades Across Ubuntu Fleet with Ansible
domain: selfhosting
category: security
tags:
- ansible
- ubuntu
- apt
- unattended-upgrades
- fleet-management
status: published
created: '2026-03-16'
updated: '2026-03-16'
---
# Standardizing unattended-upgrades Across Ubuntu Fleet with Ansible
When some Ubuntu hosts in a fleet self-update via `unattended-upgrades` and others don't, they drift apart over time — different kernel versions, different reboot states, inconsistent behavior. This article covers how to diagnose the drift and enforce uniform auto-update config across all Ubuntu hosts using Ansible.
## Diagnosing the Problem
If only some Ubuntu hosts are flagging for reboot, check:
```bash
# What triggered the reboot flag?
cat /var/run/reboot-required.pkgs
# Is unattended-upgrades installed and active?
systemctl status unattended-upgrades
cat /etc/apt/apt.conf.d/20auto-upgrades
# When did apt last run?
ls -lt /var/log/apt/history.log*
```
The reboot flag is written to `/var/run/reboot-required` by `update-notifier-common` when packages like the kernel, glibc, or systemd are updated. If some hosts have `unattended-upgrades` running and others don't, the ones that self-updated will flag for reboot while the others lag behind.
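To see which packages are behind the flag, deduplicate the package list — a sketch using sample data (on a real host, read `/var/run/reboot-required.pkgs`; the package names shown are illustrative):

```shell
# Deduplicate the package list that triggered the reboot flag.
# Real host equivalent: sort -u /var/run/reboot-required.pkgs
pkgs='linux-image-6.8.0-55-generic
libc6
linux-image-6.8.0-55-generic'
printf '%s\n' "$pkgs" | sort -u
```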
## The Fix — Ansible Playbook
Add these tasks to your update playbook **before** the apt cache update step:
```yaml
- name: Ensure unattended-upgrades is installed on Ubuntu servers
ansible.builtin.apt:
name:
- unattended-upgrades
- update-notifier-common
state: present
update_cache: true
when: ansible_facts['os_family'] == "Debian"
- name: Enforce uniform auto-update config on Ubuntu servers
ansible.builtin.copy:
dest: /etc/apt/apt.conf.d/20auto-upgrades
content: |
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
owner: root
group: root
mode: '0644'
when: ansible_facts['os_family'] == "Debian"
- name: Ensure unattended-upgrades service is enabled and running
ansible.builtin.systemd:
name: unattended-upgrades
enabled: true
state: started
when: ansible_facts['os_family'] == "Debian"
```
Running this across the `ubuntu` group ensures every host has the same config on every Ansible run — idempotent and safe.
## Rebooting Flagged Hosts
Once identified, reboot specific hosts without touching the rest:
```bash
# Reboot just the flagging hosts
ansible-playbook reboot.yml -l teelia,tttpod
# Run full update on remaining hosts to bring them up to the same kernel
ansible-playbook update.yml -l dca,majorlinux,majortoot
```
## Notes
- `unattended-upgrades` runs daily on its own schedule — hosts that haven't checked yet will lag behind but catch up within 24 hours
- Hosts reporting `ok` (not `changed`) on the config tasks were already correctly configured
- After a kernel update is pulled, only an actual reboot clears the `/var/run/reboot-required` flag — Ansible reporting the flag is informational only
## See Also
- [Ansible Getting Started](../../01-linux/shell-scripting/ansible-getting-started.md)
- [Linux Server Hardening Checklist](linux-server-hardening-checklist.md)

View File

@@ -0,0 +1,127 @@
---
title: "Fail2ban Custom Jail: Apache 404 Scanner Detection"
domain: selfhosting
category: security
tags: [fail2ban, apache, security, scanner, firewall]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Fail2ban Custom Jail: Apache 404 Scanner Detection
## The Problem
Automated vulnerability scanners probe web servers by requesting dozens of common config file paths — `.env`, `env.php`, `next.config.js`, `nuxt.config.ts`, etc. — in rapid succession. These all return **404 Not Found**, which is correct behavior from Apache.
However, the built-in Fail2ban jails (`apache-noscript`, `apache-botsearch`) don't catch these because they parse the **error log**, not the **access log**. If Apache doesn't write a corresponding "File does not exist" entry to the error log for every 404, the scanner slips through undetected.
This also triggers false alerts in monitoring tools like **Netdata**, which sees the success ratio drop (e.g., `web_log_1m_successful` goes CRITICAL at 2.83%) because 404s aren't counted as successful responses.
## The Solution
Create a custom Fail2ban filter that reads the **access log** and matches 404 responses directly.
### Step 1 — Create the filter
Create `/etc/fail2ban/filter.d/apache-404scan.conf`:
```ini
# Fail2Ban filter to catch rapid 404 scanning in Apache access logs
# Targets vulnerability scanners probing for .env, config files, etc.
[Definition]
# Match 404 responses in combined/common access log format
failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 \d+
ignoreregex = ^<HOST> -.*(robots\.txt|favicon\.ico|apple-touch-icon)
datepattern = %%d/%%b/%%Y:%%H:%%M:%%S %%z
```
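Before wiring the filter into a jail, you can dry-run the pattern with plain `grep`, approximating fail2ban's `<HOST>` token with an IPv4 matcher (the real token also handles IPv6 and hostnames; the log lines below are samples, and `fail2ban-regex` in Step 3 remains the authoritative test):

```shell
# Dry-run the 404 pattern; <HOST> approximated as an IPv4 address.
re='^([0-9]{1,3}\.){3}[0-9]{1,3} -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 [0-9]+'
printf '%s\n' \
  '203.0.113.50 - - [02/Apr/2026:10:00:01 -0400] "GET /.env HTTP/1.1" 404 438' \
  '203.0.113.50 - - [02/Apr/2026:10:00:02 -0400] "GET /index.html HTTP/1.1" 200 1024' \
  | grep -cE "$re"
```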
### Step 2 — Add the jail
Add to `/etc/fail2ban/jail.local`:
```ini
[apache-404scan]
enabled = true
port = http,https
filter = apache-404scan
logpath = /var/log/apache2/access.log
maxretry = 10
findtime = 1m
bantime = 24h
backend = polling
```
**10 hits in 1 minute** is aggressive enough to catch scanners (which fire 30–50+ requests in seconds) while avoiding false positives from a legitimate user hitting a few broken links.
> **Critical: `backend = polling` is required** if your `jail.local` or `jail.d/` sets `backend = systemd` in `[DEFAULT]` (common on Fedora/RHEL). Without it, fail2ban ignores the `logpath` and reads from journald instead — which Apache doesn't write to. The jail will appear active (`fail2ban-client status` shows it running) but `fail2ban-client get apache-404scan logpath` will return "No file is currently monitored" and zero IPs will ever be banned. This fails silently.
### Step 3 — Test the regex
```bash
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf
```
You should see matches. In a real-world test against a server under active scanning, this matched **2831 out of 8901** access log lines.
### Step 4 — Reload Fail2ban
```bash
systemctl restart fail2ban
fail2ban-client status apache-404scan
```
## Why Default Jails Miss This
| Jail | Log Source | What It Matches | Why It Misses |
|---|---|---|---|
| `apache-noscript` | error log | "script not found or unable to stat" | Only matches script-type files (.php, .asp, .exe, .pl) |
| `apache-botsearch` | error log | "File does not exist" for specific paths | Requires Apache to write error log entries for 404s |
| **`apache-404scan`** | **access log** | **Any 404 response** | **Catches everything** |
The key insight: URL-encoded probes like `/%2f%2eenv%2econfig` that return 404 in the access log may not generate error log entries at all, making them invisible to the default filters.
## Pair With Recidive
If you have the `recidive` jail enabled, repeat offenders get permanently banned:
```ini
[recidive]
enabled = true
bantime = -1
findtime = 86400
maxretry = 3
```
Three 24-hour bans within a day = permanent firewall block.
## Quick Diagnostic Commands
```bash
# Test filter against current access log
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf
# Check jail status and banned IPs
fail2ban-client status apache-404scan
# IMPORTANT: verify the jail is actually monitoring the file
fail2ban-client get apache-404scan logpath
# Should show: /var/log/apache2/access.log
# If it shows "No file is currently monitored" — add backend = polling to the jail
# Watch bans in real time
tail -f /var/log/fail2ban.log | grep apache-404scan
# Count 404s in today's log
grep '" 404 ' /var/log/apache2/access.log | wc -l
```
## Key Notes
- The `ignoreregex` excludes `robots.txt`, `favicon.ico`, and `apple-touch-icon` — these are commonly requested and produce harmless 404s.
- Make sure your Tailscale subnet (`100.64.0.0/10`) is in the `ignoreip` list under `[DEFAULT]` so you don't ban your own monitoring or uptime checks.
- This filter works with both Apache **combined** and **common** log formats.
- Complements the existing `apache-dirscan` jail (which catches error-log-based directory enumeration). Use both for full coverage.

View File

@@ -0,0 +1,131 @@
---
title: "Fail2ban Custom Jail: WordPress Login Brute Force"
domain: selfhosting
category: security
tags: [fail2ban, wordpress, apache, security, brute-force]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Fail2ban Custom Jail: WordPress Login Brute Force
## The Problem
WordPress login brute force attacks are extremely common. Bots hammer `/wp-login.php` with POST requests, cycling through common credentials. The default Fail2ban `apache-auth` jail doesn't catch these because WordPress returns **HTTP 200** on failed logins — not 401 — so nothing appears as an authentication failure in the Apache error log.
There are pre-packaged filters (`wordpress-hard.conf`, `wordpress-soft.conf`) that ship with some Fail2ban installations, but these require the **[WP fail2ban](https://wordpress.org/plugins/wp-fail2ban/)** WordPress plugin to be installed. That plugin writes login failures to syslog, which the filters then match. Without the plugin, those filters do nothing.
## The Solution
Create a lightweight filter that reads the **Apache access log** and matches repeated POST requests to `wp-login.php` directly. No WordPress plugin needed.
### Step 1 — Create the filter
Create `/etc/fail2ban/filter.d/wordpress-login.conf`:
```ini
# Fail2Ban filter for WordPress login brute force
# Matches POST requests to wp-login.php in Apache access log
[Definition]
failregex = ^<HOST> .* "POST /wp-login\.php
ignoreregex =
```
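As with the 404 filter, you can sanity-check the pattern with plain `grep` before loading it, approximating `<HOST>` with an IPv4 matcher (sample log lines; `fail2ban-regex` in Step 3 is the authoritative test):

```shell
# Dry-run the wp-login pattern; <HOST> approximated as an IPv4 address.
re='^([0-9]{1,3}\.){3}[0-9]{1,3} .* "POST /wp-login\.php'
printf '%s\n' \
  '198.51.100.7 - - [02/Apr/2026:11:00:01 -0400] "POST /wp-login.php HTTP/1.1" 200 2412' \
  '198.51.100.7 - - [02/Apr/2026:11:00:02 -0400] "GET /wp-login.php HTTP/1.1" 200 2412' \
  | grep -cE "$re"
```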
### Step 2 — Add the jail
Add to `/etc/fail2ban/jail.local`:
```ini
[wordpress-login]
enabled = true
port = http,https
filter = wordpress-login
logpath = /var/log/apache2/access.log
maxretry = 5
findtime = 60
bantime = 30d
backend = polling
```
**5 attempts in 60 seconds** is tight enough to catch bots (which fire hundreds of requests per minute) while giving a real human a reasonable margin for typos.
> **Critical: `backend = polling` is required** on Ubuntu 24.04 and other systemd-based distros where `backend = auto` defaults to `systemd`. Without it, Fail2ban ignores `logpath` and reads from journald, which Apache doesn't write to. The jail silently monitors nothing. See [[fail2ban-apache-404-scanner-jail]] for more detail on this gotcha.
### Step 3 — Test the regex
```bash
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/wordpress-login.conf
```
In a real-world test against an active brute force (3 IPs, ~1,700 hits each), this matched **5,178 lines**.
### Step 4 — Reload and verify
```bash
systemctl restart fail2ban
fail2ban-client status wordpress-login
```
### Manually banning known attackers
If you've already identified brute-force IPs from the logs, ban them immediately rather than waiting for new hits:
```bash
# Find top offenders
grep "POST /wp-login.php" /var/log/apache2/access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -10
# Ban them
fail2ban-client set wordpress-login banip <IP>
```
## Why Default Jails Miss This
| Jail | Log Source | What It Matches | Why It Misses |
|---|---|---|---|
| `apache-auth` | error log | 401 authentication failures | WordPress returns 200, not 401 |
| `wordpress-hard` | syslog | WP fail2ban plugin messages | Requires plugin installation |
| `wordpress-soft` | syslog | WP fail2ban plugin messages | Requires plugin installation |
| **`wordpress-login`** | **access log** | **POST to wp-login.php** | **No plugin needed** |
## Optional: Extend to XML-RPC
WordPress's `xmlrpc.php` is another common brute-force target. To cover both, update the filter:
```ini
failregex = ^<HOST> .* "POST /wp-login\.php
            ^<HOST> .* "POST /xmlrpc\.php
```
## Quick Diagnostic Commands
```bash
# Test filter against current access log
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/wordpress-login.conf
# Check jail status and banned IPs
fail2ban-client status wordpress-login
# Verify the jail is reading the correct file
fail2ban-client get wordpress-login logpath
# Count wp-login POSTs in today's log
grep "POST /wp-login.php" /var/log/apache2/access.log | wc -l
# Watch bans in real time
tail -f /var/log/fail2ban.log | grep wordpress-login
```
## Key Notes
- This filter works with both Apache **combined** and **common** log formats.
- Make sure your Tailscale subnet (`100.64.0.0/10`) is in the `ignoreip` list under `[DEFAULT]` so legitimate admin access isn't banned.
- The `recidive` jail (if enabled) will escalate repeat offenders — three 30-day bans within a day triggers a 90-day block.
- Complements the [[fail2ban-apache-404-scanner-jail|Apache 404 Scanner Jail]] for full access-log coverage.
## See Also
- [[fail2ban-apache-404-scanner-jail]] — catches vulnerability scanners via 404 floods
- [[tuning-netdata-web-log-alerts]] — suppress false Netdata alerts from normal HTTP traffic

View File

@@ -194,6 +194,38 @@ sudo systemctl disable --now servicename
Common ones to disable on a dedicated server: `avahi-daemon`, `cups`, `bluetooth`.
## 8. Mail Server: SpamAssassin
If you're running Postfix (like on majormail), SpamAssassin filters incoming spam before it hits your mailbox.
**Install (Fedora/RHEL):**
```bash
sudo dnf install spamassassin
sudo systemctl enable --now spamassassin
```
**Integrate with Postfix** by adding a content filter in `/etc/postfix/master.cf`. See the [full setup guide](https://www.davekb.com/browse_computer_tips:spamassassin_with_postfix:txt) for Postfix integration on RedHat-based systems.
**Train the filter with sa-learn:**
SpamAssassin gets better when you feed it examples of spam and ham (legitimate mail):
```bash
# Train on known spam
sa-learn --spam /path/to/spam-folder/
# Train on known good mail
sa-learn --ham /path/to/ham-folder/
# Check what sa-learn knows
sa-learn --dump magic
```
Run `sa-learn` periodically against your Maildir to keep the Bayesian filter accurate. The more examples it sees, the fewer false positives and missed spam you'll get.
Reference: [sa-learn documentation](https://spamassassin.apache.org/full/3.0.x/dist/doc/sa-learn.html)
## Gotchas & Notes
- **Don't lock yourself out.** Test SSH key auth in a second terminal before disabling passwords. Keep the original session open.
@@ -204,5 +236,5 @@ Common ones to disable on a dedicated server: `avahi-daemon`, `cups`, `bluetooth
## See Also
- [[managing-linux-services-systemd-ansible]]
- [[debugging-broken-docker-containers]]
- [managing-linux-services-systemd-ansible](../../01-linux/process-management/managing-linux-services-systemd-ansible.md)
- [debugging-broken-docker-containers](../docker/debugging-broken-docker-containers.md)

View File

@@ -0,0 +1,95 @@
---
title: "SELinux: Fixing Fail2ban grep execmem Denial on Fedora"
domain: selfhosting
category: security
tags: [selinux, fail2ban, fedora, execmem, security]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# SELinux: Fixing Fail2ban grep execmem Denial on Fedora
## The Problem
After a reboot on Fedora 43, Netdata fires a `selinux_avc_denials` WARNING alert. The audit log shows:
```
avc: denied { execmem } for comm="grep"
scontext=system_u:system_r:fail2ban_t:s0
tcontext=system_u:system_r:fail2ban_t:s0
tclass=process permissive=0
```
Fail2ban spawns `grep` to scan log files when its jails start. SELinux denies `execmem` (executable memory) for processes running in the `fail2ban_t` domain. The `fail2ban-selinux` package does not include this permission.
## Impact
- Fail2ban still functions — the denial affects grep's memory allocation strategy, not its ability to run
- Netdata will keep alerting on every reboot (fail2ban restarts and triggers the denial)
- No security risk — this is fail2ban's own grep subprocess, not an external exploit
## The Fix
Create a targeted SELinux policy module that allows `execmem` for `fail2ban_t`:
```bash
cd /tmp
cat > my-fail2ban-grep.te << "EOF"
module my-fail2ban-grep 1.0;
require {
type fail2ban_t;
class process execmem;
}
allow fail2ban_t self:process execmem;
EOF
# Compile the module
checkmodule -M -m -o my-fail2ban-grep.mod my-fail2ban-grep.te
# Package it
semodule_package -o my-fail2ban-grep.pp -m my-fail2ban-grep.mod
# Install at priority 300 (above default policy)
semodule -X 300 -i my-fail2ban-grep.pp
```
## Verifying
Confirm the module is loaded:
```bash
semodule -l | grep fail2ban-grep
# Expected: my-fail2ban-grep
```
Check that no new AVC denials appear after restarting fail2ban:
```bash
systemctl restart fail2ban
ausearch -m avc --start recent | grep fail2ban
# Expected: no output (no new denials)
```
## Why Not `audit2allow` Directly?
The common shortcut `ausearch -c grep --raw | audit2allow -M my-policy` can fail if:
- The AVC events have already rotated out of the audit log
- `ausearch` returns no matching records (outputs "Nothing to do")
Writing the `.te` file manually is more reliable and self-documenting.
## Environment
- **OS:** Fedora 43
- **SELinux:** Enforcing, targeted policy
- **Fail2ban:** 1.1.0 (`fail2ban-selinux-1.1.0-15.fc43.noarch`)
- **Kernel:** 6.19.x
## See Also
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](../../05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) — another SELinux fix for post-reboot service issues
- [SELinux: Fixing Dovecot Mail Spool Context](../../05-troubleshooting/selinux-dovecot-vmail-context.md) — custom SELinux context for mail spool

View File

@@ -0,0 +1,192 @@
---
title: "UFW Firewall Management"
domain: selfhosting
category: security
tags: [security, firewall, ufw, ubuntu, networking]
status: published
created: 2026-04-02
updated: 2026-04-03
---
# UFW Firewall Management
UFW (Uncomplicated Firewall) is the standard firewall tool on Ubuntu. It wraps iptables/nftables into something you can actually manage without losing your mind. This covers the syntax and patterns I use across the MajorsHouse fleet.
## The Short Answer
```bash
# Enable UFW
sudo ufw enable
# Allow a port
sudo ufw allow 80
# Block a specific IP
sudo ufw insert 1 deny from 203.0.113.50
# Check status
sudo ufw status numbered
```
## Basic Rules
### Allow by Port
```bash
# Allow HTTP and HTTPS
sudo ufw allow 80
sudo ufw allow 443
# Allow a port range
sudo ufw allow 6000:6010/tcp
# Allow a named application profile
sudo ufw allow 'Apache Full'
```
### Allow by Interface
Useful when you only want traffic on a specific network interface — this is how SSH is restricted to Tailscale across the fleet:
```bash
# Allow SSH only on the Tailscale interface
sudo ufw allow in on tailscale0 to any port 22
# Then deny SSH globally (evaluated after the allow above)
sudo ufw deny 22
```
Rule order matters. UFW evaluates rules top to bottom and stops at the first match.
### Allow by Source IP
```bash
# Allow a specific IP to access SSH
sudo ufw allow from 100.86.14.126 to any port 22
# Allow a subnet
sudo ufw allow from 192.168.50.0/24 to any port 22
```
## Blocking IPs
### Insert Rules at the Top
When blocking IPs, use `insert 1` to place the deny rule at the top of the chain. Otherwise it may never be evaluated because an earlier ALLOW rule matches first.
```bash
# Block a single IP
sudo ufw insert 1 deny from 203.0.113.50
# Block a subnet
sudo ufw insert 1 deny from 203.0.113.0/24
# Block an IP from a specific port only
sudo ufw insert 1 deny from 203.0.113.50 to any port 443
```
### Don't Accumulate Manual Blocks
Manual `ufw deny` rules pile up fast. On one of my servers, I found **30,142 manual DENY rules** — a 3 MB rules file that every packet had to traverse. Use Fail2ban for automated blocking instead. It manages bans with expiry and doesn't pollute your UFW rules.
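To gauge how bad the bloat is, count the DENY rules — a sketch against canned `ufw status numbered` output (on a real host, pipe from `sudo ufw status numbered` instead; the sample rules are illustrative):

```shell
# Count DENY rules in a ufw status listing (sample output shown).
status='[ 1] 80,443/tcp (Apache Full)  ALLOW IN   Anywhere
[ 2] Anywhere                  DENY IN    203.0.113.50
[ 3] Anywhere                  DENY IN    203.0.113.0/24'
printf '%s\n' "$status" | grep -c 'DENY'
```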
If you inherit a server with thousands of manual blocks:
```bash
# Nuclear option — reset and re-add only the rules you need
sudo ufw --force reset
sudo ufw allow 'Apache Full'
sudo ufw allow in on tailscale0 to any port 22
sudo ufw deny 22
sudo ufw enable
```
## Managing Rules
### View Rules
```bash
# Simple view
sudo ufw status
# Numbered (needed for deletion and insert position)
sudo ufw status numbered
# Verbose (shows default policies and logging)
sudo ufw status verbose
```
### Delete Rules
```bash
# Delete by rule number
sudo ufw delete 3
# Delete by rule specification
sudo ufw delete allow 8080
```
### Default Policies
```bash
# Deny all incoming, allow all outgoing (recommended baseline)
sudo ufw default deny incoming
sudo ufw default allow outgoing
```
## Don't Forget Web Server Ports
If you're running a web server behind UFW, make sure ports 80 and 443 are explicitly allowed. This sounds obvious, but it's easy to miss — especially on servers where UFW was enabled after the web server was already running, or where a firewall reset dropped rules that were never persisted.
```bash
# Allow HTTP and HTTPS
sudo ufw allow 80
sudo ufw allow 443
# Or use an application profile
sudo ufw allow 'Apache Full'
```
If your site suddenly stops responding after enabling UFW or resetting rules, check `sudo ufw status numbered` first. Missing web ports is the most common cause.
## UFW with Fail2ban
On Ubuntu servers, Fail2ban and UFW operate at different layers. Fail2ban typically creates its own nftables table (`inet f2b-table`) at a higher priority than UFW's chains. This means:
- Fail2ban bans take effect **before** UFW rules are evaluated
- A banned IP is rejected even if UFW has an ALLOW rule for that port
- Add trusted IPs (your own, monitoring, etc.) to `ignoreip` in `/etc/fail2ban/jail.local` to prevent self-lockout
```ini
# /etc/fail2ban/jail.local
[DEFAULT]
ignoreip = 127.0.0.1/8 ::1 100.64.0.0/10
```
The `100.64.0.0/10` range is Tailscale's CGNAT block and covers all Tailscale addresses, which prevents banning fleet traffic.
## UFW Logging
```bash
# Enable logging (low/medium/high/full)
sudo ufw logging medium
```
Logs go to `/var/log/ufw.log`. Useful for seeing what's getting blocked, but `medium` or `low` is usually enough — `high` and `full` can be noisy.
## Fleet Reference
UFW is used on these MajorsHouse servers:
| Host | Key UFW Rules |
|---|---|
| majortoot | SSH on tailscale0, deny 22 globally |
| majorlinux | SSH on tailscale0, deny 22 globally |
| tttpod | SSH on tailscale0, deny 22 globally, Apache Full (added 2026-04-03) |
| teelia | SSH on tailscale0, deny 22 globally, Apache Full |
The Fedora servers (majorlab, majorhome, majormail, majordiscord) use iptables or firewalld instead.
## See Also
- [Linux Server Hardening Checklist](linux-server-hardening-checklist.md) — initial firewall setup as part of server provisioning
- [Fail2ban & UFW Rule Bloat Cleanup](../../05-troubleshooting/networking/fail2ban-ufw-rule-bloat-cleanup.md) — what happens when manual blocks get out of hand

View File

@@ -0,0 +1,68 @@
---
title: "Mastodon Instance Tuning"
domain: selfhosting
category: services
tags: [mastodon, fediverse, self-hosting, majortoot, docker]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Mastodon Instance Tuning
Running your own Mastodon instance means you control the rules — including limits the upstream project imposes by default. These are the tweaks applied to **majortoot** (MajorsHouse's Mastodon instance).
## Increase Character Limit
Mastodon's default 500-character post limit is low for longer-form thoughts. You can raise it, but it requires modifying the source — there's no config toggle.
The process depends on your deployment method (Docker vs bare metal) and Mastodon version. The community-maintained guide covers the approaches:
- [How to increase the max number of characters of a post](https://qa.mastoadmin.social/questions/10010000000000011/how-do-i-increase-the-max-number-of-characters-of-a-post)
**Key points:**
- The limit is enforced in both the backend (Ruby) and frontend (React). Both must be changed or the UI will reject posts the API would accept.
- After changing, you need to rebuild assets and restart services.
- Other instances will still display the full post — the character limit is per-instance, not a federation constraint.
- Some Mastodon forks (Glitch, Hometown) expose this as a config option without source patches.
## Media Cache Management
Federated content (avatars, headers, media from remote posts) gets cached locally. On a small instance this grows slowly, but over months it adds up — especially if you follow active accounts on large instances.
Reference: [Fedicache — Understanding Mastodon's media cache](https://notes.neatnik.net/2024/08/fedicache)
**Clean up cached remote media:**
```bash
# Preview what would be removed (older than 7 days)
tootctl media remove --days 7 --dry-run
# Actually remove it
tootctl media remove --days 7
# For Docker deployments
docker exec mastodon-web tootctl media remove --days 7
```
**Automate with cron or systemd timer:**
```bash
# Weekly cache cleanup — crontab
0 3 * * 0 docker exec mastodon-web tootctl media remove --days 7
```
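The systemd-timer equivalent of the cron line, sketched for a Docker deployment (the unit names and the `mastodon-web` container name are assumptions — adjust to your setup):

```ini
# /etc/systemd/system/mastodon-media-cleanup.service
[Unit]
Description=Prune Mastodon remote media cache

[Service]
Type=oneshot
ExecStart=/usr/bin/docker exec mastodon-web tootctl media remove --days 7

# /etc/systemd/system/mastodon-media-cleanup.timer
[Unit]
Description=Weekly Mastodon media cleanup

[Timer]
OnCalendar=Sun *-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `sudo systemctl enable --now mastodon-media-cleanup.timer`.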
**What gets removed:** Only cached copies of remote media. Local uploads (your posts, your users' posts) are never touched. Remote media will be re-fetched on demand if someone views the post again.
**Storage impact:** On a single-user instance, remote media cache can still reach several GB over a few months of active federation. Regular cleanup keeps disk usage predictable.
## Gotchas & Notes
- **Character limit changes break on upgrades.** Any source patch gets overwritten when you pull a new Mastodon release. Track your changes and reapply after updates.
- **`tootctl` is your admin CLI.** It handles media cleanup, user management, federation diagnostics, and more. Run `tootctl --help` for the full list.
- **Monitor disk usage.** Even with cache cleanup, the PostgreSQL database and local media uploads grow over time. Keep an eye on it.
## See Also
- [self-hosting-starter-guide](../docker/self-hosting-starter-guide.md)
- [docker-healthchecks](../docker/docker-healthchecks.md)

View File

@@ -0,0 +1,121 @@
---
title: "Updating n8n Running in Docker"
domain: selfhosting
category: services
tags: [n8n, docker, update, self-hosting, automation]
status: published
created: 2026-03-30
updated: 2026-03-30
---
# Updating n8n Running in Docker
n8n's in-app update notification checks against their npm release version, which often gets published before the `latest` Docker Hub tag is updated. This means you may see an update prompt in the UI even though `docker pull` reports the image as current. Pull a pinned version tag instead.
## Check Current vs Latest Version
```bash
# Check what's running
docker exec n8n-n8n-1 n8n --version
# Check what npm (n8n's upstream) says is latest
docker exec n8n-n8n-1 npm show n8n version
```
If the versions differ, the Docker Hub `latest` tag hasn't caught up yet. Use the pinned version tag.
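To script the decision, compare the two version strings with `sort -V` — a sketch with sample values (substitute the outputs of the two commands above):

```shell
# Prints "update" when upstream is newer than the running version.
running="2.14.1"; upstream="2.14.2"   # sample values
if [ "$(printf '%s\n' "$running" "$upstream" | sort -V | tail -1)" != "$running" ]; then
  echo update
else
  echo current
fi
# prints: update
```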
## Get the Running Container's Config
Before stopping anything, capture the full environment so you can recreate the container identically:
```bash
docker inspect n8n-n8n-1 --format '{{json .Config.Env}}'
docker inspect n8n-n8n-1 --format '{{range .Mounts}}{{.Source}} -> {{.Destination}}{{println}}{{end}}'
```
For MajorsHouse, the relevant env vars are:
```
N8N_EDITOR_BASE_URL=https://n8n.majorshouse.com/
N8N_PORT=5678
TZ=America/New_York
N8N_TRUST_PROXY=true
GENERIC_TIMEZONE=America/New_York
N8N_HOST=n8n.majorshouse.com
N8N_PROTOCOL=https
WEBHOOK_URL=https://n8n.majorshouse.com/
```
Data volume: `n8n_n8n_data:/home/node/.n8n`
## Perform the Update
```bash
# 1. Pull the specific version (replace 2.14.2 with target version)
docker pull docker.n8n.io/n8nio/n8n:2.14.2
# 2. Stop and remove the old container
docker stop n8n-n8n-1 && docker rm n8n-n8n-1
# 3. Start fresh with the new image and same settings
docker run -d \
--name n8n-n8n-1 \
--restart unless-stopped \
-p 127.0.0.1:5678:5678 \
-v n8n_n8n_data:/home/node/.n8n \
-e N8N_EDITOR_BASE_URL=https://n8n.majorshouse.com/ \
-e N8N_PORT=5678 \
-e TZ=America/New_York \
-e N8N_TRUST_PROXY=true \
-e GENERIC_TIMEZONE=America/New_York \
-e N8N_HOST=n8n.majorshouse.com \
-e N8N_PROTOCOL=https \
-e WEBHOOK_URL=https://n8n.majorshouse.com/ \
docker.n8n.io/n8nio/n8n:2.14.2
# 4. Verify
docker exec n8n-n8n-1 n8n --version
docker ps --filter name=n8n-n8n-1 --format '{{.Status}}'
```
No restart of Caddy or other services required. Workflows, credentials, and execution history are preserved in the data volume.
## Reset a Forgotten Admin Password
n8n uses SQLite at `/home/node/.n8n/database.sqlite` (mapped to `n8n_n8n_data` on the host). Use Python to generate a valid bcrypt hash and update it directly — do **not** use shell variable interpolation, as `$` characters in bcrypt hashes will be eaten.
```bash
python3 -c "
import bcrypt, sqlite3
pw = b'your-new-password'
h = bcrypt.hashpw(pw, bcrypt.gensalt(rounds=10)).decode()
db = sqlite3.connect('/var/lib/docker/volumes/n8n_n8n_data/_data/database.sqlite')
db.execute(\"UPDATE user SET password=? WHERE email='marcus@majorshouse.com'\", (h,))
db.commit()
db.close()
db2 = sqlite3.connect('/var/lib/docker/volumes/n8n_n8n_data/_data/database.sqlite')
row = db2.execute(\"SELECT password FROM user WHERE email='marcus@majorshouse.com'\").fetchone()
print('Valid:', bcrypt.checkpw(pw, row[0].encode()))
"
```
`Valid: True` confirms the hash is correct. No container restart needed.
## Why Arcane Doesn't Always Catch It
[Arcane](https://getarcaneapp.com) watches Docker Hub for image digest changes. When n8n publishes a new release, there's often a delay before the `latest` tag on Docker Hub is updated to match. During that window:
- n8n's in-app updater (checks npm) reports an update available
- Pulling the `latest` tag and Arcane both still report the image as current
Once Docker Hub catches up, Arcane will notify normally. For immediate updates, use pinned version tags as shown above.
## Troubleshooting
**Password still rejected after update:** Inside double-quoted shell strings, the `$2b`, `$10`, and salt fields of a bcrypt hash get expanded as variables, silently mangling the hash when it's passed as an inline SQL string. Always use the Python script approach above.
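The mangling is easy to reproduce in isolation: inside double quotes the shell treats the `$` fields of a bcrypt hash as variable references (the hash below is a shortened sample).

```bash
# single quotes keep the hash intact
hash='$2b$10$N9qo8uLOickgx2ZMRZoMye'
echo "$hash"

# double quotes expand $2, $1, and $N9qo... (all unset), leaving only "b0"
mangled="$2b$10$N9qo8uLOickgx2ZMRZoMye"
echo "$mangled"
```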
**Container exits immediately after recreate:** Check `docker logs n8n-n8n-1`. Most commonly a missing env var or a volume permission issue.
**Webhooks not firing after update:** Verify `N8N_TRUST_PROXY=true` is set. Without it, Caddy's `X-Forwarded-For` header causes n8n's rate limiter to drop webhook requests before parsing the body.
**`npm show n8n version` returns old version:** npm caches registry metadata inside the container. Run `docker exec n8n-n8n-1 npm show n8n version --prefer-online` to force a fresh registry check.

View File

@@ -148,6 +148,29 @@ WantedBy=timers.target
sudo systemctl enable --now rsync-backup.timer
```
## Cold Storage — AWS Glacier Deep Archive
rsync handles local and remote backups, but for true offsite cold storage — disaster recovery, archival copies you rarely need to retrieve — AWS Glacier Deep Archive is the cheapest option at ~$1/TB/month.
Upload files directly to an S3 bucket with the `DEEP_ARCHIVE` storage class:
```bash
# Single file
aws s3 cp backup.tar.gz s3://your-bucket/ --storage-class DEEP_ARCHIVE
# Entire directory
aws s3 sync /backup/offsite/ s3://your-bucket/offsite/ --storage-class DEEP_ARCHIVE
```
**When to use it:** Long-term backups you'd only need in a disaster scenario — media archives, yearly snapshots, irreplaceable data. Not for anything you'd need to restore quickly.
**Retrieval tradeoffs:**
- **Standard retrieval:** 12 hours, cheapest restore cost
- **Bulk retrieval:** Up to 48 hours, even cheaper
- **Expedited:** Not available for Deep Archive — if you need faster access, use S3 Glacier Flexible Retrieval or S3 Standard-IA instead
**In the MajorsHouse backup strategy**, rsync handles the daily local and cross-host backups. Glacier Deep Archive is the final tier — offsite, durable, cheap, and slow to retrieve by design. A good backup plan has both.
## Gotchas & Notes
- **Test with `--dry-run` first.** Especially when using `--delete`. See what would be removed before actually removing it.
@@ -158,5 +181,5 @@ sudo systemctl enable --now rsync-backup.timer
## See Also
- [[self-hosting-starter-guide]]
- [[bash-scripting-patterns]]
- [self-hosting-starter-guide](../docker/self-hosting-starter-guide.md)
- [bash-scripting-patterns](../../01-linux/shell-scripting/bash-scripting-patterns.md)

View File

@@ -0,0 +1,94 @@
---
title: "FreshRSS — Self-Hosted RSS Reader"
domain: opensource
category: alternatives
tags: [freshrss, rss, self-hosting, docker, privacy]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# FreshRSS — Self-Hosted RSS Reader
## Problem
RSS is the best way to follow websites, blogs, and podcasts without algorithmic feeds, engagement bait, or data harvesting. But hosted RSS services like Feedly gate features behind subscriptions and still have access to your reading habits. Google killed Google Reader in 2013 and has been trying to kill RSS ever since.
## Solution
[FreshRSS](https://freshrss.org) is a self-hosted RSS aggregator. It fetches and stores your feeds on your own server, presents a clean reading interface, and syncs with mobile apps via standard APIs (Fever, Google Reader, Nextcloud News). No subscription, no tracking, no feed limits.
---
## Deployment (Docker)
```yaml
services:
freshrss:
image: freshrss/freshrss:latest
container_name: freshrss
restart: unless-stopped
ports:
- "8086:80"
volumes:
- ./freshrss/data:/var/www/FreshRSS/data
- ./freshrss/extensions:/var/www/FreshRSS/extensions
environment:
- TZ=America/New_York
- CRON_MIN=*/15 # fetch feeds every 15 minutes
```
### Caddy reverse proxy
```
rss.yourdomain.com {
reverse_proxy localhost:8086
}
```
---
## Initial Setup
1. Browse to your FreshRSS URL and run through the setup wizard
2. Create an admin account
3. Go to **Settings → Authentication** — enable API access if you want mobile app sync
4. Start adding feeds under **Subscriptions → Add a feed**
---
## Mobile App Sync
FreshRSS exposes a Google Reader-compatible API that most RSS apps support:
| App | Platform | Protocol |
|---|---|---|
| NetNewsWire | iOS / macOS | Fever or GReader |
| Reeder | iOS / macOS | GReader |
| ReadYou | Android | GReader |
| FeedMe | Android | GReader / Fever |
**API URL format:** `https://rss.yourdomain.com/api/greader.php`
Enable the API in FreshRSS: **Settings → Authentication → Allow API access**
---
## Feed Auto-Refresh
The `CRON_MIN=*/15` environment variable runs feed fetching every 15 minutes inside the container. For more control, add a host-level cron job:
```bash
# Fetch all feeds every 10 minutes
*/10 * * * * docker exec --user www-data freshrss php /var/www/FreshRSS/app/actualize_script.php
```
---
## Why RSS Over Social Media
- **You control the feed** — no algorithm decides what you see or in what order
- **No engagement optimization** — content ranked by publish date, not outrage potential
- **Portable** — OPML export lets you move your subscriptions to any reader
- **Works forever** — RSS has been around since 1999 and isn't going anywhere
---

View File

@@ -0,0 +1,100 @@
---
title: "Gitea — Self-Hosted Git"
domain: opensource
category: alternatives
tags: [gitea, git, self-hosting, docker, ci-cd]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Gitea — Self-Hosted Git
## Problem
GitHub is the default home for code, but it's a Microsoft-owned centralized service. Your repositories, commit history, issues, and CI/CD pipelines are all under someone else's control. For personal projects and private infrastructure, there's no reason to depend on it.
## Solution
[Gitea](https://gitea.com) is a lightweight, self-hosted Git service. It provides the full GitHub-style workflow — repositories, branches, pull requests, webhooks, and a web UI — in a single binary or Docker container that runs comfortably on low-spec hardware.
---
## Deployment (Docker)
```yaml
services:
gitea:
image: docker.gitea.com/gitea:latest
container_name: gitea
restart: unless-stopped
ports:
- "3002:3000"
- "222:22" # SSH git access
volumes:
- ./gitea:/data
environment:
- USER_UID=1000
- USER_GID=1000
- GITEA__database__DB_TYPE=sqlite3
```
SQLite is fine for personal use. For team use, swap in PostgreSQL or MySQL.
### Caddy reverse proxy
```
git.yourdomain.com {
reverse_proxy localhost:3002
}
```
---
## Initial Setup
1. Browse to your Gitea URL — the first-run wizard handles configuration
2. Set the server URL to your public domain
3. Create an admin account
4. Configure SSH access if you want `git@git.yourdomain.com` cloning
---
## Webhooks
Gitea's webhook system is how automated pipelines get triggered on push. Example use case — auto-deploy a MkDocs wiki on every push:
1. Go to repo → **Settings → Webhooks → Add Webhook**
2. Set the payload URL to your webhook endpoint (e.g. `https://notes.yourdomain.com/webhook`)
3. Set content type to `application/json`
4. Select **Push events**
The webhook fires on every `git push`, allowing the receiving server to pull and rebuild automatically. See [MajorWiki Setup & Pipeline](../../05-troubleshooting/majwiki-setup-and-pipeline.md) for a complete example.
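If the webhook has a secret configured, Gitea also signs each delivery: a hex HMAC-SHA256 of the raw body is sent in the `X-Gitea-Signature` header, which the receiver should verify before acting. A minimal recomputation sketch, assuming `openssl` is available and using sample values:

```bash
secret='webhook-secret'              # must match the secret set in Gitea
body='{"ref":"refs/heads/main"}'     # raw request body, byte-for-byte
sig=$(printf '%s' "$body" | openssl dgst -sha256 -hmac "$secret" | sed 's/^.*= //')
echo "$sig"   # compare against the X-Gitea-Signature header value
```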
---
## Migrating from GitHub
Gitea can mirror GitHub repos and import them directly:
```bash
# Clone from GitHub, push to Gitea
git clone --mirror https://github.com/user/repo.git
cd repo.git
git remote set-url origin https://git.yourdomain.com/user/repo.git
git push --mirror
```
Or use the Gitea web UI: **+ → New Migration → GitHub**
---
## Why Not Just Use GitHub?
For public open source — GitHub is fine, the network effects are real. For private infrastructure code, personal projects, and anything you'd rather not hand to Microsoft:
- Full control over your data and access
- No rate limits, no storage quotas on your own hardware
- Webhooks and integrations without paying for GitHub Actions minutes
- Works entirely over Tailscale — no public exposure required
---

View File

@@ -0,0 +1,93 @@
---
title: "SearXNG — Private Self-Hosted Search"
domain: opensource
category: alternatives
tags: [searxng, search, privacy, self-hosting, docker]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# SearXNG — Private Self-Hosted Search
## Problem
Every search query sent to Google, Bing, or DuckDuckGo is logged, profiled, and used to build an advertising model of you. Even "private" search engines are still third-party services with their own data retention policies.
## Solution
[SearXNG](https://github.com/searxng/searxng) is a self-hosted metasearch engine. It queries multiple search engines simultaneously on your behalf — without sending any identifying information — and aggregates the results. The search engines see a request from your server, not from you.
Your queries stay on your infrastructure.
---
## Deployment (Docker)
```yaml
services:
searxng:
image: searxng/searxng:latest
container_name: searxng
restart: unless-stopped
ports:
- "8090:8080"
volumes:
- ./searxng:/etc/searxng
environment:
- SEARXNG_BASE_URL=https://search.yourdomain.com/
```
SearXNG requires a `settings.yml` in the mounted config directory. Generate one from the default:
```bash
docker run --rm searxng/searxng cat /etc/searxng/settings.yml > ./searxng/settings.yml
```
Key settings to configure in `settings.yml`:
```yaml
server:
secret_key: "generate-a-random-string-here"
bind_address: "0.0.0.0"
search:
safe_search: 0
default_lang: "en"
engines:
# Enable/disable specific engines here
```
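The `secret_key` should be a genuinely random value, not a hand-typed string. One way to generate one (the sed line is a sketch that assumes the quoted placeholder shown above):

```bash
# generate a 64-character random hex secret
secret=$(python3 -c 'import secrets; print(secrets.token_hex(32))')
echo "$secret"

# patch it into the config in place:
#   sed -i "s|secret_key: \".*\"|secret_key: \"$secret\"|" ./searxng/settings.yml
```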
### Caddy reverse proxy
```
search.yourdomain.com {
reverse_proxy localhost:8090
}
```
---
## Using SearXNG as an AI Search Backend
SearXNG integrates directly with Open WebUI as a web search provider, giving your local AI access to current web results without any third-party API keys:
**Open WebUI → Settings → Web Search:**
- Enable web search
- Set provider to `searxng`
- Set URL to `http://searxng:8080` (internal Docker network) or your Tailscale/local address
This is how MajorTwin gets current web context — queries go through SearXNG, not Google.
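One common snag: Open WebUI queries SearXNG's JSON output, which is disabled by default. If searches from Open WebUI fail (typically with 403 responses), enable the `json` format in `settings.yml`:

```yaml
search:
  formats:
    - html
    - json
```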
---
## Why Not DuckDuckGo?
DDG is better than Google for privacy, but it's still a centralized third-party service. SearXNG:
- Runs on your own hardware
- Has no account, no cookies, no session tracking
- Lets you choose which upstream engines to use and weight
- Can be kept entirely off the public internet (Tailscale-only)
---

View File

@@ -0,0 +1,107 @@
---
title: "rsync — Fast, Resumable File Transfers"
domain: opensource
category: dev-tools
tags: [rsync, backup, file-transfer, linux, cli]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# rsync — Fast, Resumable File Transfers
## Problem
Copying large files or directory trees between drives or servers is slow, fragile, and unresumable with `cp`. A dropped connection or a single error means starting over. You also want to skip files that already exist at the destination without re-copying them.
## Solution
`rsync` is a file synchronization tool that only transfers what has changed, preserves metadata, and can resume interrupted transfers. It works locally and over SSH.
### Installation (Fedora)
```bash
sudo dnf install rsync
```
### Basic Local Copy
```bash
rsync -av /source/ /destination/
```
- `-a` — archive mode: preserves permissions, timestamps, symlinks, ownership
- `-v` — verbose: shows what's being transferred
**Trailing slash on source matters:**
- `/source/` — copy the *contents* of source into destination
- `/source` — copy the source *directory itself* into destination
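The difference is easy to verify in a throwaway scratch directory:

```bash
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/a" "$tmp/b"
touch "$tmp/src/file"

rsync -a "$tmp/src/" "$tmp/a/"   # contents copied: a/file
rsync -a "$tmp/src"  "$tmp/b/"   # directory copied: b/src/file
find "$tmp/a" "$tmp/b" -type f
```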
### Resume an Interrupted Transfer
```bash
rsync -av --partial --progress /source/ /destination/
```
- `--partial` — keeps partially transferred files so they can be resumed
- `--progress` — shows per-file progress and speed
### Skip Already-Transferred Files
```bash
rsync -av --ignore-existing /source/ /destination/
```
Useful when restarting a migration — skips anything already at the destination regardless of timestamp comparison.
### Dry Run First
Always preview what rsync will do before committing:
```bash
rsync -av --dry-run /source/ /destination/
```
No files are moved. Output shows exactly what would happen.
### Transfer Over SSH
```bash
rsync -av -e ssh /source/ user@remotehost:/destination/
```
Or with a non-standard port:
```bash
rsync -av -e "ssh -p 2222" /source/ user@remotehost:/destination/
```
### Exclude Patterns
```bash
rsync -av --exclude='*.tmp' --exclude='.Trash*' /source/ /destination/
```
### Real-World Use
Migrating ~286 files from `/majorRAID` to `/majorstorage` during a RAID dissolution project:
```bash
rsync -av --partial --progress --ignore-existing \
/majorRAID/ /majorstorage/ \
2>&1 | tee /root/raid_migrate.log
```
Run inside a `tmux` or `screen` session so it survives SSH disconnects:
```bash
tmux new-session -d -s rsync-migrate \
"rsync -av --partial --progress /majorRAID/ /majorstorage/ | tee /root/raid_migrate.log"
```
### Check Progress on a Running Transfer
```bash
tail -f /root/raid_migrate.log
```
---

View File

@@ -0,0 +1,81 @@
---
title: "screen — Simple Persistent Terminal Sessions"
domain: opensource
category: dev-tools
tags: [screen, terminal, ssh, linux, cli]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# screen — Simple Persistent Terminal Sessions
## Problem
Same problem as tmux: SSH sessions die, jobs get killed, long-running tasks need to survive disconnects. screen is the older, simpler alternative to tmux — universally available and gets the job done with minimal setup.
## Solution
`screen` creates detachable terminal sessions. It's installed by default on many systems, making it useful when tmux isn't available.
### Installation (Fedora)
```bash
sudo dnf install screen
```
### Core Workflow
```bash
# Start a named session
screen -S mysession
# Detach (keeps running)
Ctrl+a, d
# List sessions
screen -list
# Reattach
screen -r mysession
# If session shows as "Attached" (stuck)
screen -d -r mysession
```
### Start a Background Job Directly
```bash
screen -dmS mysession bash -c "long-running-command 2>&1 | tee /root/output.log"
```
- `-d` — start detached
- `-m` — create new session even if already inside screen
- `-S` — name the session
### Capture Current Output Without Attaching
```bash
screen -S mysession -X hardcopy /tmp/screen_output.txt
cat /tmp/screen_output.txt
```
### Send a Command to a Running Session
```bash
screen -S mysession -X stuff "tail -f /root/output.log\n"
```
---
## screen vs tmux
| Feature | screen | tmux |
|---|---|---|
| Availability | Installed by default on most systems | Usually needs installing |
| Split panes | Basic (Ctrl+a, S) | Better (Ctrl+b, ") |
| Scripting | Limited | More capable |
| Config complexity | Simple | More options |
Use screen when it's already there or for quick throwaway sessions. Use tmux for anything more complex. See [tmux](tmux.md).
---

View File

@@ -0,0 +1,98 @@
---
title: "tmux — Persistent Terminal Sessions"
domain: opensource
category: dev-tools
tags: [tmux, terminal, ssh, multiplexer, linux]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# tmux — Persistent Terminal Sessions
## Problem
SSH sessions die when your connection drops, your laptop closes, or you walk away. Long-running jobs — storage migrations, file scans, downloads — get killed mid-run. You need a way to detach from a session, come back later, and pick up exactly where you left off.
## Solution
`tmux` is a terminal multiplexer. It runs sessions that persist independently of your SSH connection. You can detach, disconnect, reconnect from a different machine, and reattach to find everything still running.
### Installation (Fedora)
```bash
sudo dnf install tmux
```
### Core Workflow
```bash
# Start a named session
tmux new-session -s mysession
# Detach from a session (keeps it running)
Ctrl+b, d
# List running sessions
tmux ls
# Reattach to a session
tmux attach -t mysession
# Kill a session when done
tmux kill-session -t mysession
```
### Start a Background Job Directly
Skip the interactive session entirely — start a job in a new detached session in one command:
```bash
tmux new-session -d -s rmlint2 "rmlint /majorstorage// /mnt/usb// /majorRAID 2>&1 | tee /majorRAID/rmlint_scan2.log"
```
The job runs immediately in the background. Attach later to check progress:
```bash
tmux attach -t rmlint2
```
### Capture Output Without Attaching
Read the current state of a session without interrupting it:
```bash
tmux capture-pane -t rmlint2 -p
```
### Split Panes
Monitor multiple things in one terminal window:
```bash
# Horizontal split (top/bottom)
Ctrl+b, "
# Vertical split (left/right)
Ctrl+b, %
# Switch between panes
Ctrl+b, arrow keys
```
### Real-World Use
On **majorhome**, all long-running storage operations run inside named tmux sessions so they survive SSH disconnects:
```bash
tmux new-session -d -s rmlint2 "rmlint ..." # dedup scan
tmux new-session -d -s rsync-migrate "rsync ..." # file migration
tmux ls # check what's running
```
---
## tmux vs screen
Both work. tmux has better split-pane support and scripting. screen is simpler and more universally installed. I use both — tmux for new jobs, screen for legacy ones. See the [screen](screen.md) article for reference.
---

View File

@@ -0,0 +1,81 @@
---
title: "Ventoy — Multi-Boot USB Tool"
domain: opensource
category: dev-tools
tags: [ventoy, usb, boot, iso, linux, tools]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Ventoy — Multi-Boot USB Tool
Ventoy turns a USB drive into a multi-boot device. Drop ISO files onto the drive and boot directly from them — no need to flash a new image every time you want to try a different distro or run a recovery tool.
## What It Is
[Ventoy](https://www.ventoy.net/) creates a special partition layout on a USB drive. After the one-time install, you just copy ISO (or WIM, VHD, IMG) files to the drive. On boot, Ventoy presents a menu of every image on the drive and boots whichever one you pick.
No re-formatting. No Rufus. No balenaEtcher. Just drag and drop.
## Installation
### Linux
```bash
# Download the latest release
wget https://github.com/ventoy/Ventoy/releases/download/v1.1.05/ventoy-1.1.05-linux.tar.gz
# Extract
tar -xzf ventoy-1.1.05-linux.tar.gz
cd ventoy-1.1.05
# Install to USB drive (WARNING: this formats the drive)
sudo ./Ventoy2Disk.sh -i /dev/sdX
```
Replace `/dev/sdX` with your USB drive. Use `lsblk` to identify it — triple-check before running, this wipes the drive.
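A helpful `lsblk` view for this: with the transport column included, the USB stick (`TRAN` = `usb`) is unmistakable.

```bash
lsblk -o NAME,SIZE,MODEL,TRAN,MOUNTPOINT
```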
### Windows
Download the Windows package from the Ventoy releases page, run `Ventoy2Disk.exe`, select your USB drive, and click Install.
## Usage
After installation, the USB drive shows up as a regular FAT32/exFAT partition. Copy ISOs onto it:
```bash
# Copy ISOs to the drive
cp ~/Downloads/Fedora-43-x86_64.iso /mnt/ventoy/
cp ~/Downloads/ubuntu-24.04-desktop.iso /mnt/ventoy/
cp ~/Downloads/memtest86.iso /mnt/ventoy/
```
Boot from the USB. Ventoy's menu lists every ISO it finds. Select one and it boots directly.
## Updating Ventoy
When a new version comes out, update without losing your ISOs:
```bash
# Update mode (-u) preserves existing files
sudo ./Ventoy2Disk.sh -u /dev/sdX
```
## Why It's Useful
- **Distro testing:** Keep 5-10 distro ISOs on one stick. Boot into any of them without reflashing.
- **Recovery toolkit:** Carry GParted, Clonezilla, memtest86, and a live Linux on a single drive.
- **OS installation:** One USB for every machine you need to set up.
- **Persistence:** Ventoy supports persistent storage for some distros, so live sessions can save data across reboots.
## Gotchas & Notes
- **Secure Boot:** Ventoy supports Secure Boot but it requires enrolling a key on first boot. Follow the on-screen prompts.
- **exFAT for large ISOs:** The default FAT32 partition has a 4GB file size limit. Use exFAT if any of your ISOs exceed that (Windows ISOs often do). Ventoy supports both.
- **UEFI vs Legacy:** Ventoy handles both automatically. It detects the boot mode and presents the appropriate menu.
- **Some ISOs don't work.** Heavily customized or non-standard ISOs may fail to boot. Standard distro ISOs and common tools work reliably.
## See Also
- [linux-distro-guide-beginners](../../01-linux/distro-specific/linux-distro-guide-beginners.md)

22
03-opensource/index.md Normal file
View File

@@ -0,0 +1,22 @@
# 📂 Open Source & Alternatives
A curated collection of my favorite open-source tools and privacy-respecting alternatives to mainstream software.
## 🔄 Alternatives
- [SearXNG: Private Self-Hosted Search](alternatives/searxng.md)
- [FreshRSS: Self-Hosted RSS Reader](alternatives/freshrss.md)
- [Gitea: Self-Hosted Git](alternatives/gitea.md)
## 🚀 Productivity
- [rmlint: Duplicate File Scanning](productivity/rmlint-duplicate-scanning.md)
## 🛠️ Development Tools
- [tmux: Persistent Terminal Sessions](dev-tools/tmux.md)
- [screen: Simple Persistent Sessions](dev-tools/screen.md)
- [rsync: Fast, Resumable File Transfers](dev-tools/rsync.md)
## 🎨 Media & Creative
- [yt-dlp: Video Downloading](media-creative/yt-dlp.md)
## 🔐 Privacy & Security
- [Vaultwarden: Self-Hosted Password Manager](privacy-security/vaultwarden.md)

View File

@@ -0,0 +1,157 @@
---
title: "yt-dlp — Video Downloading"
domain: opensource
category: media-creative
tags: [yt-dlp, video, youtube, downloads, cli]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# yt-dlp — Video Downloading
## What It Is
`yt-dlp` is a feature-rich command-line video downloader, forked from youtube-dl with active maintenance and significantly better performance. It supports YouTube, Twitch, and hundreds of other sites.
---
## Installation
### Fedora
```bash
sudo dnf install yt-dlp
# or latest via pip:
sudo pip install yt-dlp --break-system-packages
```
### Update
```bash
sudo pip install -U yt-dlp --break-system-packages
# or if installed as standalone binary:
yt-dlp -U
```
Keep it current — YouTube pushes extractor changes frequently and old versions break.
---
## Basic Usage
```bash
# Download a single video (best quality)
yt-dlp https://www.youtube.com/watch?v=VIDEO_ID
# Download to a specific directory with title as filename
yt-dlp -o "/path/to/output/%(title)s.%(ext)s" URL
```
---
## Plex-Optimized Download
Download best quality and auto-convert to HEVC for Apple TV direct play:
```bash
yt-dlp URL
```
That's it — if your config is set up correctly (see Config File section below). The config handles format selection, output path, subtitles, and automatic AV1/VP9 → HEVC conversion.
> [!note] `bestvideo[ext=mp4]` caps at 1080p because YouTube only serves H.264 up to 1080p. Use `bestvideo+bestaudio` to get true 4K, then let the post-download hook convert AV1/VP9 to HEVC. See [Plex 4K Codec Compatibility](../../04-streaming/plex/plex-4k-codec-compatibility.md) for the full setup.
---
## Playlists and Channels
```bash
# Download a full playlist
yt-dlp -o "%(playlist_index)s - %(title)s.%(ext)s" PLAYLIST_URL
# Download only videos not already present
yt-dlp --download-archive archive.txt PLAYLIST_URL
```
`--download-archive` maintains a file of completed video IDs — re-running the command skips already-downloaded videos automatically.
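Combined with cron, this becomes a hands-off channel sync. The schedule, paths, and URL below are examples:

```bash
# crontab -e
# nightly at 03:00: fetch only videos not already recorded in the archive
0 3 * * * yt-dlp --download-archive /plex/yt-archive.txt -o '/plex/plex/%(title)s.%(ext)s' 'CHANNEL_URL' >> /var/log/yt-sync.log 2>&1
```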
---
## Format Selection
```bash
# List all available formats for a video
yt-dlp --list-formats URL
# Download best video + best audio, merge to mp4
yt-dlp -f 'bestvideo+bestaudio' --merge-output-format mp4 URL
# Download audio only (MP3)
yt-dlp -x --audio-format mp3 URL
```
---
## Config File
Persist your preferred flags so you don't repeat them every command:
```bash
mkdir -p ~/.config/yt-dlp
cat > ~/.config/yt-dlp/config << 'EOF'
--remote-components ejs:github
--format bestvideo+bestaudio
--merge-output-format mp4
--output /plex/plex/%(title)s.%(ext)s
--write-auto-subs
--embed-subs
--exec /usr/local/bin/yt-dlp-hevc-convert.sh {}
EOF
```
After this, a bare `yt-dlp URL` downloads best quality, saves to `/plex/plex/`, embeds subtitles, and auto-converts AV1/VP9 to HEVC. See [Plex 4K Codec Compatibility](../../04-streaming/plex/plex-4k-codec-compatibility.md) for the conversion hook setup.
---
## Running Long Downloads in the Background
For large downloads or playlists, run inside `screen` or `tmux` so they survive SSH disconnects:
```bash
screen -dmS yt-download bash -c \
"yt-dlp -o '/plex/plex/%(title)s.%(ext)s' PLAYLIST_URL 2>&1 | tee ~/yt-download.log"
# Check progress
screen -r yt-download
# or
tail -f ~/yt-download.log
```
---
## Subtitle Downloads
The config above handles subtitles automatically via `--write-auto-subs` and `--embed-subs`. For one-off downloads where you want explicit control over subtitle embedding alongside specific format selection:
```bash
yt-dlp -f 'bestvideo[vcodec^=avc]+bestaudio[ext=m4a]/bestvideo+bestaudio' \
--merge-output-format mp4 \
-o "/plex/plex/%(title)s.%(ext)s" \
--write-auto-subs --embed-subs URL
```
This forces H.264 video + M4A audio when available — useful when you want guaranteed Apple TV / Plex compatibility without running the HEVC conversion hook.
---
## Troubleshooting
For YouTube JS challenge errors, missing formats, and n-challenge failures on Fedora — see [yt-dlp YouTube JS Challenge Fix](../../05-troubleshooting/yt-dlp-fedora-js-challenge.md).
**YouTube player client errors:** If downloads fail with extractor errors, YouTube may have broken the default player client. Override it:
```bash
yt-dlp --extractor-args "youtube:player_client=default,-tv_simply" URL
```
This can also be added to your config file as a persistent workaround until yt-dlp pushes a fix upstream. Keep yt-dlp updated — these breakages get patched regularly.
---

View File

@@ -0,0 +1,100 @@
---
title: "Vaultwarden — Self-Hosted Password Manager"
domain: opensource
category: privacy-security
tags: [vaultwarden, bitwarden, passwords, self-hosting, docker]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Vaultwarden — Self-Hosted Password Manager
## Problem
Password managers are a necessity, but handing your credentials to a third-party cloud service is a trust problem. Bitwarden is open source and privacy-respecting, but if you're already running a homelab, there's no reason to depend on their servers.
## Solution
[Vaultwarden](https://github.com/dani-garcia/vaultwarden) is an unofficial, lightweight Bitwarden-compatible server written in Rust. It exposes the same API that all official Bitwarden clients speak — desktop apps, browser extensions, mobile apps — so you get the full Bitwarden UX pointed at your own hardware.
Your passwords never leave your network.
---
## Deployment (Docker + Caddy)
### docker-compose.yml
```yaml
services:
vaultwarden:
image: vaultwarden/server:latest
container_name: vaultwarden
restart: unless-stopped
environment:
- DOMAIN=https://vault.yourdomain.com
- SIGNUPS_ALLOWED=false # disable after creating your account
volumes:
- ./vw-data:/data
ports:
- "8080:80"
```
Start it:
```bash
sudo docker compose up -d
```
### Caddy reverse proxy
```
vault.yourdomain.com {
reverse_proxy localhost:8080
}
```
Caddy handles TLS automatically. No extra cert config needed.
---
## Initial Setup
1. Browse to `https://vault.yourdomain.com` and create your account
2. Set `SIGNUPS_ALLOWED=false` in the compose file and restart the container
3. Install any official Bitwarden client (browser extension, desktop, mobile)
4. In the client, set the **Server URL** to `https://vault.yourdomain.com` before logging in
That's it. The client has no idea it's not talking to Bitwarden's servers.
---
## Access Model
On MajorInfrastructure, Vaultwarden runs on **majorlab** and is accessible:
- **Internally** — via Caddy on the local network
- **Remotely** — via Tailscale; vault is reachable from any device on the tailnet without exposing it to the public internet
This means the Caddy vhost does not need to be publicly routable. You can choose to expose it publicly (Let's Encrypt works fine) or keep it Tailscale-only.
---
## Backup
Vaultwarden stores everything in a single SQLite database at `./vw-data/db.sqlite3`. Back it up like any file:
```bash
# Simple copy (stop container first for consistency, or use sqlite backup mode)
sqlite3 /path/to/vw-data/db.sqlite3 ".backup '/path/to/backup/vw-backup-$(date +%F).sqlite3'"
```
Or include the `vw-data/` directory in your regular rsync backup run.
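For scheduled runs, the same `.backup` call can be wrapped with simple retention. A sketch, assuming the `sqlite3` CLI is installed; the paths and the 14-day window are examples:

```bash
backup_vaultwarden() {
  # online-safe snapshot via SQLite's backup API, then prune old copies
  local src=$1 dst=$2
  sqlite3 "$src" ".backup '$dst/vw-backup-$(date +%F).sqlite3'"
  find "$dst" -name 'vw-backup-*.sqlite3' -mtime +14 -delete
}

# e.g. from cron:
#   backup_vaultwarden /path/to/vw-data/db.sqlite3 /path/to/backup
```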
---
## Why Not Bitwarden (Official)?
The official Bitwarden server is also open source but requires significantly more resources (multiple services, SQL Server). Vaultwarden runs in a single container on minimal RAM and handles everything a personal or family vault needs.
---

View File

@@ -0,0 +1,63 @@
---
title: "rmlint — Extreme Duplicate File Scanning"
domain: opensource
category: productivity
tags: [rmlint, duplicates, storage, cleanup, linux]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# rmlint — Extreme Duplicate File Scanning
## Problem
Over time, backups and media collections can accumulate massive amounts of duplicate data. Traditional duplicate finders are often slow and limited in how they handle results. On MajorRAID, I identified **~4.0 TB (113,584 files)** of duplicate data across three different storage points.
## Solution
`rmlint` is an extremely fast tool for finding (and optionally removing) duplicates. It is significantly faster than `fdupes` or `rdfind` because it uses a multi-stage approach to avoid unnecessary hashing.
### 1. Installation (Fedora)
```bash
sudo dnf install rmlint
```
### 2. Scanning Multiple Directories
To scan for duplicates across multiple mount points and compare them:
```bash
rmlint /majorstorage /majorRAID /mnt/usb
```
This will generate a script named `rmlint.sh` and a summary of the findings.
### 3. Reviewing Results
**DO NOT** run the generated script without reviewing it first. You can use the summary to see which paths contain the most duplicates:
```bash
# Pretty-print the JSON report
jq . rmlint.json
```
### 4. Advanced Usage: Widening the Scan
rmlint matches files by content hash, so renamed duplicates are found by default. To also include hidden files and report hardlinked copies:
```bash
rmlint --hidden --hardlinked /path/to/search
```
### 5. Repurposing Storage
After scanning and clearing duplicates, you can reclaim significant space. In my case, this was the first step in repurposing a 12TB USB drive as a **SnapRAID parity drive**.
---
## Maintenance
Run a scan monthly or before any major storage consolidation project.
---

View File

@@ -5,3 +5,7 @@ Guides for live streaming and podcast production, with a focus on OBS Studio.
## OBS Studio
- [OBS Studio Setup & Encoding](obs/obs-studio-setup-encoding.md)
## Plex
- [Plex 4K Codec Compatibility (Apple TV)](plex/plex-4k-codec-compatibility.md)

View File

@@ -118,6 +118,31 @@ echo "v4l2loopback" | sudo tee /etc/modules-load.d/v4l2loopback.conf
echo "options v4l2loopback devices=1 video_nr=10 card_label=OBS Virtual Camera exclusive_caps=1" | sudo tee /etc/modprobe.d/v4l2loopback.conf
```
## Plugins & Capture Sources
### Captions Plugin (Accessibility)
[OBS Captions Plugin](https://github.com/ratwithacompiler/OBS-captions-plugin) adds real-time closed captions to streams using speech-to-text. Viewers can toggle captions on/off in their player — important for accessibility and for viewers watching without sound.
Install from the plugin's GitHub releases page, then configure in Tools → Captions.
### VLC Video Source (Capture Card)
For capturing from an Elgato 4K60 Pro MK.2 (or similar DirectShow capture card) via VLC as an OBS source, use this device string:
```
:dshow-vdev=Game Capture 4K60 Pro MK.2
:dshow-adev=Game Capture 4K60 Pro MK.2 Audio (Game Capture 4K60 Pro MK.2)
:dshow-aspect-ratio=16:9
:dshow-chroma=YUY2
:dshow-fps=0
:no-dshow-config
:no-dshow-tuner
:live-caching=0
```
Set `live-caching=0` to minimize capture latency. This is useful when OBS's native Game Capture isn't an option (e.g., capturing a separate machine's output through the card).
## Gotchas & Notes
- **Test your stream before going live.** Record a short clip and watch it back. Artifacts in the recording will be worse in the stream.
@@ -128,5 +153,5 @@ echo "options v4l2loopback devices=1 video_nr=10 card_label=OBS Virtual Camera e
## See Also
- [[linux-file-permissions]]
- [[bash-scripting-patterns]]
- [linux-file-permissions](../../01-linux/files-permissions/linux-file-permissions.md)
- [bash-scripting-patterns](../../01-linux/shell-scripting/bash-scripting-patterns.md)

View File

@@ -0,0 +1,157 @@
---
title: "Plex 4K Codec Compatibility (Apple TV)"
domain: streaming
category: plex
tags: [plex, 4k, hevc, apple-tv, transcoding, codec]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Plex 4K Codec Compatibility (Apple TV)
4K content on YouTube is delivered in AV1 or VP9 — neither of which the Plex app on Apple TV can direct play. This forces Plex to transcode, and most home server CPUs can't transcode 4K in real time. The fix is converting to HEVC before Plex ever sees the file.
## Codec Compatibility Matrix
| Codec | Apple TV (Plex direct play) | YouTube 4K | Notes |
|---|---|---|---|
| H.264 (AVC) | ✅ | ❌ (max 1080p) | Most compatible, but no 4K |
| HEVC (H.265) | ✅ | ❌ | Best choice: 4K compatible, widely supported |
| VP9 | ❌ | ✅ | Google's royalty-free codec, forces transcode |
| AV1 | ❌ | ✅ | Best compression, requires modern hardware to decode |
**Target format: HEVC.** Direct plays on Apple TV, supports 4K/HDR, and modern hardware can encode it quickly.
## Why AV1 and VP9 Cause Problems
When Plex can't direct play a file, it transcodes it on the server. AV1 and VP9 decoding is CPU-intensive — most home server CPUs can't keep up with 4K60 in real time. Intel Quick Sync (HD 630 era) supports VP9 hardware decode but not AV1. AV1 hardware decode requires 11th-gen Intel or an RTX 30-series+ GPU.
## Batch Converting Existing Files
For files already in your Plex library, use this script to find all AV1/VP9 files and convert them to HEVC via VAAPI (Intel Quick Sync):
```bash
#!/bin/bash
VAAPI_DEV=/dev/dri/renderD128
PLEX_DIR="/plex/plex"
LOG="/root/av1_to_hevc.log"
TMPDIR="/tmp/av1_convert"
mkdir -p "$TMPDIR"
echo "=== AV1→HEVC batch started $(date) ===" | tee -a "$LOG"
find "$PLEX_DIR" -iname "*.mp4" -o -iname "*.mkv" | while IFS= read -r f; do
codec=$(mediainfo --Inform='Video;%Format%' "$f" 2>/dev/null)
[ "$codec" != "AV1" ] && [ "$codec" != "VP9" ] && continue
echo "[$(date +%H:%M:%S)] Converting: $(basename "$f")" | tee -a "$LOG"
tmp="${TMPDIR}/$(basename "${f%.*}").mp4"
ffmpeg -hide_banner -loglevel error \
-vaapi_device "$VAAPI_DEV" \
-i "$f" \
-vf 'format=nv12,hwupload' \
-c:v hevc_vaapi \
-qp 22 \
-c:a copy \
-movflags +faststart \
"$tmp"
if [ $? -eq 0 ] && [ -s "$tmp" ]; then
mv "$tmp" "${f%.*}_hevc.mp4"
rm -f "$f"
else
rm -f "$tmp"
echo " FAILED — original kept." | tee -a "$LOG"
fi
done
```
Run in a tmux session so it survives SSH disconnect:
```bash
tmux new-session -d -s av1-convert '/root/av1_to_hevc.sh'
tail -f /root/av1_to_hevc.log
```
After completion, trigger a Plex library scan to pick up the renamed files.
## Automating Future Downloads (yt-dlp)
Prevent the problem at the source with a post-download conversion hook.
### 1. Create the conversion script
Save to `/usr/local/bin/yt-dlp-hevc-convert.sh`:
```bash
#!/bin/bash
INPUT="$1"
VAAPI_DEV=/dev/dri/renderD128
LOG=/var/log/yt-dlp-convert.log
[ -z "$INPUT" ] && exit 0
[ ! -f "$INPUT" ] && exit 0
CODEC=$(mediainfo --Inform='Video;%Format%' "$INPUT" 2>/dev/null)
if [ "$CODEC" != "AV1" ] && [ "$CODEC" != "VP9" ]; then
exit 0
fi
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Converting ($CODEC): $(basename "$INPUT")" >> "$LOG"
TMPOUT="${INPUT%.*}_hevc_tmp.mp4"
ffmpeg -hide_banner -loglevel error \
-vaapi_device "$VAAPI_DEV" \
-i "$INPUT" \
-vf 'format=nv12,hwupload' \
-c:v hevc_vaapi \
-qp 22 \
-c:a copy \
-movflags +faststart \
"$TMPOUT"
if [ $? -eq 0 ] && [ -s "$TMPOUT" ]; then
mv "$TMPOUT" "${INPUT%.*}.mp4"
[ "${INPUT%.*}.mp4" != "$INPUT" ] && rm -f "$INPUT"
echo "[$(date '+%Y-%m-%d %H:%M:%S')] OK: $(basename "${INPUT%.*}.mp4")" >> "$LOG"
else
rm -f "$TMPOUT"
echo "[$(date '+%Y-%m-%d %H:%M:%S')] FAILED — original kept: $(basename "$INPUT")" >> "$LOG"
fi
```
```bash
chmod +x /usr/local/bin/yt-dlp-hevc-convert.sh
```
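The hook's rename guard (`[ "${INPUT%.*}.mp4" != "$INPUT" ] && rm -f "$INPUT"`) leans on Bash parameter expansion; a quick illustration of why it's there:

```bash
f="video.webm"
echo "${f%.*}.mp4"     # %.* strips the last extension, giving video.mp4

f="video.mp4"
# When the input is already .mp4, the rewritten name equals the original,
# so the guard must skip rm or it would delete the freshly moved output.
[ "${f%.*}.mp4" = "$f" ] && echo "same name, keep it"
```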
### 2. Configure yt-dlp
`~/.config/yt-dlp/config`:
```
--remote-components ejs:github
--format bestvideo+bestaudio
--merge-output-format mp4
--output /plex/plex/%(title)s.%(ext)s
--write-auto-subs
--embed-subs
--exec /usr/local/bin/yt-dlp-hevc-convert.sh {}
```
With this config, `yt-dlp <URL>` downloads the best available quality (including 4K AV1/VP9), then immediately converts any AV1 or VP9 output to HEVC before Plex indexes it.
> [!note] The `--format bestvideo+bestaudio` selector gets true 4K from YouTube (served as AV1 or VP9). The hook converts it to HEVC. Without the hook, using `bestvideo[ext=mp4]` would cap downloads at 1080p since YouTube only serves H.264 up to 1080p.
## Enabling Hardware Transcoding in Plex
Even with automatic conversion in place, enable hardware acceleration in Plex as a fallback for any files that slip through:
**Plex Web → Settings → Transcoder → "Use hardware acceleration when available"**
This requires Plex Pass. On Intel systems with Quick Sync, VP9 will hardware transcode even without pre-conversion. AV1 will still fall back to CPU on pre-Alder Lake hardware.
## Related
- [yt-dlp: Video Downloading](../../03-opensource/media-creative/yt-dlp.md)
- [OBS Studio Setup & Encoding](../obs/obs-studio-setup-encoding.md)

View File

@@ -0,0 +1,72 @@
---
title: Ansible SSH Timeout During dnf upgrade on Fedora Hosts
domain: troubleshooting
category: ansible
tags:
- ansible
- ssh
- fedora
- dnf
- timeout
- fleet-management
status: published
created: '2026-03-28'
updated: '2026-03-28'
---
# Ansible SSH Timeout During dnf upgrade on Fedora Hosts
## Symptom
Running `ansible-playbook update.yml` against Fedora/CentOS hosts fails with:
```
fatal: [hostname]: UNREACHABLE! => {"changed": false,
"msg": "Failed to connect to the host via ssh: Shared connection to <IP> closed."}
```
The failure occurs specifically during `ansible.builtin.dnf` tasks that upgrade all packages (`name: '*'`, `state: latest`), because the operation takes long enough for the SSH connection to drop.
## Root Cause
Without explicit SSH keepalive settings in `ansible.cfg`, OpenSSH defaults apply. Long-running tasks like full `dnf upgrade` across a fleet can exceed idle timeouts, causing the control connection to close mid-task.
## Fix
Add a `[ssh_connection]` section to `ansible.cfg`:
```ini
[ssh_connection]
ssh_args = -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -o ControlMaster=auto -o ControlPersist=60s
```
| Setting | Purpose |
|---------|---------|
| `ServerAliveInterval=30` | Send a keepalive every 30 seconds |
| `ServerAliveCountMax=10` | Allow 10 missed keepalives before disconnect (~5 min tolerance) |
| `ControlMaster=auto` | Reuse SSH connections across tasks |
| `ControlPersist=60s` | Keep the master connection open 60s after last use |
## Related Fix: do-agent Task Guard
In the same playbook run, a second failure surfaced on hosts where the `ansible.builtin.uri` task to fetch the latest `do-agent` release was **skipped** (non-RedHat hosts or hosts without do-agent installed). The registered variable existed but contained a skipped result with no `.json` attribute, causing:
```
object of type 'dict' has no attribute 'json'
```
Fix: add guards to downstream tasks that reference the URI result:
```yaml
when:
- do_agent_release is defined
- do_agent_release is not skipped
- do_agent_release.json is defined
```
## Environment
- **Controller:** macOS (MajorAir)
- **Targets:** Fedora 43 (majorlab, majormail, majorhome, majordiscord)
- **Ansible:** community edition via Homebrew
- **Committed:** `d9c6bdb` in MajorAnsible repo

View File

@@ -0,0 +1,68 @@
---
title: "Ansible: Vault Password File Not Found"
domain: troubleshooting
category: general
tags: [ansible, vault, credentials, configuration]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Ansible: Vault Password File Not Found
## Error
```
[WARNING]: Error getting vault password file (default): The vault password file /Users/majorlinux/.ansible/vault_pass was not found
[ERROR]: The vault password file /Users/majorlinux/.ansible/vault_pass was not found
```
## Cause
Ansible is configured to look for a vault password file at `~/.ansible/vault_pass`, but the file does not exist. This is typically set in `ansible.cfg` via the `vault_password_file` directive.
## Solutions
### Option 1: Remove the vault config (if you're not using Vault)
Check your `ansible.cfg` for this line and remove it if Vault is not needed:
```ini
[defaults]
vault_password_file = ~/.ansible/vault_pass
```
### Option 2: Create the vault password file
```bash
echo 'your_vault_password' > ~/.ansible/vault_pass
chmod 600 ~/.ansible/vault_pass
```
> **Security note:** Keep permissions tight (`600`) so only your user can read the file. The actual vault password is stored in Bitwarden under the "Ansible Vault Password" entry.
### Option 3: Pass the password at runtime (no file needed)
```bash
ansible-playbook test.yml --ask-vault-pass
```
## Diagnosing the Source of the Config
To find which config file is setting `vault_password_file`, run:
```bash
ansible-config dump --only-changed
```
This shows all non-default config values and their source files. Config is loaded in this order of precedence:
1. `ANSIBLE_CONFIG` environment variable
2. `./ansible.cfg` (current directory)
3. `~/.ansible.cfg`
4. `/etc/ansible/ansible.cfg`
## Related
- [Ansible Getting Started](../01-linux/shell-scripting/ansible-getting-started.md)
- Vault password is stored in Bitwarden under **"Ansible Vault Password"**
- Ansible playbooks live at `~/MajorAnsible` on MajorAir/MajorMac

View File

@@ -0,0 +1,89 @@
---
title: "Ansible Ignores ansible.cfg on WSL2 Windows Mounts"
domain: troubleshooting
category: ansible
tags: [ansible, wsl, wsl2, windows, vault, configuration]
status: published
created: 2026-04-03
updated: 2026-04-03
---
# Ansible Ignores ansible.cfg on WSL2 Windows Mounts
## Problem
Running Ansible from a repo on a Windows drive (`/mnt/c/`, `/mnt/d/`, etc.) in WSL2 silently ignores the local `ansible.cfg`. You'll see:
```
[WARNING]: Ansible is being run in a world writable directory
(/mnt/d/MajorAnsible), ignoring it as an ansible.cfg source.
```
This causes vault decryption to fail (`Attempting to decrypt but no vault secrets found`), inventory to fall back to `/etc/ansible/hosts`, and `remote_user` to reset to defaults — even though `ansible.cfg` is right there in the project directory.
## Cause
WSL2 mounts Windows NTFS drives with broad permissions (typically `0777`). Ansible refuses to load `ansible.cfg` from any world-writable directory as a security measure — a malicious user on a shared system could inject a rogue config.
This is hardcoded behavior in Ansible and cannot be overridden with a flag.
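You can confirm the trigger condition yourself: any directory whose mode has the world-write bit set (e.g. `0777`) gets rejected. A quick demonstration with a throwaway directory standing in for the Windows mount:

```bash
dir=$(mktemp -d)        # stand-in for /mnt/d/MajorAnsible
chmod 0777 "$dir"       # mimic WSL2's default NTFS mount permissions
stat -c '%a' "$dir"     # prints 777
# The world-write bit (o+w) is exactly what Ansible objects to:
find "$dir" -maxdepth 0 -perm -0002 -printf 'world-writable\n'
rmdir "$dir"
```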
## Solutions
### Option 1: Environment Variables (Recommended)
Export the settings that `ansible.cfg` would normally provide. Add to `~/.bashrc`:
```bash
export ANSIBLE_VAULT_PASSWORD_FILE=~/.ansible/vault_pass
```
Other common settings you may need:
```bash
export ANSIBLE_REMOTE_USER=root
export ANSIBLE_INVENTORY=/mnt/d/MajorAnsible/inventory/inventory.yml
```
### Option 2: Pass Flags Explicitly
```bash
ansible-playbook -i inventory/ playbook.yml --vault-password-file ~/.ansible/vault_pass
```
This works but is tedious for daily use.
### Option 3: Clone to a Native Linux Path
Clone the repo inside the WSL2 filesystem instead of on the Windows mount:
```bash
git clone https://git.example.com/repo.git ~/MajorAnsible
```
Native WSL2 paths (`/home/user/...`) have proper Linux permissions, so `ansible.cfg` loads normally. The tradeoff is that Windows tools can't easily access the repo.
### Option 4: Fix Mount Permissions (Not Recommended)
You can change WSL2 mount permissions via `/etc/wsl.conf`:
```ini
[automount]
options = "metadata,umask=022"
```
This requires a `wsl --shutdown` and remount. It may break other Windows-Linux interop workflows and affects all mounted drives.
## Diagnosis
To confirm whether Ansible is loading your config:
```bash
ansible --version
```
Look for the `config file` line. If it shows `None` instead of your project's `ansible.cfg`, the config is being ignored.
## Related
- [Ansible: Vault Password File Not Found](ansible-vault-password-file-missing.md) — general vault password troubleshooting
- [Ansible Docs: Avoiding Security Risks with ansible.cfg](https://docs.ansible.com/ansible/latest/reference_appendices/config.html#cfg-in-world-writable-dir)

View File

@@ -1,3 +1,12 @@
---
title: "Docker & Caddy Recovery After Reboot (Fedora + SELinux)"
domain: troubleshooting
category: general
tags: [docker, caddy, selinux, fedora, reboot, majorlab]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Docker & Caddy Recovery After Reboot (Fedora + SELinux)
## 🛑 Problem

View File

@@ -0,0 +1,84 @@
---
title: "n8n Behind Reverse Proxy: X-Forwarded-For Trust Fix"
domain: troubleshooting
category: docker
tags: [n8n, caddy, reverse-proxy, docker, express]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# n8n Behind Reverse Proxy: X-Forwarded-For Trust Fix
## The Problem
When running n8n behind a reverse proxy (Caddy, Nginx, Traefik), the logs fill with:
```
ValidationError: The 'X-Forwarded-For' header is set but the Express 'trust proxy' setting is false (default).
This could indicate a misconfiguration which would prevent express-rate-limit from accurately identifying users.
```
This means n8n's Express rate limiter sees every request as coming from the proxy's internal IP, not the real client. Rate limiting and audit logging both break.
## Why `N8N_TRUST_PROXY=true` Isn't Enough
Older n8n versions accepted `N8N_TRUST_PROXY=true` to trust proxy headers. Newer versions (1.x+) use Express's `trust proxy` setting, which requires knowing *how many* proxy hops to trust. Without `N8N_PROXY_HOPS`, Express ignores the `X-Forwarded-For` header entirely even if `N8N_TRUST_PROXY=true` is set.
## The Fix
Add `N8N_PROXY_HOPS=1` to your n8n environment:
### Docker Compose
```yaml
services:
n8n:
image: docker.n8n.io/n8nio/n8n:latest
environment:
- N8N_HOST=n8n.example.com
- N8N_PROTOCOL=https
- N8N_TRUST_PROXY=true
- N8N_PROXY_HOPS=1 # <-- Add this
```
Set `N8N_PROXY_HOPS` to the number of reverse proxies between the client and n8n:
- **1** — single proxy (Caddy/Nginx directly in front of n8n)
- **2** — two proxies (e.g., Cloudflare → Caddy → n8n)
### Recreate the Container
```bash
cd /opt/n8n # or wherever your compose file lives
docker compose down
docker compose up -d
```
If you get a container name conflict:
```bash
docker rm -f n8n-n8n-1
docker compose up -d
```
## Verifying the Fix
Check the logs after restart:
```bash
docker logs --since 5m n8n-n8n-1 2>&1 | grep -i "forwarded\|proxy\|ValidationError"
```
If the fix worked, there should be zero `ValidationError` lines. A clean startup looks like:
```
n8n ready on ::, port 5678
Version: 2.14.2
Editor is now accessible via:
https://n8n.example.com
```
## Key Notes
- Keep `N8N_TRUST_PROXY=true` alongside `N8N_PROXY_HOPS` — both are needed.
- The `mount of type volume should not define bind option` warning from Docker Compose when using `:z` (SELinux) volume labels is cosmetic and can be ignored.
- If n8n reports "Last session crashed" after a `docker rm -f` recreation, this is expected — the old container was force-killed, so n8n sees it as a crash. It recovers automatically.

View File

@@ -0,0 +1,82 @@
---
title: "Nextcloud AIO Container Unhealthy for 20 Hours After Nightly Update"
domain: troubleshooting
category: docker
tags: [nextcloud, docker, healthcheck, netdata, php-fpm, aio]
status: published
created: 2026-03-28
updated: 2026-03-28
---
# Nextcloud AIO Container Unhealthy for 20 Hours After Nightly Update
## Symptom
Netdata alert `docker_nextcloud_unhealthy` fired on majorlab and stayed in Warning for 20 hours. The `nextcloud-aio-nextcloud` container was running but its Docker healthcheck kept failing. No user-facing errors were visible in `nextcloud.log`.
## Investigation
### Timeline (2026-03-27, all UTC)
| Time | Event |
|---|---|
| 04:00 | Nightly backup script started, mastercontainer update kicked off |
| 04:03 | `nextcloud-aio-nextcloud` container recreated |
| 04:05 | Backup finished |
| 07:25 | Mastercontainer logged "Initial startup of Nextcloud All-in-One complete!" (3h20m delay) |
| 10:22 | First entry in `nextcloud.log` (deprecation warnings only — no errors) |
| 04:00 (Mar 28) | Next nightly backup replaced the container; new container came up healthy in ~25 minutes |
### Key findings
- **No image update** — the container image dated to Feb 26, so this was not caused by a version change.
- **No app-level errors** — `nextcloud.log` contained only `files_rightclick` deprecation warnings (level 3). No level 2/4 entries.
- **PHP-FPM never stabilized** — the healthcheck (`/healthcheck.sh`) tests `nc -z 127.0.0.1 9000` (PHP-FPM). The container was running but FPM wasn't responding to the port check.
- **6-hour log gap** — no `nextcloud.log` entries between container start (04:03) and first log (10:22), suggesting the AIO init scripts (occ upgrade, app updates, cron jobs) ran for hours before the app became partially responsive.
- **RestartCount: 0** — the container never restarted on its own. It sat there unhealthy for the full 20 hours.
- **Disk space fine** — 40% used on `/`.
### Healthcheck details
```bash
#!/bin/bash
# /healthcheck.sh inside nextcloud-aio-nextcloud
nc -z "$POSTGRES_HOST" "$POSTGRES_PORT" || exit 0 # postgres down = pass (graceful)
nc -z 127.0.0.1 9000 || exit 1 # PHP-FPM down = fail
```
If PostgreSQL is unreachable, the check passes (exits 0). The only failure path is PHP-FPM not listening on port 9000.
## Root Cause
The AIO nightly update cycle recreated the container, but the startup/migration process hung or ran extremely long, preventing PHP-FPM from fully initializing. The container sat in this state for 20 hours with no self-recovery mechanism until the next nightly cycle replaced it.
The exact migration or occ command that stalled could not be confirmed — the old container's entrypoint logs were lost when the Mar 28 backup cycle replaced it.
## Fix
Two changes deployed on 2026-03-28:
### 1. Dedicated Netdata alarm with lenient window
Split `nextcloud-aio-nextcloud` into its own Netdata alarm (`docker_nextcloud_unhealthy`) with a 10-minute lookup and 10-minute delay, separate from the general container alarm. See [Tuning Netdata Docker Health Alarms](../../02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md).
### 2. Watchdog cron for auto-restart
Deployed `/etc/cron.d/nextcloud-health-watchdog` on majorlab:
```bash
*/15 * * * * root docker inspect --format={{.State.Health.Status}} nextcloud-aio-nextcloud 2>/dev/null | grep -q unhealthy && [ "$(docker inspect --format={{.State.StartedAt}} nextcloud-aio-nextcloud | xargs -I{} date -d {} +\%s)" -lt "$(date -d "1 hour ago" +\%s)" ] && docker restart nextcloud-aio-nextcloud && logger -t nextcloud-watchdog "Restarted unhealthy nextcloud-aio-nextcloud"
```
- Checks every 15 minutes
- Only restarts if the container has been running >1 hour (avoids interfering with normal startup)
- Logs to syslog: `journalctl -t nextcloud-watchdog`
This caps future unhealthy outages at ~1 hour instead of persisting until the next nightly cycle.
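The ">1 hour" guard in the one-liner is plain epoch arithmetic, shown here with the `docker inspect` timestamp stubbed out (the stub date is illustrative):

```bash
started="2001-01-01T00:00:00Z"          # stand-in for {{.State.StartedAt}}
started_epoch=$(date -d "$started" +%s)
cutoff_epoch=$(date -d '1 hour ago' +%s)
# Restart is allowed only when the container started before the cutoff,
# i.e. it has already been running for more than an hour.
if [ "$started_epoch" -lt "$cutoff_epoch" ]; then
  echo "old enough to restart"
fi
```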
## See Also
- [Tuning Netdata Docker Health Alarms](../../02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md)
- [Debugging Broken Docker Containers](../../02-selfhosting/docker/debugging-broken-docker-containers.md)
- [Docker Healthchecks](../../02-selfhosting/docker/docker-healthchecks.md)

View File

@@ -0,0 +1,141 @@
---
title: "Fedora Networking & Kernel Troubleshooting"
domain: troubleshooting
category: networking
tags: [fedora, networking, kernel, grub, nmcli, troubleshooting]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Fedora Networking & Kernel Troubleshooting
Two common issues on the MajorsHouse Fedora fleet (majorlab, majorhome): network connectivity dropping after updates or reboots, and kernel upgrades that break things. These are the quick fixes and the deeper recovery paths.
## Networking Drops After Reboot or Update
### Quick Fix
If a Fedora box loses network connectivity after a reboot or `dnf upgrade`, NetworkManager may not have brought the connection back up automatically:
```bash
nmcli connection up "Wired connection 1"
```
This re-activates the default wired connection. If the connection name differs on your system:
```bash
# List all known connections
nmcli connection show
# Bring up by name
nmcli connection up "your-connection-name"
```
### Why This Happens
- NetworkManager may not auto-activate a connection if it was configured as manual or if the profile was reset during an upgrade.
- Kernel updates can temporarily break network drivers, especially on hardware with out-of-tree modules. The new kernel loads, the old driver doesn't match, and the NIC doesn't come up.
- On headless servers (like majorlab and majorhome), there's no desktop network applet to reconnect — it stays down until you fix it via console or IPMI.
### Make It Persistent
Ensure the connection auto-activates on boot:
```bash
# Check current autoconnect setting
nmcli connection show "Wired connection 1" | grep autoconnect
# Enable if not set
nmcli connection modify "Wired connection 1" connection.autoconnect yes
```
## Kernel Issues — Booting an Older Kernel
When a new kernel causes problems (network, storage, GPU, or boot failures), boot into the previous working kernel via GRUB.
### At the GRUB Menu
1. Reboot the machine.
2. Hold **Shift** (BIOS) or press **Esc** (UEFI) to show the GRUB menu.
3. Select **Advanced options** or an older kernel entry.
4. Boot into the working kernel.
### From the Command Line (Headless)
If you have console access but no GRUB menu:
```bash
# List installed kernels
sudo grubby --info=ALL | grep -E "^(index|kernel|title)"
# Set the previous kernel as default (by index)
sudo grubby --set-default-index=1
# Or set by kernel path
sudo grubby --set-default=/boot/vmlinuz-6.19.9-200.fc43.x86_64
# Reboot into it
sudo reboot
```
### Remove a Bad Kernel
Once you've confirmed the older kernel works:
```bash
# Remove the broken kernel
sudo dnf remove kernel-core-6.19.10-200.fc43.x86_64
# Verify GRUB updated
sudo grubby --default-kernel
```
### Prevent Auto-Updates From Reinstalling It
If the same kernel version keeps coming back and keeps breaking:
```bash
# Temporarily exclude it from updates
sudo dnf upgrade --exclude=kernel*
# Or pin in dnf.conf
echo "excludepkgs=kernel*" | sudo tee -a /etc/dnf/dnf.conf
```
Remove the exclusion once a fixed kernel version is released.
## Quick Diagnostic Commands
```bash
# Check current kernel
uname -r
# Check network status
nmcli general status
nmcli device status
ip addr show
# Check if NetworkManager is running
systemctl status NetworkManager
# Check recent kernel/network errors
journalctl -b -p err | grep -iE "kernel|network|eth|ens|nm"
# Check which kernels are installed
rpm -qa kernel-core | sort -V
```
## Gotchas & Notes
- **Always have console access** (IPMI, physical KVM, or Proxmox console) for headless servers before doing kernel updates. If the new kernel breaks networking, SSH won't save you.
- **Fedora keeps 3 kernels by default** (`installonly_limit=3` in `/etc/dnf/dnf.conf`). If you need more fallback options, increase this number before upgrading.
- **Test kernel updates on one server first.** Update majorlab, confirm it survives a reboot, then update majorhome.
- **`grubby` is Fedora's preferred tool** for managing GRUB entries. Avoid editing `grub.cfg` directly.
Reference: [Fedora — Working with the GRUB 2 Boot Loader](https://docs.fedoraproject.org/en-US/fedora/latest/system-administrators-guide/kernel-module-driver-configuration/Working_with_the_GRUB_2_Boot_Loader/)
## See Also
- [docker-caddy-selinux-post-reboot-recovery](docker-caddy-selinux-post-reboot-recovery.md)
- [managing-linux-services-systemd-ansible](../01-linux/process-management/managing-linux-services-systemd-ansible.md)

View File

@@ -0,0 +1,56 @@
---
title: "Gemini CLI: Manual Update Guide"
domain: troubleshooting
category: general
tags: [gemini, cli, npm, update, google]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# 🛠️ Gemini CLI: Manual Update Guide
If the automatic update fails or you need to force a specific version of the Gemini CLI, use these steps.
## 🔴 Symptom: Automatic Update Failed
You may see an error message like:
`✕ Automatic update failed. Please try updating manually`
## 🟢 Manual Update Procedure
### 1. Verify Current Version
Check the version currently installed on your system:
```bash
gemini --version
```
### 2. Check Latest Version
Query the npm registry for the latest available version:
```bash
npm show @google/gemini-cli version
```
### 3. Perform Manual Update
Use `npm` with `sudo` to update the global package:
```bash
sudo npm install -g @google/gemini-cli@latest
```
### 4. Confirm Update
Verify that the new version is active:
```bash
gemini --version
```
## 🛠️ Troubleshooting Update Failures
### Permissions Issues
If you encounter `EACCES` errors, either use `sudo` as shown above or set a user-writable global prefix (`npm config set prefix ~/.npm-global`, then add `~/.npm-global/bin` to `PATH`) so global installs no longer need root.
### Registry Connectivity
If `npm` cannot reach the registry, check your internet connection or any local firewall/proxy settings.
### Cache Issues
If the version doesn't update, try clearing the npm cache:
```bash
npm cache clean --force
```

View File

@@ -0,0 +1,93 @@
---
title: "Gitea Actions Runner: Boot Race Condition Fix"
domain: troubleshooting
category: general
tags: [gitea, systemd, boot, dns, ci-cd]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Gitea Actions Runner: Boot Race Condition Fix
If your `gitea-runner` (act_runner) service fails to start on boot — crash-looping and eventually hitting systemd's restart rate limit — the service is likely starting before DNS is available.
## Symptoms
- `gitea-runner.service` enters a crash loop on boot
- `journalctl -u gitea-runner` shows connection/DNS errors on startup:
```
dial tcp: lookup git.example.com: no such host
```
or similar resolution failures
- Service eventually stops retrying (systemd restart rate limit reached)
- `systemctl status gitea-runner` shows `(Result: start-limit-hit)` after reboot
- Service works fine if started manually after boot completes
## Why It Happens
`After=network.target` only guarantees that the network **interfaces are configured** — not that DNS resolution is functional. systemd-resolved (or your local resolver) starts slightly later. `act_runner` tries to connect to the Gitea instance by hostname on startup, the DNS lookup fails, and the process exits.
With `Restart=always` and no `RestartSec`, systemd restarts the service almost immediately (the default delay is 100 ms). After enough rapid failures within the default burst window (`StartLimitBurst=5` starts in `StartLimitIntervalSec=10s`), systemd hits the rate limit and stops restarting.
## Fix
### 1. Update the Service File
Edit `/etc/systemd/system/gitea-runner.service`:
```ini
[Unit]
Description=Gitea Actions Runner
After=network-online.target
Wants=network-online.target
[Service]
User=deploy
WorkingDirectory=/opt/gitea-runner
ExecStart=/opt/gitea-runner/act_runner daemon
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
Key changes:
- `After=network-online.target` + `Wants=network-online.target` — waits for full network stack including DNS
- `RestartSec=10` — adds a 10-second delay between restart attempts, preventing rapid failure bursts from hitting the rate limit
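If you'd rather raise the allowance than rely on the delay alone, the burst window itself is tunable per unit with standard `[Unit]` directives (values here are illustrative):

```ini
[Unit]
StartLimitIntervalSec=300
StartLimitBurst=10
```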
### 2. Add a Local /etc/hosts Entry (Optional but Recommended)
If your Gitea instance is on the same local network or reachable via Tailscale, add an entry to `/etc/hosts` so act_runner can resolve it without depending on external DNS:
```
127.0.0.1 git.example.com
```
Replace `git.example.com` with your Gitea hostname and the IP with the correct local address. This makes resolution instantaneous and eliminates the DNS dependency entirely for startup.
### 3. Reload and Restart
```bash
sudo systemctl daemon-reload
sudo systemctl restart gitea-runner
sudo systemctl status gitea-runner
```
Verify it shows `active (running)` and stays that way. Then reboot and confirm it comes up automatically.
## Why `network-online.target` and Not `network.target`
| Target | What it guarantees |
|---|---|
| `network.target` | Network interfaces are configured (IP assigned) |
| `network-online.target` | Network is fully operational (DNS resolvers reachable) |
Services that need to make outbound network connections (especially DNS lookups) on startup should always use `network-online.target`. This includes: mail servers, monitoring agents, CI runners, anything that connects to an external host by name.
> [!note] `network-online.target` can add a few seconds to boot time since systemd waits for the network stack to fully initialize. It only has teeth if a wait-online service (e.g. `NetworkManager-wait-online.service`) is enabled to back it. For server contexts this is nearly always the right tradeoff.
## Related
- [Managing Linux Services with systemd](../01-linux/process-management/managing-linux-services-systemd-ansible.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)

View File

@@ -0,0 +1,63 @@
---
title: "Qwen2.5-14B OOM on RTX 3080 Ti (12GB)"
domain: troubleshooting
category: gpu-display
tags: [gpu, vram, oom, qwen, cuda, fine-tuning]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Qwen2.5-14B OOM on RTX 3080 Ti (12GB)
## Problem
When attempting to run or fine-tune **Qwen2.5-14B** on an NVIDIA RTX 3080 Ti with 12GB of VRAM, the process fails with an Out of Memory (OOM) error:
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate X GiB (GPU 0; 12.00 GiB total capacity; Y GiB already allocated; Z GiB free; ...)
```
The 12GB VRAM limit is hit during the initial model load or immediately upon starting the first training step.
## Root Causes
1. **Model Size:** A 14B parameter model in FP16/BF16 requires ~28GB of VRAM just for the weights.
2. **Context Length:** High context lengths (e.g., 4096+) significantly increase VRAM usage during training.
3. **Training Overhead:** Even with QLoRA (4-bit quantization), the overhead of gradients, optimizer states, and activations can exceed 12GB for a 14B model.
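The weights figure in (1) is simple arithmetic: parameters × bytes per parameter. A quick back-of-the-envelope (decimal GB, weights only, ignoring activations and KV cache):

```bash
awk 'BEGIN { printf "FP16 14B:  %.1f GB\n", 14e9 * 2   / 1e9 }'  # 28.0, over twice the card
awk 'BEGIN { printf "4-bit 14B: %.1f GB\n", 14e9 * 0.5 / 1e9 }'  # 7.0, fits with little headroom
awk 'BEGIN { printf "4-bit 7B:  %.1f GB\n",  7e9 * 0.5 / 1e9 }'  # 3.5, comfortable
```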
---
## Solutions
### 1. Pivot to a 7B Model (Recommended)
For a 12GB GPU, a 7B parameter model (like **Qwen2.5-7B-Instruct**) is the sweet spot. It provides excellent performance while leaving enough VRAM for high context lengths and larger batch sizes.
- **VRAM Usage (7B QLoRA):** ~6-8GB
- **Pros:** Stable, fast, supports long context.
- **Cons:** Slightly lower reasoning capability than 14B.
### 2. Aggressive Quantization
If you MUST run 14B, use 4-bit quantization (GGUF or EXL2) for inference only. Training 14B on 12GB is not reliably possible even with extreme offloading.
```bash
# Example Ollama run (uses 4-bit quantization by default)
ollama run qwen2.5:14b
```
### 3. Training Optimizations (if attempting 14B)
If you have no choice but to try 14B training:
- Set `max_seq_length` to 512 or 1024.
- Use `Unsloth` (it is highly memory-efficient).
- Enable `gradient_checkpointing`.
- Set `per_device_train_batch_size = 1`.
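To see how these knobs interact, here is a rough, hypothetical feasibility check. The constants are invented ballpark heuristics for QLoRA, not measurements — use it only to build intuition about which settings move you toward or away from the 12 GB ceiling:

```python
def qlora_vram_estimate_gb(params_b: float, seq_len: int, batch_size: int,
                           grad_checkpointing: bool) -> float:
    """Crude QLoRA VRAM heuristic: 4-bit weights + fixed adapter/optimizer
    overhead + an activation term that scales with batch size and seq length.
    All constants are illustrative guesses."""
    weights = params_b * 0.5                  # 4-bit quantized base weights (GB)
    overhead = 2.0                            # LoRA adapters, optimizer, CUDA context
    act_per_token = params_b * 0.0005         # activation GB per token (ballpark)
    activations = batch_size * seq_len * act_per_token
    if grad_checkpointing:
        activations *= 0.3                    # checkpointing trades compute for memory
    return weights + overhead + activations

# 14B at seq 1024, batch 1, checkpointing on: just under the 12 GB line
print(round(qlora_vram_estimate_gb(14, 1024, 1, True), 1))
# Same settings at seq 2048: over the line — context length dominates
print(round(qlora_vram_estimate_gb(14, 2048, 1, True), 1))
# 7B at seq 1024: consistent with the ~6-8 GB figure above
print(round(qlora_vram_estimate_gb(7, 1024, 1, True), 1))
```

This is why the bullets above all pull in the same direction: shorter sequences, batch size 1, and gradient checkpointing each shrink the activation term, which is the only term you can actually control once the model size is fixed.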
---
## Maintenance
Keep your NVIDIA drivers and CUDA toolkit updated. On Windows (MajorRig), ensure WSL2 has sufficient memory allocation in `.wslconfig`.
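For reference, a minimal `.wslconfig` (lives at `C:\Users\<username>\.wslconfig` on the Windows side) that raises the WSL2 memory ceiling — the values here are examples, size them to your machine:

```ini
[wsl2]
memory=24GB   # cap WSL2 RAM (default is typically 50% of host RAM)
swap=8GB
```

Run `wsl --shutdown` afterwards so the new limits take effect on the next WSL2 start.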
---

View File

@@ -1,3 +1,7 @@
---
created: 2026-03-15T06:37
updated: 2026-04-08
---
# 🔧 General Troubleshooting
Practical fixes for common Linux, networking, and application problems.
@@ -8,13 +12,34 @@ Practical fixes for common Linux, networking, and application problems.
## 🌐 Networking & Web
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md)
- [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](networking/fail2ban-imap-self-ban-mail-client.md)
- [firewalld: Mail Ports Wiped After Reload](networking/firewalld-mail-ports-reset.md)
- [Tailscale SSH: Unexpected Re-Authentication Prompt](networking/tailscale-ssh-reauth-prompt.md)
- [Windows OpenSSH: WSL Default Shell Breaks Remote Commands](networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md)
- [ISP SNI Filtering & Caddy](isp-sni-filtering-caddy.md)
- [yt-dlp YouTube JS Challenge Fix](yt-dlp-fedora-js-challenge.md)
- [wget/curl: URLs with Special Characters Fail in Bash](wget-url-special-characters.md)
## ⚙️ Ansible & Fleet Management
- [SSH Timeout During dnf upgrade on Fedora Hosts](ansible-ssh-timeout-dnf-upgrade.md)
- [Vault Password File Missing](ansible-vault-password-file-missing.md)
- [ansible.cfg Ignored on WSL2 Windows Mounts](ansible-wsl2-world-writable-mount-ignores-cfg.md)
## 📦 Docker & Systems
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)
- [Gitea Actions Runner: Boot Race Condition Fix](gitea-runner-boot-race-network-target.md)
- [Systemd Session Scope Fails at Login (`session-cN.scope`)](systemd/session-scope-failure-at-login.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
## 🔒 SELinux
- [SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)](selinux-dovecot-vmail-context.md)
## 💾 Storage
- [mdadm RAID Recovery After USB Hub Disconnect](storage/mdadm-usb-hub-disconnect-recovery.md)
## 📝 Application Specific
- [Obsidian Vault Recovery — Loading Cache Hang](obsidian-cache-hang-recovery.md)
- [Gemini CLI Manual Update](gemini-cli-manual-update.md)
## 🤖 AI / Local LLM
- [Ollama Drops Off Tailscale When Mac Sleeps](ollama-macos-sleep-tailscale-disconnect.md)
- [Windows OpenSSH Server (sshd) Stops After Reboot](networking/windows-sshd-stops-after-reboot.md)

View File

@@ -1,3 +1,12 @@
---
title: "ISP SNI Filtering & Caddy Troubleshooting"
domain: troubleshooting
category: general
tags: [isp, sni, caddy, tls, dns, cloudflare]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# ISP SNI Filtering & Caddy Troubleshooting
## 🛑 Problem

View File

@@ -0,0 +1,95 @@
---
title: "macOS Repeating Alert Tone from Mirrored iPhone Notification"
domain: troubleshooting
category: general
tags: [macos, iphone, notifications, continuity, audio]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# macOS Repeating Alert Tone from Mirrored iPhone Notification
## Overview
On macOS, an unacknowledged iPhone notification can be mirrored to a Mac via iPhone Mirroring or Continuity and loop indefinitely — even after the alert is dismissed on the iPhone. The sound plays through the Mac's built-in speakers with no visible notification banner.
## Symptoms
- Repeating alarm tone coming from Mac speakers at regular intervals
- No visible notification in Notification Center or as a banner
- Sound starts when the Mac is opened or woken
- iPhone is silent
- iPhone not connected via Bluetooth audio
## Root Cause
macOS's `ToneLibrary` framework (`TLAlertQueuePlayerController`) loops iPhone alert ringtones via `NotificationCenter` when a mirrored notification has not been dismissed on the Mac side. The iPhone side may show the alert as acknowledged, but the Mac maintains its own notification state independently.
The sound file being looped will be an iPhone ringtone from:
```
/System/Library/PrivateFrameworks/ToneLibrary.framework/Resources/Ringtones/
```
## Diagnosis
To confirm this is the cause, run:
```bash
log stream --predicate 'process == "NotificationCenter"' --style compact
```
Wait for the sound to fire. Look for a line like:
```
URL = file:///System/Library/PrivateFrameworks/ToneLibrary.framework/Resources/Ringtones/<name>.m4r
```
If `TLAlertQueuePlayerController` appears in the output alongside a `.m4r` ringtone URL, this is your issue.
### Full Diagnostic Sequence
If the above isn't conclusive, work through these in order:
**1. Broad scan:**
```bash
log stream --predicate 'eventMessage contains "sound" OR eventMessage contains "alert" OR eventMessage contains "notification"' --style compact
```
**2. CoreAudio scan (catches audio playback):**
```bash
log stream --predicate 'subsystem == "com.apple.coreaudio" OR eventMessage contains "AudioSession" OR eventMessage contains "AVAudio"' --style compact
```
Look for `NotificationCenter` with `contentType = 'soun'` and `Route = built-in speakers`.
**3. Confirm with NotificationCenter filter:**
```bash
log stream --predicate 'process == "NotificationCenter"' --style compact
```
## Fix
### Immediate
Kill and restart NotificationCenter:
```bash
killall -9 NotificationCenter
```
macOS will relaunch it automatically. The looping sound will stop immediately.
### Proper Dismissal
Open Notification Center on the Mac (click the clock in the menu bar) and dismiss any queued notifications from the offending app. If the source notification is still pending on the iPhone, dismiss it there as well.
## Prevention
### Option A — Disable sound for the app on Mac
Go to **System Settings → Notifications → [App Name]** and either:
- Turn off **Play sound for notifications**, or
- Turn off notifications entirely for the app on the Mac
The iPhone will still alert normally.
### Option B — Dismiss on both devices
When a high-priority alert fires on iPhone, check both the iPhone and the Mac's Notification Center to ensure both sides are cleared.
## Notes
- This behaviour can be triggered by any iPhone app whose notifications are mirrored to the Mac via Continuity or iPhone Mirroring, not just apps with alarm-style alerts.
- The Mac maintains its own notification state independently of the iPhone — dismissing on one device does not guarantee dismissal on the other.
- Apps with persistent or repeating alert styles (such as health monitors, timers, or messaging apps) are most likely to trigger this issue.
- Unrelated: Ivory (Mastodon client) may show excessive badge update calls in logs on startup — this is a known Ivory bug and not related to audio playback.
## Tags
`macos` `notifications` `iphone-mirroring` `continuity` `tonelibrary` `notificationcenter` `diagnostic` `audio`
## See Also
- 2026-03-30-MajorAir-CGM-Alert-Loop (diagnostic journal entry)

View File

@@ -1,8 +1,16 @@
---
title: "MajorWiki Setup & Publishing Pipeline"
title: MajorWiki Setup & Publishing Pipeline
domain: troubleshooting
tags: [mkdocs, obsidian, gitea, docker, self-hosting]
date: 2026-03-11
category: general
tags:
- mkdocs
- obsidian
- gitea
- docker
- self-hosting
status: published
created: 2026-03-11
updated: 2026-04-07T10:48
---
# MajorWiki Setup & Publishing Pipeline
@@ -119,3 +127,20 @@ The webhook runs as a systemd service so it survives reboots:
systemctl status majwiki-webhook
systemctl restart majwiki-webhook
```
---
*Updated 2026-03-13: Obsidian Git plugin dropped. See canonical workflow below.*
## Canonical Publishing Workflow
The Obsidian Git plugin was evaluated but dropped — too convoluted for a simple push. Manual git from the terminal is the canonical workflow.
```bash
cd ~/Documents/MajorVault
git add 30-Areas/MajorWiki/
git commit -m "wiki: describe your changes"
git push
```
From there: Gitea receives the push → fires webhook → majorlab pulls → MkDocs rebuilds → `notes.majorshouse.com` updates.

View File

@@ -1,3 +1,12 @@
---
title: "Mail Client Stops Receiving: Fail2ban IMAP Self-Ban"
domain: troubleshooting
category: networking
tags: [fail2ban, imap, dovecot, email, self-ban]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Mail Client Stops Receiving: Fail2ban IMAP Self-Ban
## 🛑 Problem

View File

@@ -0,0 +1,195 @@
---
title: "Apache Outage: Fail2ban Self-Ban + Missing iptables Rules"
domain: troubleshooting
category: networking
tags: [fail2ban, apache, iptables, self-ban, outage]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Apache Outage: Fail2ban Self-Ban + Missing iptables Rules
## 🛑 Problem
A web server running Apache2 becomes completely unreachable (`ERR_CONNECTION_TIMED_OUT`) despite Apache running normally. SSH access via Tailscale is unaffected.
---
## 🔍 Diagnosis
### Step 1 — Confirm Apache is running
```bash
sudo systemctl status apache2
```
If Apache is `active (running)`, the problem is at the firewall layer, not the application.
---
### Step 2 — Test the public IP directly
```bash
curl -I --max-time 5 http://<PUBLIC_IP>
```
A **timeout** means traffic is being dropped by the firewall. A **connection refused** means Apache is down.
---
### Step 3 — Check the iptables INPUT chain
```bash
sudo iptables -L INPUT -n -v
```
Look for ACCEPT rules on ports 80 and 443. If they're missing and the chain policy is `DROP`, HTTP/HTTPS traffic is being silently dropped.
**Example of broken state:**
```
Chain INPUT (policy DROP)
ACCEPT tcp -- lo * ... # loopback only
ACCEPT tcp -- tailscale0 * ... tcp dpt:22
# no rules for port 80 or 443
```
---
### Step 4 — Check the nftables ruleset for Fail2ban
```bash
sudo nft list tables
```
Look for `table inet f2b-table` — this is Fail2ban's nftables table. It operates at **priority `filter - 1`**, meaning it is evaluated *before* the main iptables INPUT chain.
```bash
sudo nft list ruleset | grep -A 10 'f2b-table'
```
Fail2ban rejects banned IPs with rules like:
```
tcp dport { 80, 443 } ip saddr @addr-set-wordpress-hard reject with icmp port-unreachable
```
A banned admin IP will be rejected here regardless of any ACCEPT rules downstream.
---
### Step 5 — Check if your IP is banned
```bash
for jail in $(sudo fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
echo "=== $jail ==="; sudo fail2ban-client get $jail banip | tr ',' '\n' | grep <YOUR_IP>
done
```
---
## ✅ Solution
### Fix 1 — Add missing iptables ACCEPT rules for HTTP/HTTPS
If ports 80/443 are absent from the INPUT chain:
```bash
sudo iptables -I INPUT -i eth0 -p tcp --dport 80 -j ACCEPT
sudo iptables -I INPUT -i eth0 -p tcp --dport 443 -j ACCEPT
```
Persist the rules:
```bash
sudo netfilter-persistent save
```
If `netfilter-persistent` is not installed:
```bash
sudo apt install -y iptables-persistent
sudo netfilter-persistent save
```
---
### Fix 2 — Unban your IP from all Fail2ban jails
```bash
for jail in $(sudo fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
sudo fail2ban-client set $jail unbanip <YOUR_IP> 2>/dev/null && echo "Unbanned from $jail"
done
```
---
### Fix 3 — Add your IP to Fail2ban's ignore list
Edit `/etc/fail2ban/jail.local`:
```bash
sudo nano /etc/fail2ban/jail.local
```
Add or update the `[DEFAULT]` section:
```ini
[DEFAULT]
ignoreip = 127.0.0.1/8 ::1 <YOUR_IP>
```
Restart Fail2ban:
```bash
sudo systemctl restart fail2ban
```
---
## 🔁 Why This Happens
| Issue | Root Cause |
|---|---|
| Missing port 80/443 rules | iptables INPUT chain left incomplete after a manual firewall rework (e.g., SSH lockdown) |
| Still blocked after adding iptables rules | Fail2ban uses a separate nftables table at higher priority — iptables ACCEPT rules are never reached for banned IPs |
| Admin IP gets banned | Automated WordPress/Apache probes trigger Fail2ban jails against the admin's own IP |
---
## ⚠️ Key Architecture Note
On servers running both iptables and Fail2ban, the evaluation order is:
1. **`inet f2b-table`** (nftables, priority `filter - 1`) — Fail2ban ban sets; evaluated first
2. **`ip filter` INPUT chain** (iptables/nftables, policy DROP) — explicit ACCEPT rules
3. **UFW chains** — IP-specific rules; evaluated last
A banned IP is stopped at step 1 and never reaches the ACCEPT rules in step 2. Always check Fail2ban *after* confirming iptables looks correct.
---
## 🔎 Quick Diagnostic Commands
```bash
# Check Apache
sudo systemctl status apache2
# Test public connectivity
curl -I --max-time 5 http://<PUBLIC_IP>
# Check iptables INPUT chain
sudo iptables -L INPUT -n -v
# List nftables tables (look for inet f2b-table)
sudo nft list tables
# Check Fail2ban jail status
sudo fail2ban-client status
# Check a specific jail's banned IPs
sudo fail2ban-client status wordpress-hard
# Unban an IP from all jails
for jail in $(sudo fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
sudo fail2ban-client set $jail unbanip <YOUR_IP> 2>/dev/null && echo "Unbanned from $jail"
done
```

View File

@@ -0,0 +1,167 @@
---
title: "Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS"
domain: troubleshooting
category: networking
tags: [fail2ban, ufw, nftables, vps, performance]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS
## 🛑 Problem
A small VPS (12 GB RAM) running Fail2ban with permanent bans (`bantime = -1`) gradually accumulates thousands of UFW DENY rules or nftables entries. Over time this causes:
- High memory usage from Fail2ban (100+ MB RSS)
- Bloated nftables ruleset (30k+ rules) — every incoming packet must traverse the full list
- Netdata alerts flapping on RAM/swap thresholds
- Degraded packet processing performance
---
## 🔍 Diagnosis
### Step 1 — Check Fail2ban memory and thread count
```bash
grep -E "VmRSS|VmSwap|Threads" /proc/$(pgrep -ox fail2ban-server)/status
```
On a small VPS, Fail2ban RSS over 80 MB is a red flag. Thread count scales with jail count (roughly 2 threads per jail + overhead).
---
### Step 2 — Count nftables/UFW rules
```bash
# Total drop/reject rules in nftables
nft list ruleset | grep -c "reject\|drop"
# UFW rule file size
wc -l /etc/ufw/user.rules
```
A healthy UFW setup has 10–30 rules. Thousands means manual `ufw deny` commands or permanent Fail2ban bans have accumulated.
---
### Step 3 — Identify dead jails
```bash
for jail in $(fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
total=$(fail2ban-client status $jail | grep "Total banned" | awk '{print $NF}')
echo "$jail: $total total bans"
done
```
Jails with zero total bans are dead weight — burning threads and regex cycles for nothing.
---
### Step 4 — Check ban policy
```bash
grep bantime /etc/fail2ban/jail.local
```
`bantime = -1` means permanent. On a public-facing server, scanner IPs rotate constantly — permanent bans just pile up with no benefit.
---
## ✅ Solution
### Fix 1 — Disable dead jails
Edit `/etc/fail2ban/jail.local` and set `enabled = false` for any jail with zero historical bans.
### Fix 2 — Switch to time-limited bans
```ini
[DEFAULT]
bantime = 30d
[recidive]
bantime = 90d
```
30 days is long enough to block active campaigns; repeat offenders get 90 days via recidive. Scanner IPs rarely persist beyond a week.
### Fix 3 — Flush accumulated bans
```bash
fail2ban-client unban --all
```
### Fix 4 — Reset bloated UFW rules
**Back up first:**
```bash
cp /etc/ufw/user.rules /etc/ufw/user.rules.bak
cp /etc/ufw/user6.rules /etc/ufw/user6.rules.bak
```
**Reset and re-add only legitimate ALLOW rules:**
```bash
ufw --force reset
ufw default deny incoming
ufw default allow outgoing
ufw allow 443/tcp
ufw allow 80/tcp
ufw allow in on tailscale0 to any port 22 comment "SSH via Tailscale"
# Add any other ALLOW rules specific to your server
ufw --force enable
```
**Restart Fail2ban** so it re-creates its nftables chains:
```bash
systemctl restart fail2ban
```
---
## 🔁 Why This Happens
| Cause | Effect |
|---|---|
| `bantime = -1` (permanent) | Banned IP list grows forever; nftables rules never expire |
| Manual `ufw deny from <IP>` | Each adds a persistent rule to `user.rules`; survives reboots |
| Many jails with no hits | Each jail spawns 2+ threads, runs regex against logs continuously |
| Small VPS (12 GB RAM) | Fail2ban + nftables overhead becomes significant fraction of total RAM |
---
## ⚠️ Key Notes
- **Deleting UFW rules one-by-one is impractical** at scale — `ufw delete` with 30k rules takes hours. A full reset + re-add is the only efficient path.
- **`ufw --force reset` also resets `before.rules` and `after.rules`** — UFW auto-backs these up, but verify your custom chains if any exist.
- **After flushing bans, expect a brief spike in 4xx responses** as scanners that were previously blocked hit Apache again. Fail2ban will re-ban them within minutes.
- **The Netdata `web_log_1m_successful` alert may fire** during this window — it will self-clear once bans repopulate.
---
## 🔎 Quick Diagnostic Commands
```bash
# Fail2ban memory usage
grep -E "VmRSS|VmSwap|Threads" /proc/$(pgrep -ox fail2ban-server)/status
# Count nftables rules
nft list ruleset | grep -c "reject\|drop"
# UFW rule count
ufw status numbered | grep -c '^\['
# List all jails with ban counts
for jail in $(fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
banned=$(fail2ban-client status $jail | grep "Currently banned" | awk '{print $NF}')
total=$(fail2ban-client status $jail | grep "Total banned" | awk '{print $NF}')
echo "$jail: $banned current / $total total"
done
# Flush all bans
fail2ban-client unban --all
```

View File

@@ -0,0 +1,79 @@
---
title: "firewalld: Mail Ports Wiped After Reload (IMAP + Webmail Outage)"
domain: troubleshooting
category: networking
tags: [firewalld, mail, imap, fedora, ports]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# firewalld: Mail Ports Wiped After Reload (IMAP + Webmail Outage)
If IMAP, SMTP, and webmail all stop working simultaneously on a Fedora/RHEL mail server, firewalld may have reloaded and lost its mail port configuration.
## Symptoms
- `openssl s_client -connect mail.example.com:993` returns `Connection refused`
- Webmail returns connection refused or times out
- SSH still works (port 22 is typically in the persisted config)
- `firewall-cmd --list-services --zone=public` shows only `ssh dhcpv6-client mdns` or similar — no mail services
- Mail was working before a service restart or system event
## Why It Happens
firewalld uses two layers of configuration:
- **Runtime** — active rules in memory (lost on reload or restart)
- **Permanent** — written to `/etc/firewalld/zones/public.xml` (survives reloads)
If mail ports were added with `firewall-cmd --add-service=imaps` (without `--permanent`), they exist only in the runtime config. Any event that triggers a `firewall-cmd --reload` — including Fail2ban restarting, a system update, or manual reload — wipes the runtime config back to the permanent state, dropping all non-permanent rules.
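The two layers can be inspected and reconciled directly with `firewall-cmd` (run as root):

```bash
# Runtime only — lost on the next reload:
firewall-cmd --add-service=imaps
# Permanent only — takes effect after the next reload:
firewall-cmd --permanent --add-service=imaps
# Or capture the current runtime state into the permanent config in one step:
firewall-cmd --runtime-to-permanent
# Compare the two layers; anything missing from the permanent list is at risk:
firewall-cmd --list-services --zone=public
firewall-cmd --permanent --list-services --zone=public
```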
## Diagnosis
```bash
# Check what's currently allowed
firewall-cmd --list-services --zone=public
# Check nftables for catch-all reject rules
nft list ruleset | grep -E '(reject|accept|993|143)'
# Test port 993 from an external machine
openssl s_client -connect mail.example.com:993 -brief
```
If the only services listed are `ssh` and the port test shows `Connection refused`, the rules are gone.
## Fix
Add all mail services permanently and reload:
```bash
firewall-cmd --permanent \
--add-service=smtp \
--add-service=smtps \
--add-service=smtp-submission \
--add-service=imap \
--add-service=imaps \
--add-service=http \
--add-service=https
firewall-cmd --reload
# Verify
firewall-cmd --list-services --zone=public
```
Expected output:
```
dhcpv6-client http https imap imaps mdns smtp smtp-submission smtps ssh
```
## Key Notes
- **Always use `--permanent`** when adding services to firewalld on a server. Without it, the rule exists only until the next reload.
- **Fail2ban + firewalld**: Fail2ban uses firewalld as its ban backend (`firewallcmd-rich-rules`). When Fail2ban restarts or crashes, it may trigger a `firewall-cmd --reload`, resetting any runtime-only rules.
- **Verify after any firewall event**: After Fail2ban restarts, system reboots, or `firewall-cmd --reload`, always confirm mail services are still present with `firewall-cmd --list-services --zone=public`.
- **Check the permanent config directly**: `cat /etc/firewalld/zones/public.xml` — if mail services aren't in this file, they'll be lost on next reload.
## Related
- [Linux Server Hardening Checklist](../../02-selfhosting/security/linux-server-hardening-checklist.md)
- [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](fail2ban-imap-self-ban-mail-client.md)

View File

@@ -0,0 +1,75 @@
---
title: "Tailscale SSH: Unexpected Re-Authentication Prompt"
domain: troubleshooting
category: networking
tags: [tailscale, ssh, authentication, vpn]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Tailscale SSH: Unexpected Re-Authentication Prompt
If a Tailscale SSH connection unexpectedly presents a browser authentication URL mid-session, the first instinct is to check the ACL policy. However, this is often a one-off Tailscale hiccup rather than a misconfiguration.
## Symptoms
- SSH connection to a fleet node displays a Tailscale auth URL:
```
To authenticate, visit: https://login.tailscale.com/a/xxxxxxxx
```
- The prompt appears even though the node worked fine previously
- Other nodes in the fleet connect without prompting
## What Causes It
Tailscale SSH supports two ACL `action` values:
| Action | Behavior |
|---|---|
| `accept` | Trusts Tailscale identity — no additional auth required |
| `check` | Requires periodic browser-based re-authentication |
If `action: "check"` is set, every session (or after token expiry) will prompt for browser auth. However, even with `action: "accept"`, a one-off prompt can appear due to a Tailscale daemon glitch or key refresh event.
## How to Diagnose
### 1. Verify the ACL policy
In the Tailscale admin console (or via `tailscale debug acl`), inspect the SSH rules. For a trusted homelab fleet, the rule should use `accept`:
```json
{
"src": ["autogroup:member"],
"dst": ["autogroup:self"],
"users": ["autogroup:nonroot", "root"],
"action": "accept",
}
```
If `action` is `check`, that is the root cause — change it to `accept` for trusted source/destination pairs.
### 2. Confirm it was a one-off
If the ACL already shows `accept`, the prompt was transient. Test with:
```bash
ssh <hostname> "echo ok"
```
No auth prompt + `ok` output = resolved. Note that this test is only meaningful if the previous session's auth token has expired, or you test from a different device that hasn't recently authenticated.
## Fix
**If ACL shows `check`:** Change to `accept` in the Tailscale admin console under Access Controls. Takes effect immediately — no server changes needed.
**If ACL already shows `accept`:** No action required. The prompt was a one-off Tailscale event (daemon restart, key refresh, etc.). Monitor for recurrence.
## Notes
- Port 2222 on **MajorRig** previously served as a hard bypass for Tailscale SSH browser auth — regular SSH over the Tailscale network, bypassing Tailscale SSH entirely. It was retired on 2026-03-25 once the fleet moved to `accept` and uniform port 22, but the approach remains an alternative if `check` mode is required for compliance and browser auth is too disruptive.
- The `autogroup:self` destination means the rule applies when connecting from your own devices to your own devices — appropriate for a personal homelab fleet.
## Related
- Network Overview — Tailscale fleet inventory and SSH access model
- SSH-Aliases — Fleet SSH access shortcuts

View File

@@ -0,0 +1,93 @@
---
title: "Windows OpenSSH: WSL as Default Shell Breaks Remote Commands"
domain: troubleshooting
category: networking
tags:
- windows
- openssh
- wsl
- ssh
- majorrig
- powershell
status: published
created: 2026-04-03
updated: 2026-04-07T21:55
---
# Windows OpenSSH: WSL as Default Shell Breaks Remote Commands
## Problem
SSH remote commands fail with:
```
Invalid command line argument: -c
Please use 'wsl.exe --help' to get a list of supported arguments.
```
This happens on **every** remote command — `ssh-copy-id`, `ssh user@host "command"`, `scp`, etc. Interactive SSH (no command) may still work if it drops into WSL.
## Cause
Windows OpenSSH's default shell is set to `C:\Windows\System32\wsl.exe`. When SSH executes a remote command, it invokes:
```
<default_shell> -c "<command>"
```
But `wsl.exe` does not accept the `-c` flag. It expects `-e` for command execution, or no flags for an interactive session. Since OpenSSH hardcodes `-c`, every remote command fails.
## Fix — Option 1: Use `bash.exe` (Recommended)
`bash.exe` is a WSL shim that **does** accept the `-c` flag. This gives you a Linux-first SSH experience where both interactive sessions and remote commands work natively.
```powershell
# Find the actual path to bash.exe (it varies by install)
Get-Command bash.exe | Select-Object Source
# Set it as the default shell (elevated PowerShell)
Set-ItemProperty -Path "HKLM:\SOFTWARE\OpenSSH" -Name DefaultShell -Value "C:\Users\<username>\AppData\Local\Microsoft\WindowsApps\bash.exe"
Restart-Service sshd
```
> **Note:** `bash.exe` may not be at `C:\Windows\System32\bash.exe` on all installs. Always verify the path with `Get-Command` first — the Windows Store WSL install places it under `AppData\Local\Microsoft\WindowsApps\`.
### After the fix (bash.exe)
- Interactive SSH sessions land directly in your WSL distro
- Remote SSH commands execute in WSL's bash — Linux commands work natively
- `ssh user@host "uname -s"` returns `Linux`
## Fix — Option 2: Revert to PowerShell
If you need Windows-native command execution over SSH (e.g., for Windows-targeted Ansible or remote PowerShell administration), set the default shell back to PowerShell:
```powershell
Set-ItemProperty -Path "HKLM:\SOFTWARE\OpenSSH" -Name DefaultShell -Value "C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"
Restart-Service sshd
```
### After the fix (PowerShell)
- Remote SSH commands execute via PowerShell
- To run Linux commands, prefix with `wsl`:
```bash
ssh user@host "wsl bash -c 'cd /mnt/d/project && git pull'"
```
- Interactive SSH sessions land in PowerShell (type `wsl` to enter Linux)
- `ssh-copy-id` still won't work for WSL's `authorized_keys` — Windows OpenSSH reads from `C:\Users\<user>\.ssh\authorized_keys`, not the WSL home directory
## Key Notes
- This registry key (`HKLM:\SOFTWARE\OpenSSH\DefaultShell`) is the **only** supported way to change the OpenSSH default shell on Windows
- The change persists across reboots and Windows Updates
- `wsl.exe` does **not** support `-c` — never use it as the default shell
- `bash.exe` **does** support `-c` — use it for a Linux-first SSH experience
- The path to `bash.exe` varies by install method — always verify with `Get-Command bash.exe`
- Tools like Ansible, `scp`, `rsync`, and `ssh-copy-id` all depend on `-c` working
- If the shell path in the registry doesn't exist on disk, sshd will reject the user entirely with `User <name> not allowed because shell <path> does not exist` — check `Get-WinEvent -LogName OpenSSH/Operational` to diagnose
## Related
- [Windows OpenSSH Server (sshd) Stops After Reboot](windows-sshd-stops-after-reboot.md) — sshd service startup issues
- [Microsoft Docs: OpenSSH DefaultShell](https://learn.microsoft.com/en-us/windows-server/administration/openssh/openssh-server-configuration#configuring-the-default-shell-for-openssh-in-windows)

View File

@@ -0,0 +1,82 @@
---
title: Windows OpenSSH Server (sshd) Stops After Reboot
domain: troubleshooting
category: networking
tags:
- windows
- openssh
- sshd
- reboot
- majorrig
status: published
created: 2026-04-02
updated: 2026-04-07T21:58
---
# Windows OpenSSH Server (sshd) Stops After Reboot
## 🛑 Problem
SSH connections to MajorRig from a mobile device or Tailscale client time out on port 22. No connection refused error — just a timeout. The OpenSSH Server service is installed but not running.
---
## 🔍 Diagnosis
From an **elevated** PowerShell on MajorRig:
```powershell
Get-Service sshd
```
If the output shows `Stopped`, the service is not running. This is the cause of the timeout.
---
## ✅ Fix
Run the following from an **elevated** PowerShell (Win+X → Terminal (Admin)):
```powershell
Start-Service sshd
Set-Service -Name sshd -StartupType Automatic
Get-Service sshd
```
The final command should confirm `Running`. SSH connections will resume immediately — no reboot required.
---
## 🔄 Why This Happens
| Trigger | Reason |
|---|---|
| Windows Update reboot | If `sshd` startup type is Manual, it won't restart after a reboot |
| WSL2 export/import/rebuild | WSL2 reinstall operations often involve reboots that expose the same issue |
| Fresh Windows install | OpenSSH Server is installed but startup type defaults to Manual |
The Windows OpenSSH Server is installed as a Windows Feature (`Add-WindowsCapability`), not a WSL2 package. It runs entirely on the Windows side. However, its **default startup type is Manual**, meaning it will not survive a reboot unless explicitly set to Automatic.
---
## ⚠️ Key Notes
- **This is a Windows-side issue** — WSL2 itself is unaffected. The service must be started and configured from Windows, not from within WSL2.
- **Elevated PowerShell required** — `Start-Service` and `Set-Service` for sshd will return "Access is denied" if run without Administrator privileges.
- **Port 2222 was retired (2026-03-25)** — the bypass port 2222 on MajorRig is no longer in use. The entire fleet now uses port 22 uniformly after the Tailscale SSH auth fix. Only port 22 needs to be verified when troubleshooting sshd.
- **Default shell still works once fixed** — MajorRig's sshd is configured to use `bash.exe` (WSL shim) as the default shell, dropping SSH sessions directly into WSL2/Bash. This config is preserved across service restarts. See [WSL default shell troubleshooting](windows-openssh-wsl-default-shell-breaks-remote-commands.md) for why `bash.exe` is used instead of `wsl.exe`.
---
## 🔎 Quick Reference
```powershell
# Check status (run as Admin)
Get-Service sshd
# Start and set to auto-start (run as Admin)
Start-Service sshd
Set-Service -Name sshd -StartupType Automatic
# Verify firewall rule exists
Get-NetFirewallRule -DisplayName "*ssh*" | Select DisplayName, Enabled, Direction, Action
```

View File

@@ -1,11 +1,11 @@
---
tags:
- obsidian
- troubleshooting
- windows
- majortwin
created: '2026-03-11'
status: resolved
title: "Obsidian Vault Recovery — Loading Cache Hang"
domain: troubleshooting
category: general
tags: [obsidian, troubleshooting, windows, majortwin]
status: published
created: 2026-03-11
updated: 2026-04-02
---
# Obsidian Vault Recovery — Loading Cache Hang

View File

@@ -0,0 +1,68 @@
---
title: "Ollama Drops Off Tailscale When Mac Sleeps"
domain: troubleshooting
category: ai-inference
tags: [ollama, tailscale, macos, sleep, open-webui, majormac]
status: published
created: 2026-03-17
updated: 2026-03-17
---
# Ollama Drops Off Tailscale When Mac Sleeps
Open WebUI loses its Ollama connection when the host Mac goes to sleep. Models stop appearing, and curl to the Ollama API times out from other machines on the tailnet.
## The Short Answer
Disable sleep when plugged into AC power:
```bash
sudo pmset -c sleep 0
```
Or via **System Settings → Energy → Prevent automatic sleeping when the display is off**.
## Background
macOS suspends network interfaces when the machine sleeps, which drops the Tailscale tunnel. Ollama becomes unreachable over the tailnet even though it was running fine before sleep. Open WebUI doesn't reconnect automatically — it just shows no models until the connection is manually refreshed after the Mac wakes.
The `-c` flag in `pmset` limits the setting to AC power only, so the machine will still sleep normally on battery.
## Diagnosis
From any other machine on the tailnet:
```bash
tailscale status | grep majormac
```
If it shows `offline, last seen Xm ago` or is routing through a relay instead of direct, the Mac is asleep or the tunnel is degraded.
```bash
curl http://100.74.124.81:11434/api/tags
```
Timeout = Ollama unreachable. After waking the Mac, this should return a JSON list of models immediately.
## Fix
```bash
# Disable sleep on AC power (run on MajorMac)
sudo pmset -c sleep 0
# Verify
pmset -g | grep sleep
```
The display can still sleep — only system sleep needs to be off for Ollama and Tailscale to stay available.
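For scripted verification (e.g., from a health-check job), the AC `sleep` value can be pulled out of `pmset -g` output. A sketch against a synthetic excerpt — real `pmset -g` output varies by macOS version, so treat the sample below as illustrative:

```bash
# Synthetic `pmset -g` excerpt (real output varies by macOS version)
sample=' sleep                0 (sleep prevented by powerd)
 displaysleep         15'
# The "sleep" row is the system-sleep timeout; 0 means disabled
ac_sleep=$(printf '%s\n' "$sample" | awk '$1 == "sleep" {print $2}')
echo "ac sleep = $ac_sleep"
```

On the real machine, replace the sample with `pmset -g` piped straight into the `awk` filter.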
## Gotchas & Notes
- **Display sleep is fine** — `pmset -c displaysleep 15` or whatever you prefer won't affect Ollama availability.
- **Battery behavior unchanged** — `-c` flag means AC only; normal sleep on battery is preserved.
- **Open WebUI won't auto-reconnect** — after waking the Mac, go to Settings → Connections and hit the verify button, or just reload the page.
- This affects any service bound to the Tailscale interface on MajorMac, not just Ollama.
## See Also
- MajorMac — device config and known issues

View File

@@ -0,0 +1,122 @@
---
title: "Custom Fail2ban Jail: Apache Directory Scanning & Junk Methods"
domain: troubleshooting
category: security
tags: [fail2ban, apache, security, bots, wordpress]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Custom Fail2ban Jail: Apache Directory Scanning & Junk Methods
## 🛑 Problem
Bots and vulnerability scanners enumerate WordPress directories (`/wp-admin/`, `/wp-includes/`, `/wp-content/`), probe for access-denied paths, or send junk HTTP methods (e.g., `YQEILVHZ`, `DUTEDCEM`). These generate Apache error log entries but are not caught by any default Fail2ban jail:
- `AH01276` — directory index forbidden (autoindex:error)
- `AH01630` — client denied by server configuration (authz_core:error)
- `AH00135` — invalid method in request (core:error)
The result is a low success ratio on Netdata's `web_log_1m_successful` metric and wasted server resources processing scanner requests.
---
## ✅ Solution
### Step 1 — Create the filter
Create `/etc/fail2ban/filter.d/apache-dirscan.conf`:
```ini
# Fail2ban filter for Apache scanning/probing
# Catches: directory enumeration (AH01276), access denied (AH01630), invalid methods (AH00135)
[Definition]
failregex = ^\[.*\] \[autoindex:error\] \[pid \d+\] \[client <HOST>:\d+\] AH01276:
            ^\[.*\] \[authz_core:error\] \[pid \d+\] \[client <HOST>:\d+\] AH01630:
            ^\[.*\] \[core:error\] \[pid \d+\] \[client <HOST>:\d+\] AH00135:
ignoreregex =
```
### Step 2 — Add the jail
Add to `/etc/fail2ban/jail.local`:
```ini
[apache-dirscan]
enabled = true
port = http,https
filter = apache-dirscan
logpath = /var/log/apache2/error.log
maxretry = 3
findtime = 60
```
Three hits in 60 seconds is aggressive enough to catch active scanners while avoiding false positives from legitimate 403s.
### Step 3 — Test the regex
```bash
fail2ban-regex /var/log/apache2/error.log /etc/fail2ban/filter.d/apache-dirscan.conf
```
This shows match counts per regex line and any missed lines.
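Before running against the live log, the pattern shape can be sanity-checked on a synthetic line. The client IP and path below are made up, and `<HOST>` is fail2ban-specific syntax, so it's approximated here with an IP-ish character class:

```bash
# Synthetic Apache error-log line (hypothetical client IP and path)
line='[Wed Apr 02 12:00:00.123456 2026] [autoindex:error] [pid 1234] [client 203.0.113.7:54321] AH01276: Cannot serve directory /var/www/html/wp-content/: No matching DirectoryIndex'
# <HOST> is fail2ban syntax; stand in an address character class for a quick grep
match=$(printf '%s\n' "$line" | grep -Eo '\[client [0-9a-fA-F.:]+:[0-9]+\] AH01276:')
echo "$match"
```

If the grep prints the `[client ...] AH01276:` fragment, the real filter's anchored regex should match the same line shape.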
### Step 4 — Reload Fail2ban
```bash
fail2ban-client reload
fail2ban-client status apache-dirscan
```
---
## 🔍 What Each Pattern Catches
| Error Code | Apache Module | Trigger |
|---|---|---|
| `AH01276` | `autoindex:error` | Bot requests a directory with no index file and `Options -Indexes` is set. Classic WordPress/CMS directory enumeration. |
| `AH01630` | `authz_core:error` | Request denied by `<Directory>` or `<Location>` rules (e.g., probing `/wp-content/plugins/`). |
| `AH00135` | `core:error` | Request uses a garbage HTTP method that Apache can't parse. Scanners use these to fingerprint servers. |
---
## 🔁 Why Default Jails Miss This
| Default Jail | What It Catches | Gap |
|---|---|---|
| `apache-badbots` | Bad User-Agent strings in access log | Doesn't look at error log; many scanners use normal UAs |
| `apache-botsearch` | 404s for common exploit paths | Only matches access log 404s, not error log entries |
| `apache-noscript` | Requests for non-existent scripts | Narrow regex, doesn't cover directory probes |
| `apache-overflows` | Long request URIs | Only catches buffer overflow attempts |
| `apache-invaliduri` | `AH10244` invalid URI encoding | Different error code — catches URL-encoded traversal, not directory scanning |
The `apache-dirscan` filter fills the gap by monitoring the error log for the three most common scanner signatures that slip through all default jails.
---
## ⚠️ Key Notes
- **`logpath` must point to the error log**, not the access log. All three patterns are logged to `error.log`.
- **Adjust `logpath`** for your distribution: Debian/Ubuntu uses `/var/log/apache2/error.log`, RHEL/Fedora uses `/var/log/httpd/error_log`.
- **The `allowipv6` warning** on reload is cosmetic (Fail2ban 1.0+) and can be ignored.
- **Pair with `recidive`** to escalate repeat offenders to longer bans.
---
## 🔎 Quick Diagnostic Commands
```bash
# Test filter against current error log
fail2ban-regex /var/log/apache2/error.log /etc/fail2ban/filter.d/apache-dirscan.conf
# Check jail status
fail2ban-client status apache-dirscan
# Watch bans in real time
tail -f /var/log/fail2ban.log | grep apache-dirscan
# Count current error types
grep -c "AH01276\|AH01630\|AH00135" /var/log/apache2/error.log
```

View File

@@ -0,0 +1,82 @@
---
title: "ClamAV Safe Scheduling on Live Servers"
domain: troubleshooting
category: security
tags: [clamav, cpu, nice, ionice, cron, vps]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# ClamAV Safe Scheduling on Live Servers
Running `clamscan` unthrottled on a live server will peg the CPU until completion. On a small VPS (1 vCPU), a full recursive scan can sustain 70–100% CPU for an hour or more, degrading or taking down hosted services.
## The Problem
A common out-of-the-box ClamAV cron setup looks like this:
```cron
0 1 * * 0 clamscan --infected --recursive / --exclude=/sys
```
This runs at Linux's default scheduling priority (`nice 0`) with normal I/O priority. On a live server it will:
- Monopolize the CPU for the scan duration
- Cause high I/O wait, degrading web serving, databases, and other services
- Trigger monitoring alerts (e.g., Netdata `10min_cpu_usage`)
## The Fix
Throttle the scan with `nice` and `ionice`:
```cron
0 1 * * 0 nice -n 19 ionice -c 3 clamscan --infected --recursive / --exclude=/sys
```
| Flag | Meaning |
|------|---------|
| `nice -n 19` | Lowest CPU scheduling priority (range: -20 to 19) |
| `ionice -c 3` | Idle I/O class — only uses disk when no other process needs it |
The scan will take longer but will not impact server performance.
## Applying the Fix
Edit root's crontab:
```bash
crontab -e
```
Or apply non-interactively:
```bash
crontab -l | sed 's|clamscan|nice -n 19 ionice -c 3 clamscan|' | crontab -
```
Verify:
```bash
crontab -l | grep clam
```
## Diagnosing a Runaway Scan
If CPU is already pegged, identify and kill the process:
```bash
ps aux --sort=-%cpu | head -15
# Look for clamscan
kill <PID>
```
## Notes
- `ionice -c 3` (Idle) requires Linux kernel ≥ 2.6.13 and CFQ/BFQ I/O scheduler. Works on most Ubuntu/Debian/Fedora systems.
- On multi-core servers, consider also using `cpulimit` for a hard cap: `cpulimit -l 30 -- clamscan ...`
- Always keep `--exclude=/sys` (and optionally `--exclude=/proc`, `--exclude=/dev`) to avoid scanning virtual filesystems.
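On systemd hosts, the same throttling can live in a service/timer pair instead of cron, since `Nice=` and `IOSchedulingClass=` are native unit directives. A sketch — unit names and the schedule are illustrative:

```ini
# /etc/systemd/system/clamscan-weekly.service (illustrative name)
[Unit]
Description=Throttled weekly ClamAV scan

[Service]
Type=oneshot
Nice=19
IOSchedulingClass=idle
ExecStart=/usr/bin/clamscan --infected --recursive / --exclude=/sys

# /etc/systemd/system/clamscan-weekly.timer
[Unit]
Description=Weekly timer for throttled ClamAV scan

[Timer]
OnCalendar=Sun 01:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `systemctl enable --now clamscan-weekly.timer`. `Persistent=true` runs a missed scan at the next boot, which plain cron does not do.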
## Related
- [ClamAV Documentation](https://docs.clamav.net/)
- [Linux Server Hardening Checklist](../../02-selfhosting/security/linux-server-hardening-checklist.md)

View File

@@ -0,0 +1,112 @@
---
title: "SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)"
domain: troubleshooting
category: general
tags: [selinux, dovecot, mail, fedora, vmail]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)
If Dovecot is generating SELinux AVC denials and mail delivery or retrieval is broken on a Fedora/RHEL system with SELinux enforcing, the `/var/vmail` directory tree likely has incorrect file contexts.
## Symptoms
- Thousands of AVC denials in `/var/log/audit/audit.log` for Dovecot processes
- Denials reference `var_t` context on files under `/var/vmail/`
- Mail delivery may fail silently; IMAP folders may appear empty or inaccessible
- `ausearch -m avc -ts recent` shows denials like:
```
type=AVC msg=audit(...): avc: denied { write } for pid=... comm="dovecot" name="..." scontext=system_u:system_r:dovecot_t:s0 tcontext=system_u:object_r:var_t:s0
```
## Why It Happens
SELinux requires files to have the correct security context for the process that accesses them. When Postfix/Dovecot are installed on a fresh system and `/var/vmail` is created manually (or by the mail stack installer), the directory may inherit the default `var_t` context from `/var/` rather than the mail-specific `mail_spool_t` context Dovecot expects.
The correct context for the entire `/var/vmail` tree is `mail_spool_t` — including the `tmp/` subdirectories inside each Maildir folder.
> [!warning] Do NOT apply `dovecot_tmp_t` to Maildir `tmp/` directories
> `dovecot_tmp_t` is for Dovecot's own process-level temp files, not for Maildir `tmp/` folders. Postfix's virtual delivery agent writes to `tmp/` when delivering new mail. Applying `dovecot_tmp_t` will block Postfix from delivering any mail, silently deferring all messages with `Permission denied`.
## Fix
### 1. Check Current Context
```bash
ls -Zd /var/vmail/
ls -Z /var/vmail/example.com/user/
ls -Zd /var/vmail/example.com/user/tmp/
```
If you see `var_t` instead of `mail_spool_t`, the contexts need to be set. If you see `dovecot_tmp_t` on `tmp/`, that needs to be corrected too.
### 2. Define the Correct File Context Rule
One rule covers everything — including `tmp/`:
```bash
sudo semanage fcontext -a -t mail_spool_t "/var/vmail(/.*)?"
```
If you previously added a `dovecot_tmp_t` rule for `tmp/` directories, remove it:
```bash
# Check for an erroneous dovecot_tmp_t rule
sudo semanage fcontext -l | grep vmail
# If you see one like "/var/vmail(/.*)*/tmp(/.*)?" with dovecot_tmp_t, delete it:
sudo semanage fcontext -d "/var/vmail(/.*)*/tmp(/.*)?"
```
### 3. Apply the Labels
```bash
sudo restorecon -Rv /var/vmail
```
This relabels all existing files. On a mail server with many users and messages, this may take a moment and will print every relabeled path.
### 4. Verify
```bash
ls -Zd /var/vmail/
ls -Zd /var/vmail/example.com/user/tmp/
```
Both should show `mail_spool_t`:
```
system_u:object_r:mail_spool_t:s0 /var/vmail/
system_u:object_r:mail_spool_t:s0 /var/vmail/example.com/user/tmp/
```
### 5. Flush Deferred Mail
If mail was queued while the context was wrong, flush it:
```bash
postqueue -f
postqueue -p # should be empty shortly
```
### 6. Check That Denials Stopped
```bash
ausearch -m avc -ts recent | grep dovecot
```
No output = no new denials.
## Key Notes
- **One rule is enough** — `"/var/vmail(/.*)?"` with `mail_spool_t` covers every file and directory under `/var/vmail`, including all `tmp/` subdirectories.
- **`semanage fcontext` is persistent** — the rules survive reboots and `restorecon` calls. You only need to run `semanage` once.
- **`restorecon` applies current rules to existing files** — run it after any `semanage` change and any time you manually create directories.
- **New mail directories are labeled automatically** — SELinux applies the registered `semanage` rules to any new files created under `/var/vmail`.
- **`var_t` context is the default for `/var/`** — any directory created under `/var/` without a specific `semanage` rule will inherit `var_t`. This is almost never correct for service data directories.
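For a scripted check (e.g., from a monitoring job), the type field can be cut out of `ls -Z` output. A sketch against a synthetic line matching the verification step above:

```bash
# Synthetic `ls -Zd /var/vmail/` output line
sample='system_u:object_r:mail_spool_t:s0 /var/vmail/'
# A context is user:role:type:level, so the type is the third colon field
setype=$(printf '%s\n' "$sample" | awk '{print $1}' | awk -F: '{print $3}')
echo "$setype"
```

Anything other than `mail_spool_t` on a path under `/var/vmail` means `restorecon` needs to run again.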
## Related
- [Linux Server Hardening Checklist](../02-selfhosting/security/linux-server-hardening-checklist.md)
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)

View File

@@ -0,0 +1,114 @@
---
title: "mdadm RAID Recovery After USB Hub Disconnect"
domain: troubleshooting
category: storage
tags: [mdadm, raid, usb, storage, recovery]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# mdadm RAID Recovery After USB Hub Disconnect
A software RAID array managed by mdadm can appear to catastrophically fail when the drives are connected via USB rather than SATA. The array is fine — the hub dropped out. Here's how to diagnose and recover.
## Symptoms
- rsync or other I/O to the RAID mount returns `Input/output error`
- `cat /proc/mdstat` shows `broken raid0` or `FAILED`
- `mdadm --detail /dev/md0` shows `State: broken, FAILED`
- `lsblk` no longer lists the RAID member drives (e.g. `sdd`, `sde` gone)
- XFS (or other filesystem) logs in dmesg:
```
XFS (md0): log I/O error -5
XFS (md0): Filesystem has been shut down due to log error (0x2).
```
- `smartctl -H /dev/sdd` returns `No such device`
## Why It Happens
If your RAID drives are in a USB enclosure (e.g. TerraMaster via ASMedia hub), a USB disconnect — triggered by a power fluctuation, plugging in another device, or a hub reset — causes mdadm to see the drives disappear. mdadm cannot distinguish a USB dropout from a physical drive failure, so it declares the array failed.
The failure message in dmesg will show `hostbyte=DID_ERROR` rather than a drive-level error:
```
md/raid0:md0: Disk failure on sdd1 detected, failing array.
sd X:0:0:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
```
`DID_ERROR` means the SCSI host adapter (USB controller) reported the error — the drives themselves are likely fine.
## Diagnosis
### 1. Check if the USB hub recovered
```bash
lsblk -o NAME,SIZE,TYPE,FSTYPE,MODEL
```
After a hub reconnects, drives will reappear — often with **new device names** (e.g. `sdd`/`sde` become `sdg`/`sdh`). Look for drives with `linux_raid_member` filesystem type.
```bash
dmesg | grep -iE 'usb|disconnect|DID_ERROR' | tail -30
```
A hub dropout looks like multiple devices disconnecting at the same time on the same USB port.
### 2. Confirm drives have intact superblocks
```bash
mdadm --examine /dev/sdg1
mdadm --examine /dev/sdh1
```
If the superblocks are present and show matching UUID/array info, the data is intact.
## Recovery
### 1. Unmount and stop the degraded array
```bash
umount /majorRAID # or wherever md0 is mounted
mdadm --stop /dev/md0
```
If `umount` fails (busy mount, or the target is reported as not mounted), the kernel may have already torn the filesystem down when the devices vanished. Proceed with `--stop` either way.
### 2. Reassemble with the new device names
```bash
mdadm --assemble /dev/md0 /dev/sdg1 /dev/sdh1
```
mdadm matches drives by their superblock UUID, not device name. As long as both drives are present the assembly will succeed regardless of what they're called.
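In a recovery script, the UUID match can be checked mechanically. A sketch parsing abridged, made-up `--examine` output — on the real system, pipe `mdadm --examine /dev/sdX1` into the `awk` filter instead:

```bash
# Abridged, hypothetical `mdadm --examine` output for one member
examine='     Array UUID : 9f6e2a1c:4b8d0e33:aa11bb22:cc33dd44
           Name : majorrig:0
     Raid Level : raid0'
# The value sits after " : "; the UUID itself contains colons, so split on " : "
uuid=$(printf '%s\n' "$examine" | awk -F' : ' '/Array UUID/ {print $2}')
echo "$uuid"
```

Run it once per member; identical Array UUIDs confirm both drives belong to the same array regardless of their current device names.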
### 3. Mount and verify
```bash
mount /dev/md0 /majorRAID
df -h /majorRAID
ls /majorRAID
```
If the filesystem mounts and data is visible, recovery is complete.
### 4. Create or update /etc/mdadm.conf
If `/etc/mdadm.conf` doesn't exist (or references old device names), update it:
```bash
mdadm --detail --scan > /etc/mdadm.conf
cat /etc/mdadm.conf
```
The output uses UUID rather than device names — the array will reassemble correctly on reboot even if drive letters change again.
## Prevention
The root cause is drives on USB rather than SATA. Short of moving the drives to a SATA controller, options are limited. When planning a migration off the RAID array (e.g. to SnapRAID + MergerFS), prioritize getting drives onto SATA connections.
> [!warning] RAID 0 has no redundancy. A USB dropout that causes the array to fail mid-write could corrupt data even if the drives themselves are healthy. Keep current backups before any maintenance involving the enclosure.
## Related
- [SnapRAID & MergerFS Storage Setup](../../01-linux/storage/snapraid-mergerfs-setup.md)
- [rsync Backup Patterns](../../02-selfhosting/storage-backup/rsync-backup-patterns.md)

View File

@@ -0,0 +1,102 @@
---
title: "Systemd Session Scope Fails at Login (session-cN.scope)"
domain: troubleshooting
category: systemd
tags: [systemd, ssh, login, session, linux]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Systemd Session Scope Fails at Login (`session-cN.scope`)
After SSH login, systemd reports a failed transient unit like `session-c1.scope`. The MOTD or login banner shows `Failed Units: 1 — session-c1.scope`. This is a harmless race condition, not a real service failure.
## Symptoms
- Login banner or MOTD displays:
```
Failed Units: 1
session-c1.scope
```
- `systemctl list-units --failed` shows one or more `session-cN.scope` units in a failed state
- The system is otherwise healthy — no services are actually broken
## What Causes It
A transient session scope is created by systemd-logind every time a user logs in (SSH, console, etc.). The scope tracks the login session's process group via cgroups.
The failure occurs when a login process (PID) exits before systemd can move it into the target cgroup. This is a race condition triggered by:
- **Short-lived SSH connections** — automated probes, health checks, or monitoring tools that connect and immediately disconnect
- **Sessions that disconnect before PAM completes** — network interruptions or aggressive client timeouts
- **Cron jobs or scripts** that create transient SSH sessions
systemd logs the sequence:
1. `PID N vanished before we could move it to target cgroup`
2. `No PIDs left to attach to the scope's control group, refusing.`
3. Unit enters `failed (Result: resources)` state
Because session scopes are transient (not backed by a unit file), the failed state lingers until manually cleared.
## How to Diagnose
### 1. Check the failed unit
```bash
systemctl status session-c1.scope
```
Look for:
```
Active: failed (Result: resources)
```
And in the log output:
```
PID <N> vanished before we could move it to target cgroup
No PIDs left to attach to the scope's control group, refusing.
```
### 2. Confirm no real failures
```bash
systemctl list-units --failed
```
If the only failed units are `session-cN.scope` entries, the system is healthy.
## Fix
Reset the failed unit:
```bash
systemctl reset-failed session-c1.scope
```
To clear all failed session scopes at once:
```bash
systemctl reset-failed 'session-*.scope'
```
Verify:
```bash
systemctl list-units --failed
```
Should report 0 failed units.
## Notes
- This is a known systemd behavior and not indicative of a real problem. It can be safely ignored or cleared whenever it appears.
- If it recurs frequently, investigate what is creating short-lived SSH sessions — common culprits include monitoring agents (Netdata, Nagios), automated backup scripts, or SSH brute-force attempts.
- The `c` prefix in `session-c1.scope` means systemd-logind allocated the session ID from its own counter (used when no kernel audit session ID is available), rather than taking the audit session ID. The number increments with each new session.
- Applies to **Fedora, Ubuntu, and any systemd-based Linux distribution**.
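If a health-check script should ignore these scopes, filter them out of the failed-unit list. A sketch against synthetic `systemctl list-units --failed --plain --no-legend` output (on a real host, substitute the actual command for the sample):

```bash
# Synthetic output; on a real host use:
#   systemctl list-units --failed --plain --no-legend
sample='session-c1.scope loaded failed failed Session c1 of User root
session-c7.scope loaded failed failed Session c7 of User root'
# Count failures that are NOT transient session scopes
# (grep exits 1 when the count is 0, hence the || true)
real_failures=$(printf '%s\n' "$sample" | grep -cv '^session-.*\.scope ') || true
echo "real failures: $real_failures"
```

A nonzero `real_failures` means something other than a session-scope race is broken and deserves attention.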
## Related
- [gitea-runner-boot-race-network-target](../gitea-runner-boot-race-network-target.md) — Another systemd race condition involving service startup ordering

View File

@@ -0,0 +1,100 @@
---
title: "wget/curl: URLs with Special Characters Fail in Bash"
domain: troubleshooting
category: general
tags: [wget, curl, bash, shell, quoting, url]
status: published
created: 2026-04-08
updated: 2026-04-08
---
# wget/curl: URLs with Special Characters Fail in Bash
## Problem
Downloading a URL that contains `&`, `=`, `#`, `?`, or other shell-meaningful characters fails with cryptic errors when the URL is not properly quoted:
```bash
wget -O output.mp4 https://cdn.example.com/video%20file.mp4?secure=abc123&token=xyz
```
Bash interprets `&` as a background operator, splitting the command:
```
bash: token=xyz: command not found
```
The download either fails outright or downloads only a partial/error page (e.g., 868 bytes instead of 2 GB).
---
## Root Cause
Bash treats several URL-common characters as shell operators:
| Character | Shell Meaning | URL Meaning |
|---|---|---|
| `&` | Run previous command in background | Query parameter separator |
| `?` | Single-character glob wildcard | Start of query string |
| `#` | Comment (rest of line ignored) | Fragment identifier |
| `=` | Variable assignment (in some contexts) | Key-value separator |
| `%` | Job control (`%1`, `%2`) | URL encoding prefix |
| `!` | History expansion (in interactive shells) | Rarely used in URLs |
If the URL is unquoted, Bash processes these characters before `wget` or `curl` ever sees them.
---
## Fix
**Always single-quote URLs** passed to `wget` or `curl`:
```bash
wget -O '/path/to/output file.mp4' 'https://cdn.example.com/path/video%20file.mp4?secure=abc123&token=xyz'
```
Single quotes prevent **all** shell interpretation — no variable expansion, no globbing, no operator parsing. The URL reaches `wget` exactly as written.
### When to use double quotes instead
If the URL contains a shell variable (e.g., a token stored in `$TOKEN`), use double quotes:
```bash
wget -O output.mp4 "https://cdn.example.com/file.mp4?secure=${TOKEN}&expires=9999"
```
Double quotes allow variable expansion but still protect `&`, `?`, and `#` from shell interpretation.
### Output filename quoting
The `-O` filename also needs quoting if it contains spaces or special characters:
```bash
wget -O '/plex/plex/DF Direct Q+A #258.mp4' 'https://example.com/video.mp4'
```
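To convince yourself a quoting choice is safe inside a script, let the shell count the resulting arguments. A sketch using the made-up CDN URL from above:

```bash
url='https://cdn.example.com/path/video%20file.mp4?secure=abc123&token=xyz'
# Rebuild the argv that wget would receive; quoted "$url" stays one word
# no matter how many ?, &, or % characters it contains
set -- wget -O 'output file.mp4' "$url"
argc=$#
last=$4
echo "argc=$argc last=$last"
```

Four arguments (`wget`, `-O`, the output name, the intact URL) means the quoting held; more than four means the shell split something.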
---
## Quick Reference
| Scenario | Quoting |
|---|---|
| Static URL, no variables | Single quotes: `'https://...'` |
| URL with shell variable | Double quotes: `"https://...${VAR}"` |
| Output path with spaces | Single or double quotes around `-O` path |
| URL in a script variable | Assign with double quotes: `URL="https://..."`, then `wget "$URL"` |
---
## Common Symptoms of Unquoted URLs
- `bash: <partial-url>: command not found` means `&` split the command
- Download completes instantly with a tiny file (error page, not the real content)
- `wget` reports success but the file is corrupt or truncated
- `No such file or directory` errors on URL fragments
- History expansion errors (`!` in URL triggers `bash: !...: event not found`)
---
## See Also
- [Bash Scripting Patterns](../01-linux/shell-scripting/bash-scripting-patterns.md) — general shell quoting and safety patterns

View File

@@ -1,3 +1,12 @@
---
title: "yt-dlp YouTube JS Challenge Fix (Fedora)"
domain: troubleshooting
category: general
tags: [yt-dlp, fedora, youtube, javascript, deno]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# yt-dlp YouTube JS Challenge Fix (Fedora)
## Problem
@@ -78,10 +87,6 @@ sudo pip install -U yt-dlp --break-system-packages
---
## Tags
#yt-dlp #fedora #youtube #plex #self-hosted
## Known Limitations
### n-Challenge Failure: "found 0 n function possibilities"

View File

@@ -2,7 +2,8 @@
title: MajorWiki Deployment Status
status: deployed
project: MajorTwin
updated: 2026-04-07T10:48
created: 2026-04-02T16:10
---
# MajorWiki Deployment Status
@@ -31,8 +32,8 @@ DNS record and Caddy entry have been removed.
## Content
- 74 articles across 5 domains
- Source of truth: `MajorVault/30-Areas/MajorWiki/`
- Deployed via Gitea webhook (push from MajorAir → auto-pull on majorlab)
## Update Workflow
@@ -40,7 +41,7 @@ DNS record and Caddy entry have been removed.
```bash
# From MajorRig (majorlinux user)
rsync -av --include="*.md" --include="*/" --exclude="*" \
/mnt/c/Users/majli/Documents/MajorVault/30-Areas/MajorWiki/ \
root@majorlab:/opt/majwiki/docs/
# MkDocs hot-reloads automatically — no container restart needed
@@ -63,7 +64,7 @@ rsync -av --include="*.md" --include="*/" --exclude="*" \
---
*Updated 2026-03-15*
## Canonical Update Workflow
@@ -71,7 +72,7 @@ Obsidian Git plugin was evaluated and dropped — too convoluted. Manual git fro
```bash
cd ~/Documents/MajorVault
git add 30-Areas/MajorWiki/
git commit -m "wiki: describe your changes"
git push
```
@@ -102,3 +103,57 @@ Every time a new article is added, the following **MUST** be updated to maintain
- [[MajorRig|MajorRig]] — alternative git push host (WSL2 path documented)
- [[03-11-2026|Status Update 2026-03-11]] — deployment date journal entry
- [[03-13-2026|Status Update 2026-03-13]] — content expansion and SUMMARY.md sync
---
## Session Update — 2026-03-16
**Article count:** 45 (was 42)
**New articles added:**
- `01-linux/distro-specific/wsl2-rebuild-fedora43-training-env.md` — full MajorTwin training env rebuild guide
- `01-linux/distro-specific/wsl2-backup-powershell.md` — WSL2 backup via PowerShell scheduled task
- `02-selfhosting/security/ansible-unattended-upgrades-fleet.md` — standardizing unattended-upgrades across Ubuntu fleet
**SUMMARY.md:** Updated to include all 3 new articles. Run SUMMARY.md dedup script if duplicate content appears (see board file cleanup pattern).
**Updated:** `updated: 2026-03-16`
## Session Update — 2026-03-17
**Article count:** 47 (was 45)
**New articles added:**
- `05-troubleshooting/networking/windows-sshd-stops-after-reboot.md` — Windows OpenSSH sshd not starting after reboot
- `05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md` — Ollama drops off Tailscale when MajorMac sleeps
**Updated:** `updated: 2026-03-17`
## Session Update — 2026-03-18 (morning)
**Article count:** 48 (was 47)
**New articles added:**
- `02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md` — tuning docker_container_unhealthy alarm to prevent flapping during Nextcloud AIO updates
**Updated:** `updated: 2026-03-18`
## Session Update — 2026-03-18 (afternoon)
**Article count:** 49 (was 48)
**New articles added:**
- `02-selfhosting/monitoring/netdata-new-server-setup.md` — full Netdata deployment guide: install via kickstart.sh, email notification config, Netdata Cloud claim
**Updated:** `updated: 2026-03-18`
## Session Update — 2026-04-02
**Article count:** 74 (was 49)
**New article this session:**
- `02-selfhosting/security/fail2ban-wordpress-login-jail.md` — Fail2ban custom jail for WordPress login brute force (access-log-based, no plugin required)
**Also today:** Major wiki audit added 8 articles from archive, fixed 67 wikilinks, added frontmatter to 43 files, moved wiki from `20-Projects/MajorTwin/08-Wiki/` to `30-Areas/MajorWiki/`. See daily note for full details.
**Updated:** `updated: 2026-04-02`

View File

@@ -1,29 +0,0 @@
# 🌐 Network Overview
The **[[MajorInfrastructure|MajorsHouse]]** infrastructure is connected via a private **[[Network Overview#Tailscale|Tailscale]]** mesh network. This allows secure, peer-to-peer communication between devices across different geographic locations (US and UK) without exposing services to the public internet.
## 🏛️ Infrastructure Summary
- **Address Space:** 100.x.x.x (Tailscale CGNAT)
- **Management:** Centralized via **[[Network Overview#Ansible|Ansible]]** (`MajorAnsible` repo)
- **Host Groupings:** Functional (web, mail, homelab, bots), OS (Fedora, Ubuntu), and Location (US, UK).
## 🌍 Geographic Nodes
| Host | Location | IP | OS |
|---|---|---|---|
| `[[dca|dca]]` | 🇺🇸 US | 100.104.11.146 | Ubuntu 24.04 |
| `[[majortoot|majortoot]]` | 🇺🇸 US | 100.110.197.17 | Ubuntu 24.04 |
| `[[majorhome|majorhome]]` | 🇺🇸 US | 100.120.209.106 | Fedora 43 |
| `[[teelia|teelia]]` | 🇬🇧 UK | 100.120.32.69 | Ubuntu 24.04 |
## 🔗 Tailscale Setup
Tailscale is configured as a persistent service on all nodes. Key features used include:
- **Tailscale SSH:** Enabled for secure management via Ansible.
- **MagicDNS:** Used for internal hostname resolution (e.g., `majorlab.tailscale.net`).
- **ACLs:** Managed via the Tailscale admin console to restrict cross-group communication where necessary.
---
*Last updated: 2026-03-04*

View File

@@ -1,19 +1,23 @@
---
created: 2026-04-06T09:52
updated: 2026-04-07T21:59
---
# MajorLinux Tech Wiki — Index
> A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin.
>
**Last updated:** 2026-04-08
**Article count:** 74
## Domains
| Domain | Folder | Articles |
|---|---|---|
| 🐧 Linux & Sysadmin | `01-linux/` | 12 |
| 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 21 |
| 🔓 Open Source Tools | `03-opensource/` | 10 |
| 🎙️ Streaming & Podcasting | `04-streaming/` | 2 |
| 🔧 General Troubleshooting | `05-troubleshooting/` | 29 |
---
@@ -26,7 +30,7 @@
- [Managing Linux Services with systemd](01-linux/process-management/managing-linux-services-systemd-ansible.md) — systemctl, journalctl, writing service files, Ansible service management
### Networking
- [SSH Config & Key Management](01-linux/networking/ssh-config-key-management.md) — key generation, ssh-copy-id, ~/.ssh/config, managing multiple keys, Windows OpenSSH admin key auth
### Package Management
- [Package Management Reference](01-linux/packages/package-management-reference.md) — apt, dnf, pacman side-by-side reference, Flatpak/Snap
@@ -37,10 +41,13 @@
### Storage
- [SnapRAID & MergerFS Storage Setup](01-linux/storage/snapraid-mergerfs-setup.md) — Pooling mismatched drives and adding parity on Linux
- [mdadm — Rebuilding a RAID Array After Reinstall](01-linux/storage/mdadm-raid-rebuild.md) — reassembling and recovering mdadm arrays after OS reinstall
### Distro-Specific
- [Linux Distro Guide for Beginners](01-linux/distro-specific/linux-distro-guide-beginners.md) — Ubuntu recommendation, distro comparison, desktop environments
- [WSL2 Instance Migration to Fedora 43](01-linux/distro-specific/wsl2-instance-migration-fedora43.md) — moving WSL2 VHDX from C: to another drive
- [WSL2 Training Environment Rebuild (Fedora 43)](01-linux/distro-specific/wsl2-rebuild-fedora43-training-env.md) — rebuilding the MajorTwin training env in WSL2 from scratch
- [WSL2 Backup via PowerShell Scheduled Task](01-linux/distro-specific/wsl2-backup-powershell.md) — automating WSL2 exports on a schedule using PowerShell
---
@@ -50,21 +57,36 @@
- [Self-Hosting Starter Guide](02-selfhosting/docker/self-hosting-starter-guide.md) — hardware options, Docker install, first services, networking basics
- [Docker vs VMs for the Homelab](02-selfhosting/docker/docker-vs-vms-homelab.md) — when to use containers vs VMs, KVM setup, how to run both
- [Debugging Broken Docker Containers](02-selfhosting/docker/debugging-broken-docker-containers.md) — logs, inspect, exec, port conflicts, permission errors
- [Docker Healthchecks](02-selfhosting/docker/docker-healthchecks.md) — writing and debugging HEALTHCHECK instructions in Docker containers
### Reverse Proxies
- [Setting Up Caddy as a Reverse Proxy](02-selfhosting/reverse-proxy/setting-up-caddy-reverse-proxy.md) — Caddyfile basics, automatic HTTPS, local TLS, DNS challenge
### DNS & Networking
- [Tailscale for Homelab Remote Access](02-selfhosting/dns-networking/tailscale-homelab-remote-access.md) — installation, MagicDNS, making services accessible, subnet router, ACLs
- [Network Overview](02-selfhosting/dns-networking/network-overview.md) — MajorsHouse network topology, Tailscale IPs, and connectivity map
### Storage & Backup
- [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md) — flags reference, remote backup, incremental with hard links, Glacier Deep Archive
### Monitoring
- [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md) — tuning web_log_1m_redirects threshold for HTTPS-forcing servers
- [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) — preventing false alerts during nightly Nextcloud AIO container update cycles
- [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) — install, email notifications, and Netdata Cloud claim for Ubuntu/Debian servers
- [Netdata + n8n Enriched Alert Emails](02-selfhosting/monitoring/netdata-n8n-enriched-alerts.md) — rich HTML alert emails with remediation steps and wiki links via n8n
- [Netdata SELinux AVC Denial Monitoring](02-selfhosting/monitoring/netdata-selinux-avc-chart.md) — custom Netdata chart for tracking SELinux AVC denials
### Security
- [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban
- [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban, SpamAssassin
- [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md) — fleet-wide automatic security updates across Ubuntu servers
- [Fail2ban Custom Jail: Apache 404 Scanner Detection](02-selfhosting/security/fail2ban-apache-404-scanner-jail.md) — custom filter and jail for blocking 404 scanners
- [Fail2ban Custom Jail: WordPress Login Brute Force](02-selfhosting/security/fail2ban-wordpress-login-jail.md) — access-log-based wp-login.php brute force detection without plugins
- [SELinux: Fixing Fail2ban grep execmem Denial](02-selfhosting/security/selinux-fail2ban-execmem-fix.md) — resolving execmem AVC denials from Fail2ban's grep on Fedora
- [UFW Firewall Management](02-selfhosting/security/ufw-firewall-management.md) — managing UFW rules, common patterns, troubleshooting
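For flavour, the custom jails above all boil down to a stanza like this in `jail.local` (the filter name, log path, and thresholds here are illustrative placeholders, not values from the articles):

```ini
# /etc/fail2ban/jail.local -- hypothetical sketch of a custom jail
[wordpress-login]
enabled  = true
port     = http,https
filter   = wordpress-login        ; matching filter in filter.d/
logpath  = /var/log/apache2/access.log
maxretry = 5
findtime = 10m
bantime  = 1h
```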
### Services
- [Updating n8n Running in Docker](02-selfhosting/services/updating-n8n-docker.md) — pinned version updates, password reset, Arcane timing gaps
- [Mastodon Instance Tuning](02-selfhosting/services/mastodon-instance-tuning.md) — character limit increase, media cache management for self-hosted Mastodon
---
@@ -82,6 +104,7 @@
- [tmux: Persistent Terminal Sessions](03-opensource/dev-tools/tmux.md) — detachable sessions for long-running jobs over SSH
- [screen: Simple Persistent Sessions](03-opensource/dev-tools/screen.md) — lightweight terminal multiplexer, universally available
- [rsync: Fast, Resumable File Transfers](03-opensource/dev-tools/rsync.md) — incremental file sync locally and over SSH, survives interruptions
- [Ventoy: Multi-Boot USB Tool](03-opensource/dev-tools/ventoy.md) — drop ISOs on a USB drive and boot any of them, no reflashing
### Privacy & Security
- [Vaultwarden: Self-Hosted Password Manager](03-opensource/privacy-security/vaultwarden.md) — Bitwarden-compatible server in a single Docker container, passwords stay on your hardware
@@ -96,19 +119,40 @@
### OBS Studio
- [OBS Studio Setup & Encoding](04-streaming/obs/obs-studio-setup-encoding.md) — installation, NVENC/x264 settings, scene setup, audio filters, Linux Wayland notes
### Plex
- [Plex 4K Codec Compatibility (Apple TV)](04-streaming/plex/plex-4k-codec-compatibility.md) — AV1/VP9 vs HEVC, batch conversion script, yt-dlp auto-convert hook
---
## 🔧 General Troubleshooting
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md) — diagnosing and fixing Apache outages caused by missing firewall rules and Fail2ban self-bans
- [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](05-troubleshooting/networking/fail2ban-imap-self-ban-mail-client.md) — diagnosing why one device stops receiving email when the mail server is healthy
- [firewalld: Mail Ports Wiped After Reload](05-troubleshooting/networking/firewalld-mail-ports-reset.md) — recovering IMAP and webmail after firewalld reload drops all mail service rules
- [Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS](05-troubleshooting/networking/fail2ban-ufw-rule-bloat-cleanup.md) — diagnosing and cleaning up massive nftables/UFW rule accumulation
- [Tailscale SSH: Unexpected Re-Authentication Prompt](05-troubleshooting/networking/tailscale-ssh-reauth-prompt.md) — resolving unexpected re-auth prompts on Tailscale SSH connections
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) — fixing docker.socket, SELinux port blocks, and httpd_can_network_connect after reboot
- [n8n Behind Reverse Proxy: X-Forwarded-For Trust Fix](05-troubleshooting/docker/n8n-proxy-trust-x-forwarded-for.md) — fixing webhook failures caused by missing proxy trust configuration
- [Nextcloud AIO Container Unhealthy for 20 Hours](05-troubleshooting/docker/nextcloud-aio-unhealthy-20h-stuck.md) — diagnosing stuck Nextcloud AIO containers after nightly update cycles
- [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md) — troubleshooting why wiki.majorshouse.com was blocked by Google Fiber
- [Obsidian Cache Hang Recovery](05-troubleshooting/obsidian-cache-hang-recovery.md) — resolving "Loading cache" hang in Obsidian by cleaning Electron app data and ML artifacts
- [macOS Repeating Alert Tone from Mirrored Notification](05-troubleshooting/macos-mirrored-notification-alert-loop.md) — stopping alert tone loops from mirrored iPhone notifications on Mac
- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md) — fixes and alternatives when hitting VRAM limits during fine-tuning
- [yt-dlp YouTube JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md) — fixing YouTube JS challenge solver errors and missing formats on Fedora
- [Gemini CLI Manual Update](05-troubleshooting/gemini-cli-manual-update.md) — how to manually update the Gemini CLI when automatic updates fail
- [MajorWiki Setup & Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md) — setting up MajorWiki and the Obsidian → Gitea → MkDocs publishing pipeline
- [Gitea Actions Runner: Boot Race Condition Fix](05-troubleshooting/gitea-runner-boot-race-network-target.md) — fixing act_runner crash loop on boot caused by DNS not ready at startup
- [SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)](05-troubleshooting/selinux-dovecot-vmail-context.md) — fixing thousands of AVC denials when /var/vmail has wrong SELinux context
- [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md) — diagnosing and recovering a failed mdadm array caused by a USB hub dropout
- [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) — fixing sshd not running after reboot due to Manual startup type
- [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) — keeping Ollama reachable over Tailscale by disabling macOS sleep on AC power
- [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md) — fixing the missing vault_pass file error when running ansible-playbook
- [Ansible SSH Timeout During dnf upgrade](05-troubleshooting/ansible-ssh-timeout-dnf-upgrade.md) — preventing SSH timeouts during long-running dnf upgrades on Fedora
- [Fedora Networking & Kernel Troubleshooting](05-troubleshooting/fedora-networking-kernel-recovery.md) — nmcli quick fix, GRUB kernel rollback, and recovery for Fedora fleet
- [Custom Fail2ban Jail: Apache Directory Scanning](05-troubleshooting/security/apache-dirscan-fail2ban-jail.md) — blocking directory scanners and junk HTTP methods
- [ClamAV Safe Scheduling on Live Servers](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md) — preventing clamscan CPU spikes with nice and ionice
- [Systemd Session Scope Fails at Login](05-troubleshooting/systemd/session-scope-failure-at-login.md) — fixing session-cN.scope failures during login
- [wget/curl: URLs with Special Characters Fail in Bash](05-troubleshooting/wget-url-special-characters.md) — fixing broken downloads caused by unquoted URLs with &, ?, # characters
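The quoting gotcha behind that last article fits in a few lines. The token-style URL below is invented for illustration:

```shell
# '&' would background the command and '#' would start a comment if this
# URL were pasted unquoted onto a bash command line.
url='https://cdn.example.com/file.tar.gz?token=abc123&expires=1700000000#frag'

# Wrong:  wget https://cdn.example.com/file.tar.gz?token=abc123&expires=...
#         (bash splits the line at '&'; everything after '#' is ignored)
# Right:  wget "$url"   or   wget 'https://...'   (quotes keep the URL whole)
printf '%s\n' "$url"
```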
---
@@ -116,6 +160,28 @@
| Date | Article | Domain |
|---|---|---|
| 2026-04-08 | [wget/curl: URLs with Special Characters Fail in Bash](05-troubleshooting/wget-url-special-characters.md) | Troubleshooting |
| 2026-04-07 | [SSH Config & Key Management](01-linux/networking/ssh-config-key-management.md) | Linux |
| 2026-04-07 | [Windows OpenSSH: WSL Default Shell Breaks Remote Commands](05-troubleshooting/networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md) | Troubleshooting |
| 2026-04-07 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting |
| 2026-04-02 | [Fail2ban Custom Jail: WordPress Login Brute Force](02-selfhosting/security/fail2ban-wordpress-login-jail.md) | Self-Hosting |
| 2026-04-02 | [Mastodon Instance Tuning](02-selfhosting/services/mastodon-instance-tuning.md) | Self-Hosting |
| 2026-04-02 | [mdadm — Rebuilding a RAID Array After Reinstall](01-linux/storage/mdadm-raid-rebuild.md) | Linux |
| 2026-04-02 | [Fedora Networking & Kernel Troubleshooting](05-troubleshooting/fedora-networking-kernel-recovery.md) | Troubleshooting |
| 2026-04-02 | [Ventoy: Multi-Boot USB Tool](03-opensource/dev-tools/ventoy.md) | Open Source |
| 2026-03-18 | [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) | Self-Hosting |
| 2026-03-18 | [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) | Self-Hosting |
| 2026-03-17 | [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) | Troubleshooting |
| 2026-03-17 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting |
| 2026-03-16 | [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md) | Self-Hosting |
| 2026-03-16 | [WSL2 Training Environment Rebuild (Fedora 43)](01-linux/distro-specific/wsl2-rebuild-fedora43-training-env.md) | Linux |
| 2026-03-16 | [WSL2 Backup via PowerShell Scheduled Task](01-linux/distro-specific/wsl2-backup-powershell.md) | Linux |
| 2026-03-15 | [firewalld: Mail Ports Wiped After Reload](05-troubleshooting/networking/firewalld-mail-ports-reset.md) | Troubleshooting |
| 2026-03-15 | [Plex 4K Codec Compatibility (Apple TV)](04-streaming/plex/plex-4k-codec-compatibility.md) | Streaming |
| 2026-03-15 | [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md) | Troubleshooting |
| 2026-03-15 | [yt-dlp: Video Downloading](03-opensource/media-creative/yt-dlp.md) | Open Source |
| 2026-03-14 | [SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)](05-troubleshooting/selinux-dovecot-vmail-context.md) | Troubleshooting |
| 2026-03-14 | [Gitea Actions Runner: Boot Race Condition Fix](05-troubleshooting/gitea-runner-boot-race-network-target.md) | Troubleshooting |
| 2026-03-14 | [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](05-troubleshooting/networking/fail2ban-imap-self-ban-mail-client.md) | Troubleshooting |
| 2026-03-14 | [SearXNG: Private Self-Hosted Search](03-opensource/alternatives/searxng.md) | Open Source |
| 2026-03-14 | [FreshRSS: Self-Hosted RSS Reader](03-opensource/alternatives/freshrss.md) | Open Source |
@@ -142,6 +208,4 @@
---
## Related
- [[MajorWiki-Deploy-Status|MajorWiki Deploy Status]] — deployment status and git workflow
- [[01-Phases|Implementation Phases]] — Phase 9 (wiki & knowledge base)
- [[majorlab|majorlab]] — hosting server
- [MajorWiki Deploy Status](MajorWiki-Deploy-Status.md) — deployment status and update workflow

View File

@@ -1,3 +1,11 @@
---
created: 2026-04-02T16:03
updated: 2026-04-08
---
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)
* [Linux File Permissions](01-linux/files-permissions/linux-file-permissions.md)
@@ -7,17 +15,33 @@
* [Ansible Getting Started](01-linux/shell-scripting/ansible-getting-started.md)
* [Bash Scripting Patterns](01-linux/shell-scripting/bash-scripting-patterns.md)
* [SnapRAID & MergerFS Storage Setup](01-linux/storage/snapraid-mergerfs-setup.md)
* [mdadm — Rebuilding a RAID Array After Reinstall](01-linux/storage/mdadm-raid-rebuild.md)
* [Linux Distro Guide for Beginners](01-linux/distro-specific/linux-distro-guide-beginners.md)
* [WSL2 Instance Migration to Fedora 43](01-linux/distro-specific/wsl2-instance-migration-fedora43.md)
* [WSL2 Training Environment Rebuild](01-linux/distro-specific/wsl2-rebuild-fedora43-training-env.md)
* [WSL2 Backup via PowerShell](01-linux/distro-specific/wsl2-backup-powershell.md)
* [Self-Hosting & Homelab](02-selfhosting/index.md)
* [Self-Hosting Starter Guide](02-selfhosting/docker/self-hosting-starter-guide.md)
* [Docker vs VMs for the Homelab](02-selfhosting/docker/docker-vs-vms-homelab.md)
* [Debugging Broken Docker Containers](02-selfhosting/docker/debugging-broken-docker-containers.md)
* [Docker Healthchecks](02-selfhosting/docker/docker-healthchecks.md)
* [Setting Up Caddy as a Reverse Proxy](02-selfhosting/reverse-proxy/setting-up-caddy-reverse-proxy.md)
* [Tailscale for Homelab Remote Access](02-selfhosting/dns-networking/tailscale-homelab-remote-access.md)
* [Network Overview](02-selfhosting/dns-networking/network-overview.md)
* [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md)
* [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md)
* [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md)
* [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md)
* [Netdata SELinux AVC Denial Monitoring](02-selfhosting/monitoring/netdata-selinux-avc-chart.md)
* [Netdata n8n Enriched Alert Emails](02-selfhosting/monitoring/netdata-n8n-enriched-alerts.md)
* [Updating n8n Running in Docker](02-selfhosting/services/updating-n8n-docker.md)
* [Mastodon Instance Tuning](02-selfhosting/services/mastodon-instance-tuning.md)
* [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md)
* [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md)
* [Fail2ban Custom Jail: Apache 404 Scanner Detection](02-selfhosting/security/fail2ban-apache-404-scanner-jail.md)
* [Fail2ban Custom Jail: WordPress Login Brute Force](02-selfhosting/security/fail2ban-wordpress-login-jail.md)
* [SELinux: Fixing Fail2ban grep execmem Denial](02-selfhosting/security/selinux-fail2ban-execmem-fix.md)
* [UFW Firewall Management](02-selfhosting/security/ufw-firewall-management.md)
* [Open Source & Alternatives](03-opensource/index.md)
* [SearXNG: Private Self-Hosted Search](03-opensource/alternatives/searxng.md)
* [FreshRSS: Self-Hosted RSS Reader](03-opensource/alternatives/freshrss.md)
@@ -26,13 +50,21 @@
* [tmux: Persistent Terminal Sessions](03-opensource/dev-tools/tmux.md)
* [screen: Simple Persistent Sessions](03-opensource/dev-tools/screen.md)
* [rsync: Fast, Resumable File Transfers](03-opensource/dev-tools/rsync.md)
* [Ventoy: Multi-Boot USB Tool](03-opensource/dev-tools/ventoy.md)
* [Vaultwarden: Self-Hosted Password Manager](03-opensource/privacy-security/vaultwarden.md)
* [yt-dlp: Video Downloading](03-opensource/media-creative/yt-dlp.md)
* [Streaming & Podcasting](04-streaming/index.md)
* [OBS Studio Setup & Encoding](04-streaming/obs/obs-studio-setup-encoding.md)
* [Plex 4K Codec Compatibility (Apple TV)](04-streaming/plex/plex-4k-codec-compatibility.md)
* [Troubleshooting](05-troubleshooting/index.md)
* [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md)
* [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](05-troubleshooting/networking/fail2ban-imap-self-ban-mail-client.md)
* [firewalld: Mail Ports Wiped After Reload](05-troubleshooting/networking/firewalld-mail-ports-reset.md)
* [Tailscale SSH: Unexpected Re-Authentication Prompt](05-troubleshooting/networking/tailscale-ssh-reauth-prompt.md)
* [Fail2ban & UFW Rule Bloat Cleanup](05-troubleshooting/networking/fail2ban-ufw-rule-bloat-cleanup.md)
* [Custom Fail2ban Jail: Apache Directory Scanning](05-troubleshooting/security/apache-dirscan-fail2ban-jail.md)
* [Nextcloud AIO Unhealthy 20h After Nightly Update](05-troubleshooting/docker/nextcloud-aio-unhealthy-20h-stuck.md)
* [n8n Behind Reverse Proxy: X-Forwarded-For Trust Fix](05-troubleshooting/docker/n8n-proxy-trust-x-forwarded-for.md)
* [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md)
* [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md)
* [Obsidian Vault Recovery — Loading Cache Hang](05-troubleshooting/obsidian-cache-hang-recovery.md)
@@ -40,3 +72,17 @@
* [yt-dlp YouTube JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md)
* [Gemini CLI Manual Update](05-troubleshooting/gemini-cli-manual-update.md)
* [MajorWiki Setup & Publishing Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md)
* [Gitea Actions Runner: Boot Race Condition Fix](05-troubleshooting/gitea-runner-boot-race-network-target.md)
* [SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)](05-troubleshooting/selinux-dovecot-vmail-context.md)
* [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md)
* [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md)
* [Windows OpenSSH: WSL Default Shell Breaks Remote Commands](05-troubleshooting/networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md)
* [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md)
* [macOS: Repeating Alert Tone from Mirrored iPhone Notification](05-troubleshooting/macos-mirrored-notification-alert-loop.md)
* [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)
* [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md)
* [Ansible: ansible.cfg Ignored on WSL2 Windows Mounts](05-troubleshooting/ansible-wsl2-world-writable-mount-ignores-cfg.md)
* [Ansible: SSH Timeout During dnf upgrade on Fedora Hosts](05-troubleshooting/ansible-ssh-timeout-dnf-upgrade.md)
* [Fedora Networking & Kernel Troubleshooting](05-troubleshooting/fedora-networking-kernel-recovery.md)
* [Systemd Session Scope Fails at Login](05-troubleshooting/systemd/session-scope-failure-at-login.md)
* [wget/curl: URLs with Special Characters Fail in Bash](05-troubleshooting/wget-url-special-characters.md)

View File

@@ -1,19 +1,24 @@
---
created: 2026-04-06T09:52
updated: 2026-04-07T21:59
---
# MajorLinux Tech Wiki — Index
> A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin.
>
> **Last updated:** 2026-03-14
> **Article count:** 37
> **Last updated:** 2026-04-08
> **Article count:** 74
## Domains
| Domain | Folder | Articles |
|---|---|---|
| 🐧 Linux & Sysadmin | `01-linux/` | 9 |
| 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 8 |
| 🔓 Open Source Tools | `03-opensource/` | 9 |
| 🎙️ Streaming & Podcasting | `04-streaming/` | 1 |
| 🔧 General Troubleshooting | `05-troubleshooting/` | 10 |
| 🐧 Linux & Sysadmin | `01-linux/` | 12 |
| 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 21 |
| 🔓 Open Source Tools | `03-opensource/` | 10 |
| 🎙️ Streaming & Podcasting | `04-streaming/` | 2 |
| 🔧 General Troubleshooting | `05-troubleshooting/` | 29 |
---
@@ -26,7 +31,7 @@
- [Managing Linux Services with systemd](01-linux/process-management/managing-linux-services-systemd-ansible.md) — systemctl, journalctl, writing service files, Ansible service management
### Networking
- [SSH Config & Key Management](01-linux/networking/ssh-config-key-management.md) — key generation, ssh-copy-id, ~/.ssh/config, managing multiple keys
- [SSH Config & Key Management](01-linux/networking/ssh-config-key-management.md) — key generation, ssh-copy-id, ~/.ssh/config, managing multiple keys, Windows OpenSSH admin key auth
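The `~/.ssh/config` side of that article comes down to per-host stanzas like this (hostname, user, and key file are illustrative placeholders):

```
# ~/.ssh/config -- hypothetical host alias with an explicit key
Host majorlab
    HostName 100.64.0.5
    User major
    IdentityFile ~/.ssh/id_ed25519_homelab
    IdentitiesOnly yes
```

With a stanza like this in place, `ssh majorlab` picks up the address, user, and key without any flags on the command line.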
### Package Management
- [Package Management Reference](01-linux/packages/package-management-reference.md) — apt, dnf, pacman side-by-side reference, Flatpak/Snap
@@ -37,10 +42,13 @@
### Storage
- [SnapRAID & MergerFS Storage Setup](01-linux/storage/snapraid-mergerfs-setup.md) — Pooling mismatched drives and adding parity on Linux
- [mdadm — Rebuilding a RAID Array After Reinstall](01-linux/storage/mdadm-raid-rebuild.md) — reassembling and recovering mdadm arrays after OS reinstall
### Distro-Specific
- [Linux Distro Guide for Beginners](01-linux/distro-specific/linux-distro-guide-beginners.md) — Ubuntu recommendation, distro comparison, desktop environments
- [WSL2 Instance Migration to Fedora 43](01-linux/distro-specific/wsl2-instance-migration-fedora43.md) — moving WSL2 VHDX from C: to another drive
- [WSL2 Training Environment Rebuild (Fedora 43)](01-linux/distro-specific/wsl2-rebuild-fedora43-training-env.md) — rebuilding the MajorTwin training env in WSL2 from scratch
- [WSL2 Backup via PowerShell Scheduled Task](01-linux/distro-specific/wsl2-backup-powershell.md) — automating WSL2 exports on a schedule using PowerShell
---
@@ -50,21 +58,36 @@
- [Self-Hosting Starter Guide](02-selfhosting/docker/self-hosting-starter-guide.md) — hardware options, Docker install, first services, networking basics
- [Docker vs VMs for the Homelab](02-selfhosting/docker/docker-vs-vms-homelab.md) — when to use containers vs VMs, KVM setup, how to run both
- [Debugging Broken Docker Containers](02-selfhosting/docker/debugging-broken-docker-containers.md) — logs, inspect, exec, port conflicts, permission errors
- [Docker Healthchecks](02-selfhosting/docker/docker-healthchecks.md) — writing and debugging HEALTHCHECK instructions in Docker containers
### Reverse Proxies
- [Setting Up Caddy as a Reverse Proxy](02-selfhosting/reverse-proxy/setting-up-caddy-reverse-proxy.md) — Caddyfile basics, automatic HTTPS, local TLS, DNS challenge
### DNS & Networking
- [Tailscale for Homelab Remote Access](02-selfhosting/dns-networking/tailscale-homelab-remote-access.md) — installation, MagicDNS, making services accessible, subnet router, ACLs
- [Network Overview](02-selfhosting/dns-networking/network-overview.md) — MajorsHouse network topology, Tailscale IPs, and connectivity map
### Storage & Backup
- [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md) — flags reference, remote backup, incremental with hard links, cron/systemd
### Monitoring
- [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md) — tuning web_log_1m_redirects threshold for HTTPS-forcing servers
- [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) — preventing false alerts during nightly Nextcloud AIO container update cycles
- [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) — install, email notifications, and Netdata Cloud claim for Ubuntu/Debian servers
- [Netdata + n8n Enriched Alert Emails](02-selfhosting/monitoring/netdata-n8n-enriched-alerts.md) — rich HTML alert emails with remediation steps and wiki links via n8n
- [Netdata SELinux AVC Denial Monitoring](02-selfhosting/monitoring/netdata-selinux-avc-chart.md) — custom Netdata chart for tracking SELinux AVC denials
### Security
- [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban
- [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban, SpamAssassin
- [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md) — fleet-wide automatic security updates across Ubuntu servers
- [Fail2ban Custom Jail: Apache 404 Scanner Detection](02-selfhosting/security/fail2ban-apache-404-scanner-jail.md) — custom filter and jail for blocking 404 scanners
- [Fail2ban Custom Jail: WordPress Login Brute Force](02-selfhosting/security/fail2ban-wordpress-login-jail.md) — access-log-based wp-login.php brute force detection without plugins
- [SELinux: Fixing Fail2ban grep execmem Denial](02-selfhosting/security/selinux-fail2ban-execmem-fix.md) — resolving execmem AVC denials from Fail2ban's grep on Fedora
- [UFW Firewall Management](02-selfhosting/security/ufw-firewall-management.md) — managing UFW rules, common patterns, troubleshooting
### Services
- [Updating n8n Running in Docker](02-selfhosting/services/updating-n8n-docker.md) — pinned version updates, password reset, Arcane timing gaps
- [Mastodon Instance Tuning](02-selfhosting/services/mastodon-instance-tuning.md) — character limit increase, media cache management for self-hosted Mastodon
---
@@ -82,6 +105,7 @@
- [tmux: Persistent Terminal Sessions](03-opensource/dev-tools/tmux.md) — detachable sessions for long-running jobs over SSH
- [screen: Simple Persistent Sessions](03-opensource/dev-tools/screen.md) — lightweight terminal multiplexer, universally available
- [rsync: Fast, Resumable File Transfers](03-opensource/dev-tools/rsync.md) — incremental file sync locally and over SSH, survives interruptions
- [Ventoy: Multi-Boot USB Tool](03-opensource/dev-tools/ventoy.md) — drop ISOs on a USB drive and boot any of them, no reflashing
### Privacy & Security
- [Vaultwarden: Self-Hosted Password Manager](03-opensource/privacy-security/vaultwarden.md) — Bitwarden-compatible server in a single Docker container, passwords stay on your hardware
@@ -96,19 +120,43 @@
### OBS Studio
- [OBS Studio Setup & Encoding](04-streaming/obs/obs-studio-setup-encoding.md) — installation, NVENC/x264 settings, scene setup, audio filters, Linux Wayland notes
### Plex
- [Plex 4K Codec Compatibility (Apple TV)](04-streaming/plex/plex-4k-codec-compatibility.md) — AV1/VP9 vs HEVC, batch conversion script, yt-dlp auto-convert hook
---
## 🔧 General Troubleshooting
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md) — diagnosing and fixing Apache outages caused by missing firewall rules and Fail2ban self-bans
- [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](05-troubleshooting/networking/fail2ban-imap-self-ban-mail-client.md) — diagnosing why one device stops receiving email when the mail server is healthy
- [firewalld: Mail Ports Wiped After Reload](05-troubleshooting/networking/firewalld-mail-ports-reset.md) — recovering IMAP and webmail after firewalld reload drops all mail service rules
- [Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS](05-troubleshooting/networking/fail2ban-ufw-rule-bloat-cleanup.md) — diagnosing and cleaning up massive nftables/UFW rule accumulation
- [Tailscale SSH: Unexpected Re-Authentication Prompt](05-troubleshooting/networking/tailscale-ssh-reauth-prompt.md) — resolving unexpected re-auth prompts on Tailscale SSH connections
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) — fixing docker.socket, SELinux port blocks, and httpd_can_network_connect after reboot
- [n8n Behind Reverse Proxy: X-Forwarded-For Trust Fix](05-troubleshooting/docker/n8n-proxy-trust-x-forwarded-for.md) — fixing webhook failures caused by missing proxy trust configuration
- [Nextcloud AIO Container Unhealthy for 20 Hours](05-troubleshooting/docker/nextcloud-aio-unhealthy-20h-stuck.md) — diagnosing stuck Nextcloud AIO containers after nightly update cycles
- [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md) — troubleshooting why wiki.majorshouse.com was blocked by Google Fiber
- [Obsidian Cache Hang Recovery](05-troubleshooting/obsidian-cache-hang-recovery.md) — resolving "Loading cache" hang in Obsidian by cleaning Electron app data and ML artifacts
- [macOS Repeating Alert Tone from Mirrored Notification](05-troubleshooting/macos-mirrored-notification-alert-loop.md) — stopping alert tone loops from mirrored iPhone notifications on Mac
- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md) — fixes and alternatives when hitting VRAM limits during fine-tuning
- [yt-dlp YouTube JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md) — fixing YouTube JS challenge solver errors and missing formats on Fedora
- [Gemini CLI Manual Update](05-troubleshooting/gemini-cli-manual-update.md) — how to manually update the Gemini CLI when automatic updates fail
- [MajorWiki Setup & Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md) — setting up MajorWiki and the Obsidian → Gitea → MkDocs publishing pipeline
- [Gitea Actions Runner: Boot Race Condition Fix](05-troubleshooting/gitea-runner-boot-race-network-target.md) — fixing act_runner crash loop on boot caused by DNS not ready at startup
- [SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)](05-troubleshooting/selinux-dovecot-vmail-context.md) — fixing thousands of AVC denials when /var/vmail has wrong SELinux context
- [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md) — diagnosing and recovering a failed mdadm array caused by a USB hub dropout
- [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) — fixing sshd not running after reboot due to Manual startup type
- [Windows OpenSSH: WSL Default Shell Breaks Remote Commands](05-troubleshooting/networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md) — fixing remote SSH command failures when wsl.exe is the default shell
- [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) — keeping Ollama reachable over Tailscale by disabling macOS sleep on AC power
- [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md) — fixing the missing vault_pass file error when running ansible-playbook
- [Ansible: ansible.cfg Ignored on WSL2 Windows Mounts](05-troubleshooting/ansible-wsl2-world-writable-mount-ignores-cfg.md) — fixing silent config ignore due to world-writable /mnt/d/ permissions
- [Ansible SSH Timeout During dnf upgrade](05-troubleshooting/ansible-ssh-timeout-dnf-upgrade.md) — preventing SSH timeouts during long-running dnf upgrades on Fedora
- [Fedora Networking & Kernel Troubleshooting](05-troubleshooting/fedora-networking-kernel-recovery.md) — nmcli quick fix, GRUB kernel rollback, and recovery for Fedora fleet
- [Custom Fail2ban Jail: Apache Directory Scanning](05-troubleshooting/security/apache-dirscan-fail2ban-jail.md) — blocking directory scanners and junk HTTP methods
- [ClamAV Safe Scheduling on Live Servers](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md) — preventing clamscan CPU spikes with nice and ionice
- [Systemd Session Scope Fails at Login](05-troubleshooting/systemd/session-scope-failure-at-login.md) — fixing session-cN.scope failures during login
- [wget/curl: URLs with Special Characters Fail in Bash](05-troubleshooting/wget-url-special-characters.md) — fixing broken downloads caused by unquoted URLs with &, ?, # characters
---
| Date | Article | Domain |
|---|---|---|
| 2026-04-08 | [wget/curl: URLs with Special Characters Fail in Bash](05-troubleshooting/wget-url-special-characters.md) | Troubleshooting |
| 2026-04-07 | [SSH Config & Key Management](01-linux/networking/ssh-config-key-management.md) | Linux |
| 2026-04-07 | [Windows OpenSSH: WSL Default Shell Breaks Remote Commands](05-troubleshooting/networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md) | Troubleshooting |
| 2026-04-07 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting |
| 2026-04-03 | [Ansible: ansible.cfg Ignored on WSL2 Windows Mounts](05-troubleshooting/ansible-wsl2-world-writable-mount-ignores-cfg.md) | Troubleshooting |
| 2026-04-02 | [Fail2ban Custom Jail: WordPress Login Brute Force](02-selfhosting/security/fail2ban-wordpress-login-jail.md) | Self-Hosting |
| 2026-04-02 | [Mastodon Instance Tuning](02-selfhosting/services/mastodon-instance-tuning.md) | Self-Hosting |
| 2026-04-02 | [mdadm — Rebuilding a RAID Array After Reinstall](01-linux/storage/mdadm-raid-rebuild.md) | Linux |
| 2026-04-02 | [Fedora Networking & Kernel Troubleshooting](05-troubleshooting/fedora-networking-kernel-recovery.md) | Troubleshooting |
| 2026-04-02 | [Ventoy: Multi-Boot USB Tool](03-opensource/dev-tools/ventoy.md) | Open Source |
| 2026-04-02 | [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md) (updated — Glacier Deep Archive) | Self-Hosting |
| 2026-04-02 | [yt-dlp: Video Downloading](03-opensource/media-creative/yt-dlp.md) (updated — subtitles, temp fix) | Open Source |
| 2026-04-02 | [OBS Studio Setup & Encoding](04-streaming/obs/obs-studio-setup-encoding.md) (updated — captions plugin, VLC capture) | Streaming |
| 2026-04-02 | [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) (updated — SpamAssassin) | Self-Hosting |
| 2026-03-23 | [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md) | Troubleshooting |
| 2026-03-18 | [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) | Self-Hosting |
| 2026-03-18 | [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) | Self-Hosting |
| 2026-03-17 | [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) | Troubleshooting |
| 2026-03-17 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting |
| 2026-03-16 | [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md) | Self-Hosting |
| 2026-03-16 | [WSL2 Training Environment Rebuild (Fedora 43)](01-linux/distro-specific/wsl2-rebuild-fedora43-training-env.md) | Linux |
| 2026-03-16 | [WSL2 Backup via PowerShell Scheduled Task](01-linux/distro-specific/wsl2-backup-powershell.md) | Linux |
| 2026-03-15 | [firewalld: Mail Ports Wiped After Reload](05-troubleshooting/networking/firewalld-mail-ports-reset.md) | Troubleshooting |
| 2026-03-15 | [Plex 4K Codec Compatibility (Apple TV)](04-streaming/plex/plex-4k-codec-compatibility.md) | Streaming |
| 2026-03-15 | [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md) | Troubleshooting |
| 2026-03-15 | [yt-dlp: Video Downloading](03-opensource/media-creative/yt-dlp.md) | Open Source |
| 2026-03-14 | [SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)](05-troubleshooting/selinux-dovecot-vmail-context.md) | Troubleshooting |
| 2026-03-14 | [Gitea Actions Runner: Boot Race Condition Fix](05-troubleshooting/gitea-runner-boot-race-network-target.md) | Troubleshooting |
| 2026-03-14 | [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](05-troubleshooting/networking/fail2ban-imap-self-ban-mail-client.md) | Troubleshooting |
| 2026-03-14 | [SearXNG: Private Self-Hosted Search](03-opensource/alternatives/searxng.md) | Open Source |
| 2026-03-14 | [FreshRSS: Self-Hosted RSS Reader](03-opensource/alternatives/freshrss.md) | Open Source |
---
## Related
- [[MajorWiki-Deploy-Status|MajorWiki Deploy Status]] — deployment status and update workflow
- [[01-Phases|Implementation Phases]] — Phase 9 (wiki & knowledge base)
- [[majorlab|majorlab]] — hosting server (notes.majorshouse.com)