Commit graph

49 commits

Author SHA1 Message Date
631d7e8bc5 Logwatch fleet article: add Fedora CA bundle diagnosis + bounce-source guidance
Documents three lessons from the 2026-05-10 fleet outage where the
Fedora half (majorhome, majorlab) had been silently failing to send
notification mail for days:

- Missing /etc/pki/tls/certs/ca-bundle.crt symlink (extracted bundle
  exists at /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem but the
  consumer-path symlink was lost during a ca-certificates package
  event). Diagnosis includes the cross-tool tell — dnf and curl break
  with the same path. Fix is a single ln -sfn.
- Methodology: Fedora and majormail log postfix to journald; Debian and
  Ubuntu log to /var/log/mail.log. Querying the wrong source returns
  false negatives for healthy hosts.
- Bounce-source addresses (Watchtower NOTIFICATION_EMAIL_FROM,
  fail2ban sender, root@<host>.localdomain) must resolve to real
  mailboxes — otherwise the first failed delivery generates
  bounce-of-bounce churn.

Also promoting the article from untracked to committed; it had been
authored on 2026-05-09 and not yet added to the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 12:08:15 -04:00
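
The log-source point in 631d7e8bc5 can be sketched in Python as a small lookup that picks the query by distribution. This is only an illustration: the function name, the `postfix` unit name, and the `--since today` window are assumptions, not taken from the article. A host such as majormail would need its own entry.

```python
def postfix_log_query(distro):
    """Return an argv list for inspecting postfix logs on a given distro.

    Hypothetical helper: the unit name "postfix" and the time window
    are assumptions made for this sketch.
    """
    name = distro.lower()
    if name == "fedora":
        # Fedora hosts log postfix to journald.
        return ["journalctl", "-u", "postfix", "--since", "today"]
    if name in ("debian", "ubuntu"):
        # Debian and Ubuntu hosts log postfix to a flat file.
        return ["grep", "postfix", "/var/log/mail.log"]
    raise ValueError(f"no known postfix log source for {distro!r}")
```

Raising on an unknown distribution, rather than falling back to one source, avoids the false negative the commit describes.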
545df9f5c6 Add troubleshooting article: Claude Desktop MCP mass-disconnect from blocking SSH reboot
Documents the failure mode where issuing a synchronous `ssh host reboot`
through Claude Desktop's shell MCP poisons the local MCP transport when
the target severs its session before responding cleanly — eventually
force-disconnecting every MCP at once. Covers diagnostic chain, recovery,
fire-and-forget reboot patterns, and worked example from the 2026-05-10
majorhome AMD-card reboot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 01:28:11 -04:00
7c566cda50 Add: diagnosing Castopod posts that don't appear on Mastodon
Walks the four-step diagnostic chain (post created → activity delivered →
follower exists → notification semantics) for the common confusion where
a Castopod admin's auto-broadcast "doesn't show up" on a Mastodon account
they expected. Most cases are not federation bugs but the difference
between favouriting/boosting (no follow required) and following + the
fact that Mastodon notifications fire only for mentions/follows/favs/
boosts/etc., not for new posts from people you follow. Documents the bell
icon and `@`-mention escape hatches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:05:18 -04:00
1c17bdb60a Add: Castopod federation — stale cached avatar URL fix
When a remote actor updates their avatar, Mastodon (Paperclip) deletes the
old S3 object and stores only the new filename. Castopod 2.0.0 caches the
URL of every federated actor in cp_fediverse_actors and never refetches,
so its admin templates emit a dead link forever (the resulting S3 403 is
anti-enumeration, hiding what is really a 404). Article documents the
diagnosis pattern and three fixes (manual UPDATE, DELETE-and-refetch,
bulk audit), plus the Mastodon-side query for sourcing the correct URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 01:51:18 -04:00
393df3cc45 Add: tuning Netdata web_log_1m_successful for redirect-heavy WordPress
The stock alarm definition counts only 1xx/2xx/304/401/429 as successful,
which causes false CRITICALs on WP sites where 301 canonicalization is
normal traffic (legacy /?p=NNNN, slug edits, host/TLS upgrades, etc.).
Article documents the root cause, verification steps via the access log,
and an in-place threshold retune that keeps the alarm useful as an
"obvious meltdown" floor while delegating real outage detection to the
5xx and 4xx alarms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 01:12:21 -04:00
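
The false CRITICAL described in 393df3cc45 follows from simple arithmetic, sketched here in Python. The set of "successful" codes is the one quoted in the commit message; the function names and the sample traffic mix are hypothetical, and this is not Netdata's own implementation.

```python
def is_successful(status):
    """True for the codes the commit says the stock alarm counts as
    successful: 1xx, 2xx, 304, 401 and 429."""
    return 100 <= status < 300 or status in (304, 401, 429)


def success_ratio(statuses):
    """Percentage of responses counted as successful."""
    statuses = list(statuses)
    if not statuses:
        return 100.0
    ok = sum(1 for s in statuses if is_successful(s))
    return 100.0 * ok / len(statuses)


# Hypothetical one-minute sample from a redirect-heavy WordPress site:
# 40 normal pages, 55 canonicalization redirects, 5 not-found responses.
sample = [200] * 40 + [301] * 55 + [404] * 5
print(success_ratio(sample))  # 40.0
```

With 301s excluded from the successful set, a site serving mostly healthy redirects still scores low, which is why the commit retunes the threshold instead of treating the dip as an outage.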
3bcc58a805 services: add Mastodon --prune-profiles trap and recovery article
Documents the long-standing UX regression caused by
`tootctl media remove --prune-profiles` (and `--remove-headers`)
running on a schedule: cached remote avatars are deleted, but
Mastodon does not auto-refetch on profile view, so quiet remote
accounts stay broken indefinitely.

Article covers:
- The mutually-exclusive flag bug (silent skip if combined)
- Mastodon's actual avatar-refresh trigger model (Update activities,
  not profile views)
- A `refresh-my-follows.sh` pattern with a defensible WHERE clause
  (avatar NULL AND avatar_remote_url present) to avoid infinite
  retry on accounts whose origin has no avatar
- Why header_file_name IS NULL is a bad signal (~20% of users
  legitimately have no custom header)
- The cron decision: most admins should drop --prune-profiles
2026-05-07 12:01:47 -04:00
ca123b0312 wiki: add troubleshooting article — Ansible regex_search capture group fails in set_fact
Documents the gotcha hit during the 2026-05-06 update.yml refactor:
the second-positional-argument back-reference form of regex_search
('\1') doesn't reliably select capture groups when used inside
set_fact. The fix is to match the broader substring and use
.split()[0] (or [-1], etc.) to peel off the value, with a default()
bridge for the no-match case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 08:28:21 -04:00
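
The workaround in ca123b0312 (match the broader substring, then split, with a default for the no-match case) has a direct Python analogy. Jinja2 filters are not Python's `re` module, so this is a sketch of the idea only; the function name and the sample line are hypothetical.

```python
import re


def extract_value(text, pattern, index=-1, default=""):
    """Match a broad substring, then pick one whitespace-separated token.

    Mirrors the "regex_search, then .split()[n], with default()" shape
    without relying on capture-group back-references.
    """
    match = re.search(pattern, text)
    if match is None:
        # Equivalent of the default() bridge for the no-match case.
        return default
    return match.group(0).split()[index]


# Hypothetical package-manager summary line.
line = "Upgraded: 12 packages"
print(extract_value(line, r"Upgraded:\s+\d+"))  # 12
```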
ae864452f8 wiki: add Fail2Ban Digest Mode nav entry to SUMMARY.md 2026-05-02 17:17:04 -04:00
4126656c05 wiki: update fail2ban digest + netdata docker health + 3 new articles
- fail2ban-digest-mode-fleet: recidive-only email model, sshd now silent,
  defaults-debian.conf gotcha added
- netdata-docker-health-alarm-tuning: 30m/10m config, tuning history table
- New: wp-fail2ban-logpath-debian-ubuntu, lora-adapter-gguf-conversion-fails,
  tailscale-status-json-hostname-localhost-ios
- Various article updates and nav index refreshes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-02 14:58:07 -04:00
063bfa53d7 Resolve Obsidian Sync conflicts on main 2026-04-29 22:47:46 -04:00
0996861512 wiki: add troubleshooting articles from MajorTwin v8 cycle
Two articles surfaced during the v8 deploy + eval on 2026-04-25:

- Ollama: `ollama run` with piped stdin bypasses the chat template and
  SYSTEM prompt — output looks like raw base-model completion. Caught
  during initial v8 smoke test. Fix: use /api/chat HTTP endpoint.

- rsync over Tailscale can hang in TCP teardown after the data has
  fully transferred. Verify with md5sum, then kill the hung pipeline.
  Includes a watcher-threshold gotcha (set below true file size, not
  above) and prevention tips.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 12:57:39 -04:00
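
The "verify with md5sum" step in 0996861512 can be approximated in Python when `md5sum` output from the other side is already in hand. A minimal sketch, assuming the remote digest is passed in as a hex string; both function names are hypothetical.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Hex MD5 digest of a file, read in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_complete(local_path, remote_digest):
    """True when the local file's MD5 matches the digest reported remotely."""
    return md5_of(local_path) == remote_digest.strip().lower()
```

Only once the digests match is it safe to treat a hung pipeline as a teardown problem and kill it.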
599080bf91 Add wiki article: Pi-hole v6 Group Management — Per-Client DNS Rules
Covers creating groups, assigning clients, scoping allow rules to
specific groups via API and CLI. Includes ghost attribution gotcha
(router DNS proxy + secondary DNS causes FTL cache mis-attribution)
and the fix (Pi-hole as sole DNS, remove secondary).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 19:59:22 -04:00
ae563efc9e docs: add Pi-hole AI blocklist / claude.ai ERR_CONNECTION_REFUSED article
- New: 05-troubleshooting/networking/pihole-blocks-claude-desktop.md
  Covers diagnosis via FTL SQLite query log, gravity DB adlist lookup,
  fix via type-0 domainlist whitelist entry + pihole reloaddns, and
  why NULL blocking mode produces TCP refused instead of NXDOMAIN.
- Updated SUMMARY.md and 05-troubleshooting/index.md with new entry
2026-04-22 18:12:08 -04:00
46ae9ac97e Add wiki article: Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
New article covering the conversion from per-ban email alerts to a
three-tier model (silent default, sshd/recidive immediate, daily digest).
Includes Ansible automation, gotchas with lineinfile regex collisions,
and fq-hostname override for clean subjects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 09:21:49 -04:00
f9c61fbac3 wiki: publish 3 unpushed articles and catch nav up
Articles from prior sessions that were written locally but never shipped:
- 02-selfhosting/cloud/aws-s3-cost-management.md — lifecycle rules, storage class selection, bucket inventory, unexpected-growth investigation
- 02-selfhosting/dns-networking/wake-on-lan-router-ssh.md — WOL magic packets via Asus router SSH + ether-wake, Ansible vault integration
- 02-selfhosting/services/claude-code-remote-control.md — mobile access to a persistent host Claude Code session

Nav updated (index.md + SUMMARY.md):
- Added Cloud subsection under Self-Hosting for aws-s3
- Added wake-on-lan and aws-s3 entries to SUMMARY
- Added claude-code-remote-control to index's Services section
- Added ansible-ssh-host-alias-bypass nav entry (article shipped in 2dbeb22)
- Article count 87 → 89, self-hosting 30 → 32, troubleshooting 33 → 34
2026-04-21 09:17:31 -04:00
76f29a46e5 SUMMARY + index: add 10 missing articles to nav, update counts to 86 2026-04-18 18:48:31 -04:00
b40e484aae Add 5 wiki articles from 2026-04-17/18 work
- ghost-smtp-mailgun-setup: two-system email config (newsletter API + transactional SMTP)
- firewalld-fleet-hardening: Fedora fleet firewall audit-and-harden pattern with Ansible
- clamav-fleet-deployment: fleet deployment with nice/ionice throttling + quarantine
- ansible-check-mode-false-positives: when: not ansible_check_mode guard for verify/assert tasks
- ghost-emailanalytics-lag-warning: submitted status, lag counter, fetchMissing skip explained
2026-04-18 11:13:39 -04:00
d616eb2afb SUMMARY: add 4 new articles to nav (nginx/apache bad-request, SSH hardening, Watchtower relay)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:07:05 -04:00
c0837b7e89 wiki: add fail2ban jail for Apache PHP webshell probes
Documents the 2026-04-09 scanner incident where 301-redirected PHP probes
bypassed the existing apache-404scan jail, leaving the scanner unbanned
and firing Netdata web_log_1m_redirects alerts. New jail catches 301/302/
403/404 PHP responses while excluding legitimate WordPress endpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:17:24 -04:00
326c87421f wiki: add troubleshooting article on /var/run heartbeat reboot false alarm
Captures the majorlab incident where the backup watchdog emailed a missing
heartbeat after a kernel-update reboot wiped /var/run, even though the
backup had actually completed cleanly. Documents the tmpfs root cause and
the fix of storing heartbeats under /var/lib instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:11:24 -04:00
2c51e2b043 Fix merge conflict markers in SUMMARY.md frontmatter
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:25:28 -04:00
56f1014f73 Add troubleshooting article: wget/curl URLs with special characters
Covers shell quoting for URLs containing &, ?, #, and other characters
that Bash interprets as operators. Common gotcha when downloading from
CDNs with token-based URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:18:34 -04:00
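
The quoting problem in 56f1014f73 can be shown with Python's standard `shlex` module, which builds a command line that Bash reads as a single URL argument. The URL and function name are hypothetical; the `curl` flags (`-f`, `-L`, `-O`) are just one reasonable choice.

```python
import shlex


def curl_command(url):
    """Build a shell-safe curl command line for a URL with special characters."""
    # shlex.join quotes each argument only when it contains unsafe characters.
    return shlex.join(["curl", "-fLO", url])


# Hypothetical CDN URL with a token-style query string and a fragment.
url = "https://cdn.example.com/file.iso?token=abc&expires=123#frag"
print(curl_command(url))
# curl -fLO 'https://cdn.example.com/file.iso?token=abc&expires=123#frag'
```

Without the single quotes, Bash would treat `&` as a background operator and cut the URL short at that point.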
84a1893e80 wiki: fix article count to 73, update frontmatter timestamps
Corrected inflated article count (was 76, actual is 73).
Updated domain breakdown and frontmatter timestamps from Obsidian.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:51:23 -04:00
daa771760b wiki: add WSL OpenSSH default shell + Ansible world-writable mount articles
Two new troubleshooting articles from today's MajorRig/MajorMac Ansible setup:
- Windows OpenSSH WSL default shell breaks remote SSH commands
- Ansible silently ignores ansible.cfg on WSL2 world-writable mounts

Article count: 76

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:23:02 -04:00
9a7e43e67d Add wiki article: Fail2ban WordPress login brute force jail
Access-log-based filter for wp-login.php brute force detection without
requiring the WP fail2ban plugin. Documents the backend=polling gotcha
on Ubuntu 24.04 and manual banning workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:04:13 -04:00
6d81e7f020 wiki: add 4 new articles from archive, merge 8 archive notes into existing articles (73 articles)
New: mdadm RAID rebuild, Mastodon instance tuning, Ventoy, Fedora networking/kernel recovery.
Merged: Glacier Deep Archive into rsync, SpamAssassin into hardening checklist,
OBS captions/VLC capture into OBS setup, yt-dlp subtitles/temp fix into yt-dlp.
Updated index.md, README.md, SUMMARY.md with 21 previously missing articles.
Fixed merge conflict in index.md Recently Updated table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 10:55:53 -04:00
2045c090c0 wiki: add UFW firewall management article and pending articles (63 articles)
New articles: UFW firewall management, Fail2ban Apache 404 scanner jail,
SELinux Fail2ban execmem fix, updating n8n Docker, Ansible SSH timeout
during dnf upgrade, n8n proxy X-Forwarded-For fix, macOS mirrored
notification alert loop. Updated dca→dcaprod reference in network overview.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 09:49:48 -04:00
ca7ddb67f2 wiki: add SELinux fail2ban execmem fix + pending articles
New article: selinux-fail2ban-execmem-fix.md — custom policy module
for fail2ban grep execmem denial on Fedora 43.

Also includes previously uncommitted:
- n8n-proxy-trust-x-forwarded-for.md
- fail2ban-apache-404-scanner-jail.md updates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 09:51:33 -04:00
0df5ace1a2 wiki: add n8n reverse proxy X-Forwarded-For trust fix article
Documents the N8N_PROXY_HOPS env var needed for n8n behind Caddy/Nginx
when N8N_TRUST_PROXY alone is insufficient in newer versions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:48:01 -04:00
6dccc43d15 Add n8n Docker update guide
Covers version checking, pinned-tag update process, SQLite password
reset, and why Arcane may not catch updates when the latest tag lags
behind npm releases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 15:08:30 -04:00
MajorLinux ed810ebdf9 Add: macOS repeating alert tone from mirrored iPhone notification 2026-03-30 07:15:09 -04:00
1bb872ef75 Add Ansible SSH timeout troubleshooting article
Documents the SSH keepalive fix for dnf upgrade timeouts on Fedora hosts,
plus the do-agent task guard fix. Also adds Ansible & Fleet Management
section to the troubleshooting index.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:22:48 -04:00
23a35e021b wiki: add fail2ban apache 404 scanner jail article
New guide for custom access-log-based fail2ban jail that catches
rapid-fire 404 vulnerability scanners missed by default error-log jails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:22:19 -04:00
9acd083577 wiki: add fail2ban UFW rule bloat and Apache dirscan jail articles (56 articles)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 00:54:06 -04:00
cfaee5cf43 wiki: document Nextcloud AIO 20h unhealthy incident and watchdog cron fix
Add troubleshooting article for the 2026-03-27 incident where PHP-FPM
hung after the nightly update cycle. Update the Netdata Docker alarm
tuning article with the dedicated Nextcloud alarm split and the new
watchdog cron deployed to majorlab. (54 articles)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 00:52:49 -04:00
8c22ee708d merge: resolve conflicts, add SELinux AVC chart article; update indexes to 53
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 03:36:49 -04:00
fb2e3f6168 wiki: add SELinux AVC chart, enriched alerts, new server setup, and pending articles; update indexes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 03:34:33 -04:00
0e640a3fff wiki: add ClamAV safe scheduling article; update Netdata new server setup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:36:49 -04:00
c4d3f8e974 wiki: add Tailscale SSH reauth article; update Netdata Docker alarm tuning (50 articles)
- New: Tailscale SSH unexpected re-authentication prompt — diagnosis and fix
- Updated: netdata-docker-health-alarm-tuning — add delay: up 3m to suppress
  Nextcloud AIO PHP-FPM ~90s startup false alerts; update settings table and notes
- Updated: 05-troubleshooting/index.md and SUMMARY.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 00:12:52 -04:00
4d59856c1e wiki: add Netdata new server deployment guide (49 articles) 2026-03-18 11:00:41 -04:00
38fe720e63 wiki: add Netdata Docker health alarm tuning article; update indexes to 48
- 02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md — new
- lookup extended to 5m average, delay: down 5m to prevent Nextcloud AIO update flapping
- SUMMARY.md, index.md, README.md, deploy status updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 00:10:36 -04:00
59a5cc530e wiki: add Windows sshd and Ollama/Tailscale sleep articles; update indexes to 47
- 05-troubleshooting/networking/windows-sshd-stops-after-reboot.md
- 05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md
- SUMMARY.md, index.md, README.md: count 45 → 47, add 5 missing articles (3 from 2026-03-16 + 2 today)
- MajorWiki-Deploy-Status.md: session update 2026-03-17

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 21:20:15 -04:00
e8598cfac8 wiki: add WSL2 backup, Fedora43 training env, Ansible upgrades, firewalld mail ports articles; update indexes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 16:47:02 -04:00
279c094afc wiki: add firewalld mail ports reset article + session updates
- New article: firewalld mail ports wiped after reload (IMAP + webmail outage)
- New article: Plex 4K codec compatibility (Apple TV)
- New article: mdadm RAID recovery after USB hub disconnect
- Updated yt-dlp article
- Updated all index files: SUMMARY.md, index.md, README.md, category indexes
- Article count: 41 → 42

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 16:15:02 -04:00
0bcc2c822a wiki: add SELinux vmail and gitea-runner articles; update indexes
- New: SELinux Fixing Dovecot Mail Spool Context (/var/vmail)
  Corrected fix — mail_spool_t only, no dovecot_tmp_t on tmp/ dirs.
  Includes warning and recovery steps for the Postfix delivery outage.
- New: Gitea Actions Runner Boot Race Condition Fix
  network-online.target dependency, RestartSec=10, /etc/hosts workaround.
- Updated SUMMARY.md, index.md, README.md, 05-troubleshooting/index.md
- Article count: 37 → 39; MajorWiki-Deploy-Status updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 22:49:01 -04:00
1d8be8669e troubleshooting: add Fail2ban IMAP self-ban article
Documents the 2026-03-14 incident where MajorAir's public IP was banned
by the postfix-sasl jail after repeated SASL auth failures, silently
blocking all IMAP connections from Spark Desktop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 21:57:01 -04:00
ca81761cb3 docs: add Docker & Caddy SELinux post-reboot recovery runbook
Add troubleshooting article covering the three-part failure mode on
Fedora with SELinux Enforcing: docker.socket disabled, ports 4443/8448
blocked, and httpd_can_network_connect off. Update index and SUMMARY.
2026-03-12 17:58:00 -04:00
f256ecc482 docs: update SUMMARY.md with explicit subfolder cross-links for nested articles 2026-03-11 22:02:05 -04:00
6166911725 docs: add SUMMARY.md for literate-nav with wildcard section listings 2026-03-11 21:37:56 -04:00