Commit graph

23 commits

Author SHA1 Message Date
6e7a0ca21f wiki: add troubleshooting article — LoRA adapter GGUF conversion fails
Documents the gotcha where convert_hf_to_gguf.py crashes with
'config.json not found' because the training output directory holds
only the LoRA adapter, not a merged HF model. Includes inline
save_pretrained_merged() fix snippet, verification checklist, and
resume-pipeline-without-retraining pattern.

Discovered today during the MajorTwin v8c pipeline failure (Step 4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:22:59 -04:00
9de839066c wiki: add troubleshooting article — iOS Tailscale clients report HostName="localhost"
Documents the non-obvious failure mode where /etc/hosts generator scripts
using `tailscale status --json | jq '.HostName'` get poisoned by iOS
peers, which always report HostName as the literal string "localhost"
because iOS doesn't expose the device name to apps.

Includes the buggy and fixed jq filter (use .DNSName first label
instead), a real-world Postfix outage example, and a verification
checklist. Linked from troubleshooting index and SUMMARY.

Discovered while diagnosing a 24h Postfix outage on majordiscord.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:57:40 -04:00
063bfa53d7 Resolve Obsidian Sync conflicts on main 2026-04-29 22:47:46 -04:00
0996861512 wiki: add troubleshooting articles from MajorTwin v8 cycle
Two articles surfaced during the v8 deploy + eval on 2026-04-25:

- Ollama: `ollama run` with piped stdin bypasses the chat template and
  SYSTEM prompt — output looks like raw base-model completion. Caught
  during initial v8 smoke test. Fix: use /api/chat HTTP endpoint.

- rsync over Tailscale can hang in TCP teardown after the data has
  fully transferred. Verify with md5sum, then kill the hung pipeline.
  Includes a watcher-threshold gotcha (set below true file size, not
  above) and prevention tips.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 12:57:39 -04:00
ae563efc9e docs: add Pi-hole AI blocklist / claude.ai ERR_CONNECTION_REFUSED article
- New: 05-troubleshooting/networking/pihole-blocks-claude-desktop.md
  Covers diagnosis via FTL SQLite query log, gravity DB adlist lookup,
  fix via type-0 domainlist whitelist entry + pihole reloaddns, and
  why NULL blocking mode produces TCP refused instead of NXDOMAIN.
- Updated SUMMARY.md and 05-troubleshooting/index.md with new entry
2026-04-22 18:12:08 -04:00
2dbeb22ef9 wiki: add Ansible SSH Host Alias Bypass troubleshooting article
Documents why `ansible myhost -m ping` fails with Permission denied
while `ssh myhost` works — SSH Host blocks match on literal pattern,
not on resolved HostName, so `ansible_host: <IP>` bypasses the alias
and the declared IdentityFile never gets applied. Covers the portable
fix (ansible_ssh_private_key_file in host_vars), the symlink sidebar
for standardizing key names across control nodes, alternatives, and
a diagnosis checklist.

Also catches index.md up with the ansible-check-mode-false-positives
article that was already published but missing from the nav.
2026-04-21 09:15:22 -04:00
181c04bed8 wiki: add Fedora usrmerge ebtables blocker troubleshooting article
Documents the cosmetic but persistent warning during dnf upgrades:
  "/usr/sbin cannot be merged yet, /usr/sbin/ebtables points to
   /etc/alternatives/ebtables"

Stale update-alternatives symlinks (not rpm-owned) block Fedora's
/usr/sbin -> /usr/bin consolidation. Article covers root cause,
investigation steps, and the fix (tear down + re-register with
/usr/bin paths only). References the Ansible playbook
fix_ebtables_usrmerge.yml that implements this fleet-wide.

Applied 2026-04-19 across majorlab, majorhome, majormail, majordiscord.
2026-04-19 04:55:54 -04:00
b40e484aae Add 5 wiki articles from 2026-04-17/18 work
- ghost-smtp-mailgun-setup: two-system email config (newsletter API + transactional SMTP)
- firewalld-fleet-hardening: Fedora fleet firewall audit-and-harden pattern with Ansible
- clamav-fleet-deployment: fleet deployment with nice/ionice throttling + quarantine
- ansible-check-mode-false-positives: when: not ansible_check_mode guard for verify/assert tasks
- ghost-emailanalytics-lag-warning: submitted status, lag counter, fetchMissing skip explained
2026-04-18 11:13:39 -04:00
9c1a8c95d5 wiki: add claude-mem troubleshooting article for Claude Code 2.1 arg mismatch
claude-mem 12.1.3 passes --setting-sources with no value, which Claude Code
2.1.x rejects. Documents the silent summaryStored=null symptom, the real
error revealed under DEBUG logging, and the claude-shim workaround.
2026-04-17 10:21:21 -04:00
326c87421f wiki: add troubleshooting article on /var/run heartbeat reboot false alarm
Captures the majorlab incident where the backup watchdog emailed a missing
heartbeat after a kernel-update reboot wiped /var/run, even though the
backup had actually completed cleanly. Documents the tmpfs root cause and
the fix of storing heartbeats under /var/lib instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:11:24 -04:00
56f1014f73 Add troubleshooting article: wget/curl URLs with special characters
Covers shell quoting for URLs containing &, ?, #, and other characters
that Bash interprets as operators. Common gotcha when downloading from
CDNs with token-based URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:18:34 -04:00
84a1893e80 wiki: fix article count to 73, update frontmatter timestamps
Corrected inflated article count (was 76, actual is 73).
Updated domain breakdown and frontmatter timestamps from Obsidian.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:51:23 -04:00
daa771760b wiki: add WSL OpenSSH default shell + Ansible world-writable mount articles
Two new troubleshooting articles from today's MajorRig/MajorMac Ansible setup:
- Windows OpenSSH WSL default shell breaks remote SSH commands
- Ansible silently ignores ansible.cfg on WSL2 world-writable mounts

Article count: 76

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:23:02 -04:00
1bb872ef75 Add Ansible SSH timeout troubleshooting article
Documents the SSH keepalive fix for dnf upgrade timeouts on Fedora hosts,
plus the do-agent task guard fix. Also adds Ansible & Fleet Management
section to the troubleshooting index.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:22:48 -04:00
d37bd60a24 wiki: add systemd session scope failure troubleshooting article
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 11:22:44 -04:00
fb2e3f6168 wiki: add SELinux AVC chart, enriched alerts, new server setup, and pending articles; update indexes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 03:34:33 -04:00
59a5cc530e wiki: add Windows sshd and Ollama/Tailscale sleep articles; update indexes to 47
- 05-troubleshooting/networking/windows-sshd-stops-after-reboot.md
- 05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md
- SUMMARY.md, index.md, README.md: count 45 → 47, add 5 missing articles (3 from 2026-03-16 + 2 today)
- MajorWiki-Deploy-Status.md: session update 2026-03-17

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 21:20:15 -04:00
279c094afc wiki: add firewalld mail ports reset article + session updates
- New article: firewalld mail ports wiped after reload (IMAP + webmail outage)
- New article: Plex 4K codec compatibility (Apple TV)
- New article: mdadm RAID recovery after USB hub disconnect
- Updated yt-dlp article
- Updated all index files: SUMMARY.md, index.md, README.md, category indexes
- Article count: 41 → 42

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15 16:15:02 -04:00
0bcc2c822a wiki: add SELinux vmail and gitea-runner articles; update indexes
- New: SELinux Fixing Dovecot Mail Spool Context (/var/vmail)
  Corrected fix — mail_spool_t only, no dovecot_tmp_t on tmp/ dirs.
  Includes warning and recovery steps for the Postfix delivery outage.
- New: Gitea Actions Runner Boot Race Condition Fix
  network-online.target dependency, RestartSec=10, /etc/hosts workaround.
- Updated SUMMARY.md, index.md, README.md, 05-troubleshooting/index.md
- Article count: 37 → 39; MajorWiki-Deploy-Status updated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 22:49:01 -04:00
1d8be8669e troubleshooting: add Fail2ban IMAP self-ban article
Documents the 2026-03-14 incident where MajorAir's public IP was banned
by the postfix-sasl jail after repeated SASL auth failures, silently
blocking all IMAP connections from Spark Desktop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 21:57:01 -04:00
ca81761cb3 docs: add Docker & Caddy SELinux post-reboot recovery runbook
Add troubleshooting article covering the three-part failure mode on
Fedora with SELinux Enforcing: docker.socket disabled, ports 4443/8448
blocked, and httpd_can_network_connect off. Update index and SUMMARY.
2026-03-12 17:58:00 -04:00
59d25e589b vault backup: 2026-03-11 21:19:44 2026-03-11 21:19:44 -04:00
9fe9e6bac5 docs: add index.md for Troubleshooting section 2026-03-11 20:33:34 -04:00