wiki: update fail2ban digest + netdata docker health + 3 new articles
- fail2ban-digest-mode-fleet: recidive-only email model, sshd now silent, defaults-debian.conf gotcha added
- netdata-docker-health-alarm-tuning: 30m/10m config, tuning history table
- New: wp-fail2ban-logpath-debian-ubuntu, lora-adapter-gguf-conversion-fails, tailscale-status-json-hostname-localhost-ios
- Various article updates and nav index refreshes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent f40f497b46
commit 4126656c05
21 changed files with 567 additions and 35 deletions
|
|
@ -10,7 +10,7 @@ tags:
|
||||||
- majorrig
|
- majorrig
|
||||||
status: published
|
status: published
|
||||||
created: 2026-03-16
|
created: 2026-03-16
|
||||||
updated: 2026-04-29T22:45
|
updated: 2026-04-30T05:21
|
||||||
---
|
---
|
||||||
|
|
||||||
# WSL2 Backup via PowerShell Scheduled Task
|
# WSL2 Backup via PowerShell Scheduled Task
|
||||||
|
|
|
||||||
|
|
@ -10,7 +10,7 @@ tags:
|
||||||
- remote-access
|
- remote-access
|
||||||
status: published
|
status: published
|
||||||
created: 2026-03-08
|
created: 2026-03-08
|
||||||
updated: 2026-04-22T09:20
|
updated: 2026-04-30T05:21
|
||||||
---
|
---
|
||||||
|
|
||||||
# SSH Config and Key Management
|
# SSH Config and Key Management
|
||||||
|
|
|
||||||
|
|
@ -7,7 +7,7 @@ tags:
|
||||||
- asus
|
- asus
|
||||||
- ssh
|
- ssh
|
||||||
created: 2026-04-19
|
created: 2026-04-19
|
||||||
updated: 2026-04-29T22:45
|
updated: 2026-04-30T05:21
|
||||||
---
|
---
|
||||||
|
|
||||||
# Wake-on-LAN via Router SSH
|
# Wake-on-LAN via Router SSH
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
---
|
---
|
||||||
created: 2026-04-13T10:15
|
created: 2026-04-13T10:15
|
||||||
updated: 2026-04-29T22:45
|
updated: 2026-04-30T05:21
|
||||||
---
|
---
|
||||||
# 🏠 Self-Hosting & Homelab
|
# 🏠 Self-Hosting & Homelab
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,11 +1,17 @@
|
||||||
---
|
---
|
||||||
title: "Tuning Netdata Docker Health Alarms to Prevent Update Flapping"
|
title: Tuning Netdata Docker Health Alarms to Prevent Update Flapping
|
||||||
domain: selfhosting
|
domain: selfhosting
|
||||||
category: monitoring
|
category: monitoring
|
||||||
tags: [netdata, docker, nextcloud, alarms, health, monitoring]
|
tags:
|
||||||
|
- netdata
|
||||||
|
- docker
|
||||||
|
- nextcloud
|
||||||
|
- alarms
|
||||||
|
- health
|
||||||
|
- monitoring
|
||||||
status: published
|
status: published
|
||||||
created: 2026-03-18
|
created: 2026-03-18
|
||||||
updated: 2026-03-28
|
updated: 2026-05-02T11:04
|
||||||
---
|
---
|
||||||
|
|
||||||
# Tuning Netdata Docker Health Alarms to Prevent Update Flapping
|
# Tuning Netdata Docker Health Alarms to Prevent Update Flapping
|
||||||
|
|
@ -61,9 +67,9 @@ chart labels: container_name=!nextcloud-aio-nextcloud *
|
||||||
|
|
||||||
### Dedicated Nextcloud AIO Alarm
|
### Dedicated Nextcloud AIO Alarm
|
||||||
|
|
||||||
Added 2026-03-23, updated 2026-03-28. The `nextcloud-aio-nextcloud` container needs a more lenient window than other containers. Its healthcheck (`/healthcheck.sh`) verifies PostgreSQL connectivity (port 5432) and PHP-FPM (port 9000). PHP-FPM takes ~90 seconds to warm up after a normal restart — but during nightly AIO update cycles, the full startup (occ upgrade, app updates, migrations) can take 5+ minutes. On 2026-03-27, a startup hung and left the container unhealthy for 20 hours until the next nightly cycle replaced it.
|
Added 2026-03-23, updated 2026-05-02. The `nextcloud-aio-nextcloud` container needs a more lenient window than other containers. Its healthcheck (`/healthcheck.sh`) verifies PostgreSQL connectivity (port 5432) and PHP-FPM (port 9000). PHP-FPM takes ~90 seconds to warm up after a normal restart — but during nightly AIO update cycles, the full startup (occ upgrade, app updates, migrations) can take 5+ minutes. On 2026-03-27, a startup hung and left the container unhealthy for 20 hours until the next nightly cycle replaced it.
|
||||||
|
|
||||||
The dedicated alarm uses a 10-minute lookup window and 10-minute delay to absorb normal startup, while still catching sustained failures:
|
The dedicated alarm uses a 30-minute lookup window and 10-minute delay to absorb normal startup and update cycles (~40 minutes total grace), while still catching sustained failures:
|
||||||
|
|
||||||
```ini
|
```ini
|
||||||
# Dedicated alarm for nextcloud-aio-nextcloud — lenient window to absorb nightly update cycle
|
# Dedicated alarm for nextcloud-aio-nextcloud — lenient window to absorb nightly update cycle
|
||||||
|
|
@ -76,15 +82,23 @@ template: docker_nextcloud_unhealthy
|
||||||
component: Docker
|
component: Docker
|
||||||
units: status
|
units: status
|
||||||
every: 30s
|
every: 30s
|
||||||
lookup: average -10m of unhealthy
|
lookup: average -30m of unhealthy
|
||||||
chart labels: container_name=nextcloud-aio-nextcloud
|
chart labels: container_name=nextcloud-aio-nextcloud
|
||||||
warn: $this > 0
|
warn: $this >= 1
|
||||||
delay: up 10m down 5m multiplier 1.5 max 30m
|
delay: up 10m down 5m multiplier 1.5 max 30m
|
||||||
summary: Nextcloud container health sustained
|
summary: Nextcloud container health sustained
|
||||||
info: nextcloud-aio-nextcloud has been unhealthy for a sustained period — not a transient update blip
|
info: nextcloud-aio-nextcloud has been continuously unhealthy for 30+ minutes — not a transient update blip
|
||||||
to: sysadmin
|
to: sysadmin
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**Tuning history:**
|
||||||
|
|
||||||
|
| Date | Lookup | Delay | Trigger | Notes |
|
||||||
|
|---|---|---|---|---|
|
||||||
|
| 2026-03-23 | 35m | 35m | Initial split from general alarm | Absorbed PHP-FPM warm-up |
|
||||||
|
| 2026-04-29 | 15m | 5m | Backup blip (~6m) never triggered | Tightened after stability |
|
||||||
|
| 2026-05-02 | 30m | 10m | 15m still too aggressive for update cycles | ~40m total grace; catches real outages |
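
The interaction between `every: 30s`, `lookup: average -30m of unhealthy`, and `warn: $this >= 1` can be illustrated with a small simulation. This is only a sketch of the arithmetic, not Netdata's actual evaluation engine: it assumes the `unhealthy` dimension is sampled as 0 or 1 every 30 seconds, and `first_warning_index` is a hypothetical helper name. The `delay: up 10m` hold-off is applied on top of this by Netdata and is not modelled here.

```python
from collections import deque

SAMPLE_SECONDS = 30        # mirrors `every: 30s`
LOOKUP_SECONDS = 30 * 60   # mirrors `lookup: average -30m of unhealthy`


def first_warning_index(samples, window=LOOKUP_SECONDS // SAMPLE_SECONDS):
    """Return the index of the first sample where the windowed average
    reaches 1 (every sample in the window was unhealthy), or None."""
    recent = deque(maxlen=window)
    for i, value in enumerate(samples):
        recent.append(value)
        if len(recent) == window and sum(recent) / window >= 1:
            return i
    return None


# A 6-minute blip (12 samples) can never fill the 60-sample window with 1s.
blip = [0] * 60 + [1] * 12 + [0] * 60
# A sustained outage satisfies the condition once 60 consecutive 1s are seen.
outage = [0] * 60 + [1] * 120

print(first_warning_index(blip))    # None
print(first_warning_index(outage))  # 119
```

With `>= 1` the average only qualifies when every sample in the window is unhealthy, which is why short update blips fall through while a continuous 30-minute outage does not.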
|
||||||
|
|
||||||
## Watchdog Cron: Auto-Restart on Sustained Unhealthy
|
## Watchdog Cron: Auto-Restart on Sustained Unhealthy
|
||||||
|
|
||||||
If the Nextcloud container stays unhealthy for more than 1 hour (well past any normal startup window), a cron watchdog on majorlab auto-restarts it and logs the event. This was added 2026-03-28 after an incident where the container sat unhealthy for 20 hours until the next nightly backup cycle replaced it.
|
If the Nextcloud container stays unhealthy for more than 1 hour (well past any normal startup window), a cron watchdog on majorlab auto-restarts it and logs the event. This was added 2026-03-28 after an incident where the container sat unhealthy for 20 hours until the next nightly backup cycle replaced it.
|
||||||
|
|
|
||||||
|
|
@ -11,7 +11,7 @@ tags:
|
||||||
- cron
|
- cron
|
||||||
status: published
|
status: published
|
||||||
created: 2026-04-18
|
created: 2026-04-18
|
||||||
updated: 2026-04-18T11:13
|
updated: 2026-04-30T05:21
|
||||||
---
|
---
|
||||||
# ClamAV Fleet Deployment with Ansible
|
# ClamAV Fleet Deployment with Ansible
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,11 +1,18 @@
|
||||||
---
|
---
|
||||||
title: "Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts"
|
title: Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
|
||||||
domain: selfhosting
|
domain: selfhosting
|
||||||
category: security
|
category: security
|
||||||
tags: [fail2ban, security, email, ansible, fleet, cron, digest]
|
tags:
|
||||||
|
- fail2ban
|
||||||
|
- security
|
||||||
|
- email
|
||||||
|
- ansible
|
||||||
|
- fleet
|
||||||
|
- cron
|
||||||
|
- digest
|
||||||
status: published
|
status: published
|
||||||
created: 2026-04-22
|
created: 2026-04-22
|
||||||
updated: 2026-04-22
|
updated: 2026-05-02T14:56
|
||||||
---
|
---
|
||||||
# Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
|
# Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
|
||||||
|
|
||||||
|
|
@ -21,11 +28,11 @@ Three tiers replace the firehose:
|
||||||
|
|
||||||
| Tier | Jails | Action | Why |
|
| Tier | Jails | Action | Why |
|
||||||
|------|-------|--------|-----|
|
|------|-------|--------|-----|
|
||||||
| **Immediate email** | `sshd`, `recidive` | `action_mwl` | Security-critical — someone is actively targeting auth or is a repeat offender |
|
| **Immediate email** | `recidive` | `action_mwl` | Repeat offenders only — someone has been banned multiple times across jails |
|
||||||
| **Silent ban** | Everything else | `action_` (default) | Ban happens, firewall rule applied, no email sent |
|
| **Silent ban** | Everything else | `action_` (default) | Ban happens, firewall rule applied, no email sent |
|
||||||
| **Daily digest** | All jails | Cron script at 08:00 UTC | One summary email per host with ban counts across all jails |
|
| **Daily digest** | All jails | Cron script at 08:00 UTC | One summary email per host with ban counts across all jails |
|
||||||
|
|
||||||
This reduces email volume from hundreds per day to ~10 (one digest per host + occasional sshd/recidive alerts).
|
This reduces email volume from hundreds per day to ~10 (one digest per host + occasional recidive alerts).
|
||||||
|
|
||||||
## jail.local Configuration
|
## jail.local Configuration
|
||||||
|
|
||||||
|
|
@ -40,18 +47,20 @@ action = %(action_)s
|
||||||
|
|
||||||
This overrides the stock `action_mwl` for all jails. Bans still happen — the firewall rule is applied — but no email is sent.
|
This overrides the stock `action_mwl` for all jails. Bans still happen — the firewall rule is applied — but no email is sent.
|
||||||
|
|
||||||
### Keep immediate alerts for critical jails
|
### Keep immediate alerts for recidive only
|
||||||
|
|
||||||
```ini
|
```ini
|
||||||
[sshd]
|
[sshd]
|
||||||
enabled = true
|
enabled = true
|
||||||
action = %(action_mwl)s
|
action = %(action_)s
|
||||||
|
|
||||||
[recidive]
|
[recidive]
|
||||||
enabled = true
|
enabled = true
|
||||||
action = %(action_mwl)s
|
action = %(action_mwl)s
|
||||||
```
|
```
|
||||||
|
|
||||||
|
> **Updated 2026-05-02:** sshd was moved to silent (`action_`). Only recidive (repeat offenders) now triggers immediate email. sshd bans are captured in the daily digest.
|
||||||
|
|
||||||
### Clean up email subjects with fq-hostname
|
### Clean up email subjects with fq-hostname
|
||||||
|
|
||||||
By default, fail2ban uses the system FQDN in email subjects. On Tailscale hosts, this produces ugly subjects like `[Fail2Ban] sshd: banned 1.2.3.4 on MajorToot.tail7f2d9.ts.net`. Override it in `[DEFAULT]`:
|
By default, fail2ban uses the system FQDN in email subjects. On Tailscale hosts, this produces ugly subjects like `[Fail2Ban] sshd: banned 1.2.3.4 on MajorToot.tail7f2d9.ts.net`. Override it in `[DEFAULT]`:
|
||||||
|
|
@ -91,8 +100,9 @@ The playbook `configure_fail2ban_digest.yml` deploys the full digest model fleet
|
||||||
### What it does
|
### What it does
|
||||||
|
|
||||||
1. Deploys a Python helper script that performs **section-aware editing** of `jail.local` (see gotchas below)
|
1. Deploys a Python helper script that performs **section-aware editing** of `jail.local` (see gotchas below)
|
||||||
2. Sets `action = %(action_)s` in `[DEFAULT]`
|
2. Sets `action = %(action_)s` in `[DEFAULT]` and `[sshd]`
|
||||||
3. Sets `action = %(action_mwl)s` in `[sshd]` and `[recidive]`
|
3. Sets `action = %(action_mwl)s` in `[recidive]`
|
||||||
|
4. Removes stale `action = %(action_mwl)s` from `defaults-debian.conf` if present
|
||||||
4. Sets `fq-hostname` per host using an override dict
|
5. Sets `fq-hostname` per host using an override dict
|
||||||
5. Deploys the digest script from a Jinja2 template
|
6. Deploys the digest script from a Jinja2 template
|
||||||
6. Creates the cron job via `ansible.builtin.cron`
|
7. Creates the cron job via `ansible.builtin.cron`
|
||||||
|
|
@ -143,6 +153,14 @@ option 'action' in section 'DEFAULT' already exists
|
||||||
|
|
||||||
The Python editor script handles this by replacing existing keys rather than appending.
|
The Python editor script handles this by replacing existing keys rather than appending.
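
The helper script itself is not shown in this article. As a rough illustration of what "replace rather than append" can look like, here is a minimal line-based sketch. The function name `set_option` is hypothetical, and the sketch assumes single-line values (it does not handle indented continuation lines).

```python
import re


def set_option(text, section, key, value):
    """Set `key = value` inside `[section]`, replacing an existing key
    instead of appending a duplicate. Returns the edited text."""
    header = re.compile(r"^\s*\[(?P<name>[^\]]+)\]\s*$")
    option = re.compile(r"^\s*" + re.escape(key) + r"\s*=")
    out = []
    in_section = False
    found_section = False
    replaced = False
    for line in text.splitlines():
        m = header.match(line)
        if m:
            # Leaving the target section without having seen the key: add it.
            if in_section and not replaced:
                out.append(f"{key} = {value}")
                replaced = True
            in_section = m.group("name") == section
            found_section = found_section or in_section
            out.append(line)
            continue
        if in_section and option.match(line):
            if not replaced:
                out.append(f"{key} = {value}")
                replaced = True
            continue  # drop further duplicates of the key in this section
        out.append(line)
    if in_section and not replaced:
        out.append(f"{key} = {value}")
    elif not found_section:
        out.extend([f"[{section}]", f"{key} = {value}"])
    return "\n".join(out) + "\n"


sample = "[DEFAULT]\naction = %(action_mwl)s\n\n[sshd]\nenabled = true\n"
print(set_option(sample, "DEFAULT", "action", "%(action_)s"))
```

Editing line by line, rather than round-tripping through a config parser, leaves comments and the `%(...)s` interpolation strings untouched.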
|
||||||
|
|
||||||
|
### defaults-debian.conf overrides jail.local
|
||||||
|
|
||||||
|
On Debian/Ubuntu, `/etc/fail2ban/jail.d/defaults-debian.conf` is loaded **after** `jail.local`. If it contains `action = %(action_mwl)s`, it silently overrides your silent default — every jail sends email on every ban. The Ansible playbook now removes this line automatically. If you see per-ban emails after deploying digest mode, check this file first:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
grep action /etc/fail2ban/jail.d/defaults-debian.conf
|
||||||
|
```
|
||||||
|
|
||||||
### fq-hostname scope
|
### fq-hostname scope
|
||||||
|
|
||||||
Setting `fq-hostname` in `[DEFAULT]` affects all action templates that use the `<fq-hostname>` tag — including both immediate emails and the digest subject. This is the desired behavior, but be aware that it overrides the system hostname globally within fail2ban.
|
Setting `fq-hostname` in `[DEFAULT]` affects all action templates that use the `<fq-hostname>` tag — including both immediate emails and the digest subject. This is the desired behavior, but be aware that it overrides the system hostname globally within fail2ban.
|
||||||
|
|
|
||||||
151
02-selfhosting/security/wp-fail2ban-logpath-debian-ubuntu.md
Normal file
|
|
@ -0,0 +1,151 @@
|
||||||
|
---
|
||||||
|
title: "wp-fail2ban Plugin Logpath on Debian/Ubuntu (auth.log, not syslog)"
|
||||||
|
domain: selfhosting
|
||||||
|
category: security
|
||||||
|
tags: [fail2ban, wordpress, wp-fail2ban, debugging, gotcha, debian, ubuntu]
|
||||||
|
status: published
|
||||||
|
created: 2026-04-30
|
||||||
|
updated: 2026-04-30
|
||||||
|
---
|
||||||
|
# wp-fail2ban Plugin Logpath on Debian/Ubuntu (auth.log, not syslog)
|
||||||
|
|
||||||
|
## The Problem
|
||||||
|
|
||||||
|
You install the [WP fail2ban](https://wordpress.org/plugins/wp-fail2ban/) WordPress plugin, configure the fleet-standard `wordpress-hard`, `wordpress-soft`, and `wordpress-extra` jails, and… nothing. Weeks pass. `fail2ban-client status wordpress-hard` reports `Total failed: 0, Total banned: 0`. Your site is being attacked, but the jails are dead.
|
||||||
|
|
||||||
|
Meanwhile the `wordpress-login` jail (which reads Apache access logs for `POST /wp-login.php` directly) is happily catching brute-forcers. So the problem isn't fail2ban itself — it's specifically the wp-fail2ban-plugin-derived jails.
|
||||||
|
|
||||||
|
## The Cause
|
||||||
|
|
||||||
|
The wp-fail2ban plugin emits events via PHP's `syslog()` call with facility `LOG_AUTH`. On Debian/Ubuntu, rsyslog routes the `auth` facility to **`/var/log/auth.log`**, NOT `/var/log/syslog`. On RHEL/Fedora it's `/var/log/secure`.
|
||||||
|
|
||||||
|
A lot of tutorials, ansible-galaxy roles, and copy-paste config snippets specify:
|
||||||
|
|
||||||
|
```ini
|
||||||
|
logpath = /var/log/syslog
|
||||||
|
```
|
||||||
|
|
||||||
|
That's wrong on Debian/Ubuntu. The events never land there, so the filter regex has nothing to match, so the jail catches zero events forever. Silently.
|
||||||
|
|
||||||
|
## Diagnostic Steps
|
||||||
|
|
||||||
|
If a `wordpress-{hard,soft,extra}` jail shows `Total failed: 0` over a long window despite the plugin being active and the site getting attacked:
|
||||||
|
|
||||||
|
**1. Check what the jail thinks it's watching:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo fail2ban-client status wordpress-hard | grep "File list"
|
||||||
|
```
|
||||||
|
|
||||||
|
**2. Check where wp-fail2ban events actually land:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo grep -c "wordpress(" /var/log/auth.log /var/log/syslog /var/log/secure 2>/dev/null
|
||||||
|
```
|
||||||
|
|
||||||
|
You'll see something like:
|
||||||
|
|
||||||
|
```
|
||||||
|
/var/log/auth.log:314
|
||||||
|
/var/log/syslog:0
|
||||||
|
```
|
||||||
|
|
||||||
|
**3. If the jail's `File list` ≠ the file with events, fix the `logpath`.**
|
||||||
|
|
||||||
|
A real event line on Debian/Ubuntu looks like:
|
||||||
|
|
||||||
|
```
|
||||||
|
2026-04-18T23:28:21.027004-04:00 hostname wordpress(example.com)[719989]: XML-RPC authentication failure for someone from 1.2.3.4
|
||||||
|
```
|
||||||
|
|
||||||
|
The `wordpress(domain)[pid]` syslog tag is the giveaway — those are wp-fail2ban events.
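
To count these events programmatically, for example across a fleet, the tag can be matched with a regular expression. The following is a minimal sketch based only on the sample line above; `parse_wp_event` and `count_wp_events` are hypothetical helper names, and the pattern is not taken from the plugin's own filter files.

```python
import re

# Matches the `wordpress(domain)[pid]:` syslog tag shown in the sample line.
WP_TAG = re.compile(
    r"\swordpress\((?P<site>[^)]+)\)\[(?P<pid>\d+)\]:\s*(?P<message>.*)$"
)


def parse_wp_event(line):
    """Return site, pid and message for a wp-fail2ban line, else None."""
    m = WP_TAG.search(line)
    if m is None:
        return None
    return {"site": m["site"], "pid": int(m["pid"]), "message": m["message"]}


def count_wp_events(lines):
    """Count how many lines carry the wp-fail2ban syslog tag."""
    return sum(1 for line in lines if parse_wp_event(line) is not None)


sample = (
    "2026-04-18T23:28:21.027004-04:00 hostname "
    "wordpress(example.com)[719989]: "
    "XML-RPC authentication failure for someone from 1.2.3.4"
)
print(parse_wp_event(sample))
```

Feeding it the lines of each candidate log (`/var/log/auth.log`, `/var/log/syslog`, `/var/log/secure`) gives the same per-file counts as the `grep -c` check in step 2.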
|
||||||
|
|
||||||
|
## The Fix
|
||||||
|
|
||||||
|
Edit the jail blocks in `/etc/fail2ban/jail.local` (or your Ansible source for the jail) and set:
|
||||||
|
|
||||||
|
```ini
|
||||||
|
[wordpress-hard]
|
||||||
|
enabled = true
|
||||||
|
port = http,https
|
||||||
|
filter = wordpress-hard
|
||||||
|
logpath = /var/log/auth.log
|
||||||
|
maxretry = 1
|
||||||
|
findtime = 60
|
||||||
|
bantime = 30d
|
||||||
|
backend = polling
|
||||||
|
|
||||||
|
[wordpress-soft]
|
||||||
|
enabled = true
|
||||||
|
port = http,https
|
||||||
|
filter = wordpress-soft
|
||||||
|
logpath = /var/log/auth.log
|
||||||
|
maxretry = 5
|
||||||
|
findtime = 60
|
||||||
|
bantime = 30d
|
||||||
|
backend = polling
|
||||||
|
|
||||||
|
[wordpress-extra]
|
||||||
|
enabled = true
|
||||||
|
port = http,https
|
||||||
|
filter = wordpress-extra
|
||||||
|
logpath = /var/log/auth.log
|
||||||
|
maxretry = 5
|
||||||
|
findtime = 60
|
||||||
|
bantime = 30d
|
||||||
|
backend = polling
|
||||||
|
```
|
||||||
|
|
||||||
|
Then:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo fail2ban-client -t # validate
|
||||||
|
sudo fail2ban-client reload
|
||||||
|
sudo fail2ban-client status wordpress-hard | grep "File list"
|
||||||
|
# should now show /var/log/auth.log
|
||||||
|
```
|
||||||
|
|
||||||
|
## Verification
|
||||||
|
|
||||||
|
You can prove the filter regex actually matches your real events without waiting for an attack — run `fail2ban-regex` against the rotated log:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo fail2ban-regex /var/log/auth.log.1 /etc/fail2ban/filter.d/wordpress-hard.conf | grep -E "Failregex:|Lines:"
|
||||||
|
```
|
||||||
|
|
||||||
|
Healthy output looks like:
|
||||||
|
|
||||||
|
```
|
||||||
|
Failregex: 81 total
|
||||||
|
Lines: 13008 lines, 0 ignored, 81 matched, 12927 missed
|
||||||
|
```
|
||||||
|
|
||||||
|
If you see `Failregex: 0 total`, the filter regex doesn't match what the plugin actually emits — which is a different bug (filter version skew vs. plugin version), not the logpath gotcha. Investigate `/etc/fail2ban/filter.d/wordpress-{hard,soft}.conf` against actual event lines.
|
||||||
|
|
||||||
|
> **Note:** On a freshly-fixed jail, counters will sit at `Total failed: 0` for a while — the `polling` backend starts at the file's current EOF, so old events aren't retroactively counted. New events from the moment of `reload` onward will accumulate. Allow a few days of normal attack traffic before declaring the fix broken.
|
||||||
|
|
||||||
|
## Distribution Cheat Sheet
|
||||||
|
|
||||||
|
| Distro family | wp-fail2ban events land in |
|
||||||
|
|---|---|
|
||||||
|
| Debian / Ubuntu | `/var/log/auth.log` |
|
||||||
|
| RHEL / CentOS / Fedora | `/var/log/secure` |
|
||||||
|
| systemd-journal-only systems | `journalctl SYSLOG_FACILITY=4` (use `backend = systemd` + `journalmatch = SYSLOG_FACILITY=4`) |
|
||||||
|
|
||||||
|
If you have a mixed fleet, parameterize the path:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
# Ansible vars
|
||||||
|
wp_fail2ban_log_path: "{{ '/var/log/auth.log' if ansible_os_family == 'Debian' else '/var/log/secure' }}"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Why wordpress-login Is Unaffected
|
||||||
|
|
||||||
|
The `wordpress-login` jail is a different beast — it reads `/var/log/apache2/access.log` directly and matches `^<HOST> -.*"POST /wp-login.php` via the `wordpress-login` filter. No plugin involved, no syslog facility involved. So a host can have `wordpress-login` working perfectly while `wordpress-{hard,soft,extra}` are silently dead. Don't let a healthy `wordpress-login` reassure you that the rest of the wp-fail2ban stack is also fine.
|
||||||
|
|
||||||
|
## Related
|
||||||
|
|
||||||
|
- [[fail2ban-wordpress-login-jail]] — the access-log layer that catches WP brute force without any plugin dependency
|
||||||
|
- [[fail2ban-apache-bad-request-jail]]
|
||||||
|
- [[fail2ban-apache-php-probe-jail]]
|
||||||
|
- [[clamav-fleet-deployment]]
|
||||||
|
|
@ -10,7 +10,7 @@ tags:
|
||||||
- docker
|
- docker
|
||||||
status: published
|
status: published
|
||||||
created: 2026-04-02
|
created: 2026-04-02
|
||||||
updated: 2026-04-29T22:45
|
updated: 2026-04-30T05:21
|
||||||
---
|
---
|
||||||
|
|
||||||
# Mastodon Instance Tuning
|
# Mastodon Instance Tuning
|
||||||
|
|
|
||||||
|
|
@ -11,7 +11,7 @@ tags:
|
||||||
- troubleshooting
|
- troubleshooting
|
||||||
status: published
|
status: published
|
||||||
created: 2026-04-18
|
created: 2026-04-18
|
||||||
updated: 2026-04-29T22:45
|
updated: 2026-04-30T05:21
|
||||||
---
|
---
|
||||||
# Ansible Check Mode False Positives in Verify/Assert Tasks
|
# Ansible Check Mode False Positives in Verify/Assert Tasks
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,119 @@
|
||||||
|
---
|
||||||
|
title: "LoRA adapter — GGUF conversion fails with 'config.json not found'"
|
||||||
|
domain: troubleshooting
|
||||||
|
category: gpu-display
|
||||||
|
tags: [lora, qlora, gguf, llama.cpp, unsloth, fine-tuning, qwen]
|
||||||
|
status: published
|
||||||
|
created: 2026-04-30
|
||||||
|
updated: 2026-04-30
|
||||||
|
---
|
||||||
|
|
||||||
|
# LoRA adapter — GGUF conversion fails with 'config.json not found'
|
||||||
|
|
||||||
|
## Problem
|
||||||
|
|
||||||
|
After a QLoRA fine-tune, you point `llama.cpp/convert_hf_to_gguf.py` at the training output directory and it crashes immediately:
|
||||||
|
|
||||||
|
```
|
||||||
|
FileNotFoundError: [Errno 2] No such file or directory:
|
||||||
|
'/path/to/training-runs/<run>/final/config.json'
|
||||||
|
```
|
||||||
|
|
||||||
|
The output directory looks fine — it contains:
|
||||||
|
|
||||||
|
```
|
||||||
|
adapter_config.json
|
||||||
|
adapter_model.safetensors (~150 MB for a 7B base)
|
||||||
|
chat_template.jinja
|
||||||
|
tokenizer_config.json
|
||||||
|
tokenizer.json
|
||||||
|
```
|
||||||
|
|
||||||
|
But no `config.json`, and `adapter_model.safetensors` is 150 MB — way smaller than the ~14 GB you'd expect for a full Qwen2.5-7B 16-bit checkpoint.
|
||||||
|
|
||||||
|
## Root cause
|
||||||
|
|
||||||
|
`model.save_pretrained()` after a LoRA/QLoRA train saves **only the adapter weights**, not a merged full-precision model. `convert_hf_to_gguf.py` expects a full HuggingFace model directory — it reads `config.json` to identify the architecture. Adapter-only directories don't have one.
|
||||||
|
|
||||||
|
You need to merge the LoRA adapter into the base model first, then point the GGUF converter at the merged dir.
|
||||||
|
|
||||||
|
## Solution
|
||||||
|
|
||||||
|
### Quick fix — inline merge step
|
||||||
|
|
||||||
|
Insert this block between training completion and `convert_hf_to_gguf.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from unsloth import FastLanguageModel
|
||||||
|
|
||||||
|
adapter = "/path/to/training-runs/<run>/final"
|
||||||
|
merged = "/path/to/training-runs/<run>/merged"
|
||||||
|
|
||||||
|
model, tok = FastLanguageModel.from_pretrained(
|
||||||
|
    model_name=adapter,
|
||||||
|
    max_seq_length=2048,
|
||||||
|
    load_in_4bit=True,
|
||||||
|
)
|
||||||
|
model.save_pretrained_merged(merged, tok, save_method="merged_16bit")
|
||||||
|
```
|
||||||
|
|
||||||
|
Then run the GGUF converter against the **merged** dir, not the adapter dir:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
python3 llama.cpp/convert_hf_to_gguf.py /path/to/training-runs/<run>/merged \
|
||||||
|
    --outfile model-f16.gguf --outtype f16
|
||||||
|
```
|
||||||
|
|
||||||
|
The merged dir will contain `config.json`, `model-00001-of-00004.safetensors` (multiple shards totaling the full base model size), `generation_config.json`, etc.
|
||||||
|
|
||||||
|
### Cleaner fix — use a wrapper
|
||||||
|
|
||||||
|
If you do this often, encapsulate it:
|
||||||
|
|
||||||
|
1. Wrapper Python script accepts `--adapter`, `--output`, `--skip-merge`, `--all-quants`
|
||||||
|
2. Step 1: load adapter via `FastLanguageModel.from_pretrained()`, call `save_pretrained_merged()`
|
||||||
|
3. Step 2: subprocess `convert_hf_to_gguf.py` on the merged dir
|
||||||
|
4. Step 3: subprocess `llama-quantize` for each requested quant
|
||||||
|
|
||||||
|
This is what `~/corpus/scripts/convert_gguf.py` does on MajorRig (rewritten 2026-04-09 for the MajorTwin v7b cycle).
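
The real `convert_gguf.py` is not reproduced here. A minimal sketch of a wrapper with the same four flags might look like the following. The quant list, the choice of default quant, the output file names, and the helper names `plan_commands` and `merge_adapter` are all assumptions made for illustration.

```python
import argparse
import subprocess
from pathlib import Path

# Assumed quant types for --all-quants; the first is used when the flag is off.
ALL_QUANTS = ["Q4_K_M", "Q5_K_M", "Q8_0"]


def parse_args(argv=None):
    p = argparse.ArgumentParser(description="Merge a LoRA adapter, then build GGUF files")
    p.add_argument("--adapter", required=True, type=Path)
    p.add_argument("--output", required=True, type=Path)
    p.add_argument("--skip-merge", action="store_true")
    p.add_argument("--all-quants", action="store_true")
    return p.parse_args(argv)


def plan_commands(args, llama_cpp=Path("llama.cpp")):
    """Return the merged dir and the converter/quantizer command lines."""
    merged = args.output / "merged"
    f16 = args.output / "model-f16.gguf"
    commands = [[
        "python3", str(llama_cpp / "convert_hf_to_gguf.py"), str(merged),
        "--outfile", str(f16), "--outtype", "f16",
    ]]
    quants = ALL_QUANTS if args.all_quants else ALL_QUANTS[:1]
    for q in quants:
        commands.append(
            ["llama-quantize", str(f16), str(args.output / f"model-{q}.gguf"), q]
        )
    return merged, commands


def merge_adapter(adapter, merged):
    # Imported lazily so the planning code runs without the GPU stack installed.
    from unsloth import FastLanguageModel
    model, tok = FastLanguageModel.from_pretrained(
        model_name=str(adapter), max_seq_length=2048, load_in_4bit=True
    )
    model.save_pretrained_merged(str(merged), tok, save_method="merged_16bit")


def main(argv=None):
    args = parse_args(argv)
    merged, commands = plan_commands(args)
    if not args.skip_merge:
        merge_adapter(args.adapter, merged)
    for cmd in commands:
        subprocess.run(cmd, check=True)  # stop at the first failing step
```

Calling `main()` runs the whole chain. Keeping the merge inside the wrapper is the point of the design: a caller cannot reach the converter without either merging or explicitly passing `--skip-merge`.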
|
||||||
|
|
||||||
|
## Why this trips people up
|
||||||
|
|
||||||
|
- Unsloth and PEFT both save adapter-only by default after `trainer.save_model()` or `model.save_pretrained()`. There's no warning that downstream tools expect a merged model.
|
||||||
|
- The training output **looks** complete — there's a `tokenizer.json`, a `chat_template.jinja`, and a non-trivial `.safetensors`. It feels like a checkpoint.
|
||||||
|
- If a pipeline calls `convert_gguf.py` (with merge) and someone later reimplements Step 4 inline, skipping the wrapper, the merge step is silently lost. This is what happened in MajorTwin v8c (Apr 30, 2026); see [[majortwin-v8b-plan#Pipeline Bug + Fix (2026-04-30)]].
|
||||||
|
|
||||||
|
## Verification checklist
|
||||||
|
|
||||||
|
After training, before running the GGUF converter, verify the directory you're pointing at:
|
||||||
|
|
||||||
|
| File | Adapter-only dir | Merged dir |
|
||||||
|
|---|---|---|
|
||||||
|
| `adapter_config.json` | ✅ | ❌ |
|
||||||
|
| `adapter_model.safetensors` | ✅ (~150 MB / 7B) | ❌ |
|
||||||
|
| `config.json` | ❌ | ✅ |
|
||||||
|
| `model-*.safetensors` (sharded) | ❌ | ✅ (~14 GB / 7B) |
|
||||||
|
| `generation_config.json` | ❌ | ✅ |
|
||||||
|
| `tokenizer.json` | ✅ | ✅ |
|
||||||
|
|
||||||
|
If you see only the left column, you need to merge before converting.
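
The table above can be turned into a quick pre-flight check. This is a minimal sketch that only looks at the two marker files, `adapter_config.json` and `config.json`; the function name `classify_model_dir` is hypothetical.

```python
from pathlib import Path


def classify_model_dir(path):
    """Classify a directory as 'adapter-only', 'merged', or 'unknown'
    based on which marker files from the checklist are present."""
    p = Path(path)
    has_adapter = (p / "adapter_config.json").is_file()
    has_config = (p / "config.json").is_file()
    if has_config and not has_adapter:
        return "merged"
    if has_adapter and not has_config:
        return "adapter-only"
    return "unknown"
```

A pipeline could call this before invoking `convert_hf_to_gguf.py` and abort, or trigger the merge, when the result is `adapter-only`.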
|
||||||
|
|
||||||
|
## Resuming a failed pipeline without re-training
|
||||||
|
|
||||||
|
The adapter is small and self-contained. If your pipeline crashes at the GGUF step, you do NOT need to retrain — the LoRA adapter at `<run>/final/` is intact. Write a resume wrapper that runs only:
|
||||||
|
|
||||||
|
1. Merge (`save_pretrained_merged`)
|
||||||
|
2. F16 conversion (`convert_hf_to_gguf.py`)
|
||||||
|
3. Quantization (`llama-quantize`)
|
||||||
|
4. Deploy
|
||||||
|
|
||||||
|
This saves the cost of however many GPU-hours the training took. See `~/corpus/scripts/resume_v8c_step4.sh` on MajorRig for an example.
|
||||||
|
|
||||||
|
## Related
|
||||||
|
|
||||||
|
- [[qwen-14b-oom-3080ti]] — base model size choice on a 12GB GPU
|
||||||
|
- [[majortwin-v8b-plan]] — v8c pipeline architecture and resume
|
||||||
|
|
||||||
|
## Maintenance
|
||||||
|
|
||||||
|
- 2026-04-30 — Created after MajorTwin v8c pipeline failed Step 4. Root-caused, patched, resumed.
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
---
|
---
|
||||||
created: 2026-03-15T06:37
|
created: 2026-03-15T06:37
|
||||||
updated: 2026-04-29T22:45
|
updated: 2026-04-30T10:41
|
||||||
---
|
---
|
||||||
# 🔧 General Troubleshooting
|
# 🔧 General Troubleshooting
|
||||||
|
|
||||||
|
|
@ -8,12 +8,14 @@ Practical fixes for common Linux, networking, and application problems.
|
||||||
|
|
||||||
## 🖥️ GPU & AI
|
## 🖥️ GPU & AI
|
||||||
- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](gpu-display/qwen-14b-oom-3080ti.md)
|
- [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](gpu-display/qwen-14b-oom-3080ti.md)
|
||||||
|
- [LoRA adapter — GGUF conversion fails with 'config.json not found'](gpu-display/lora-adapter-gguf-conversion-fails.md)
|
||||||
|
|
||||||
## 🌐 Networking & Web
|
## 🌐 Networking & Web
|
||||||
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md)
|
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md)
|
||||||
- [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](networking/fail2ban-imap-self-ban-mail-client.md)
|
- [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](networking/fail2ban-imap-self-ban-mail-client.md)
|
||||||
- [firewalld: Mail Ports Wiped After Reload](networking/firewalld-mail-ports-reset.md)
|
- [firewalld: Mail Ports Wiped After Reload](networking/firewalld-mail-ports-reset.md)
|
||||||
- [Tailscale SSH: Unexpected Re-Authentication Prompt](networking/tailscale-ssh-reauth-prompt.md)
|
- [Tailscale SSH: Unexpected Re-Authentication Prompt](networking/tailscale-ssh-reauth-prompt.md)
|
||||||
|
- [iOS Tailscale Clients Report HostName="localhost" — Breaks /etc/hosts Generators](networking/tailscale-status-json-hostname-localhost-ios.md)
|
||||||
- [rsync over Tailscale: Hung in TCP Teardown After Transfer Completes](networking/rsync-tailscale-teardown-stall.md)
|
- [rsync over Tailscale: Hung in TCP Teardown After Transfer Completes](networking/rsync-tailscale-teardown-stall.md)
|
||||||
- [Windows OpenSSH: WSL Default Shell Breaks Remote Commands](networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md)
|
- [Windows OpenSSH: WSL Default Shell Breaks Remote Commands](networking/windows-openssh-wsl-default-shell-breaks-remote-commands.md)
|
||||||
- [Pi-hole AI Blocklist Blocks Claude Desktop (ERR_CONNECTION_REFUSED)](networking/pihole-blocks-claude-desktop.md)
|
- [Pi-hole AI Blocklist Blocks Claude Desktop (ERR_CONNECTION_REFUSED)](networking/pihole-blocks-claude-desktop.md)
|
||||||
|
|
|
||||||
|
|
@ -1,11 +1,17 @@
|
||||||
---
|
---
|
||||||
title: "ISP SNI Filtering & Caddy Troubleshooting"
|
title: ISP SNI Filtering & Caddy Troubleshooting
|
||||||
domain: troubleshooting
|
domain: troubleshooting
|
||||||
category: general
|
category: general
|
||||||
tags: [isp, sni, caddy, tls, dns, cloudflare]
|
tags:
|
||||||
|
- isp
|
||||||
|
- sni
|
||||||
|
- caddy
|
||||||
|
- tls
|
||||||
|
- dns
|
||||||
|
- cloudflare
|
||||||
status: published
|
status: published
|
||||||
created: 2026-04-02
|
created: 2026-04-02
|
||||||
updated: 2026-04-02
|
updated: 2026-04-30T13:07
|
||||||
---
|
---
|
||||||
# ISP SNI Filtering & Caddy Troubleshooting
|
# ISP SNI Filtering & Caddy Troubleshooting
|
||||||
|
|
||||||
|
|
@ -29,3 +35,89 @@ notes.majorshouse.com {
|
||||||
```
|
```
|
||||||
|
|
||||||
Once the hostname was changed to one without the "wiki" keyword, the TLS handshake completed successfully.
|
Once the hostname was changed to one without the "wiki" keyword, the TLS handshake completed successfully.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🔁 2026-04-30 Update — Stale A Record + Cloudflare Proxy Fix
|
||||||
|
|
||||||
|
The hostname rename held for ~4 weeks. On 2026-04-30 the wiki went down with a TLS handshake failure on `notes.majorshouse.com`. The on-the-spot framing was "ISP filter expanded to include 'notes'", but a Cloudflare DNS audit showed a different (and arguably worse) root cause: **the `notes` A record was pointing at `136.54.3.248`, an IP that is not majorlab's current home IP.** Whichever host responds at that address either does not run Caddy or does not know about the `notes.majorshouse.com` SNI, so the TLS handshake was rejected with alert 80 (`internal_error`).
|
||||||
|
|
||||||
|
### Re-diagnosis
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Cert + Caddy + mkdocs all healthy on majorlab
|
||||||
|
$ ssh majorlab 'systemctl is-active caddy; ss -tlnp | grep :443'
|
||||||
|
active
|
||||||
|
LISTEN 0 4096 *:443 users:(("caddy",pid=1549,fd=7))
|
||||||
|
|
||||||
|
# Loopback-served TLS works fine — cert valid Mar 11 → Jun 9 2026
|
||||||
|
$ ssh majorlab 'curl -sS -o /dev/null -w "%{http_code}\n" --resolve notes.majorshouse.com:443:127.0.0.1 https://notes.majorshouse.com/'
|
||||||
|
200
|
||||||
|
|
||||||
|
# External TLS handshake gets rejected with internal_error
|
||||||
|
$ openssl s_client -servername notes.majorshouse.com -connect 136.54.3.248:443
|
||||||
|
… SSL alert number 80 (internal_error) …
|
||||||
|
```
|
||||||
|
|
||||||
|
### The smoking-gun comparison
|
||||||
|
|
||||||
|
Other `*.majorshouse.com` services worked because they were CNAMEs to the apex, which resolves to majorlab's actual home IP:
|
||||||
|
|
||||||
|
| Subdomain | DNS shape | Final IP | Status |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `notes.majorshouse.com` | **A → `136.54.3.248`** (stale) | `136.54.3.248` (wrong host) | ❌ TLS rejected |
|
||||||
|
| `git.majorshouse.com` | CNAME → `majorshouse.com.` | `136.56.0.55` (majorlab) | ✅ |
|
||||||
|
| `n8n.majorshouse.com` | CNAME → `majorshouse.com.` | `136.56.0.55` (majorlab) | ✅ |
|
||||||
|
| `matrix.majorshouse.com` | CNAME → `majorshouse.com.` | `136.56.0.55` (majorlab) | ✅ |
|
||||||
|
|
||||||
|
None of the working subdomains were proxied through Cloudflare (`proxied=false` on all of them); they simply had the right IP. The `notes` A record was the only one pointing somewhere wrong — most likely a stale value from a prior ISP / IP change that never got cleaned up.
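The comparison in the table can be sketched as a small check that flags any subdomain whose resolved address differs from the apex. This is a minimal sketch, not part of the original diagnosis: the function names `resolve_ip` and `check_against_apex` are hypothetical, it assumes `dig` is installed, and it only makes sense for DNS-only records (a proxied record resolves to Cloudflare edge IPs and will always differ).

```bash
#!/usr/bin/env bash
# Hypothetical helper: flag subdomains whose A record differs from the apex.

# Resolve a name to its first IPv4 address (assumes `dig` is available).
resolve_ip() {
  dig +short A "$1" | grep -E '^[0-9]+(\.[0-9]+){3}$' | head -n 1
}

# Pure comparison so it can be exercised without DNS:
# prints OK or STALE for one name, returns non-zero on mismatch.
check_against_apex() {
  local name="$1" ip="$2" apex_ip="$3"
  if [ "$ip" = "$apex_ip" ]; then
    printf 'OK     %s -> %s\n' "$name" "$ip"
  else
    printf 'STALE  %s -> %s (apex is %s)\n' "$name" "$ip" "$apex_ip"
    return 1
  fi
}

# Example loop (commented out so sourcing the file makes no DNS queries):
# apex_ip="$(resolve_ip majorshouse.com)"
# for sub in notes git n8n matrix; do
#   check_against_apex "$sub.majorshouse.com" "$(resolve_ip "$sub.majorshouse.com")" "$apex_ip"
# done
```

With the values from the table, `notes` would be reported as `STALE` and the three CNAME-backed names as `OK`.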

### ✅ Fix — switch `notes` to a Cloudflare-proxied CNAME

Rather than just correcting the A record (which would silently break again the next time the home IP changes), the fix is a CNAME to the apex with proxy on. That gives two protections in one move: it always tracks the apex (so home IP changes propagate automatically) and it puts the wiki behind Cloudflare's edge (so any future ISP-side weirdness like the original `wiki` SNI filter is also bypassed).

```bash
# via Cloudflare API (token from ansible-vault: vault_cloudflare_api_token)
PUT /zones/{ZONE_ID}/dns_records/{NOTES_RECORD_ID}
{
  "type": "CNAME",
  "name": "notes.majorshouse.com",
  "content": "majorshouse.com",
  "ttl": 1,
  "proxied": true,
  "comment": "switched A→CNAME proxied to bypass stale IP / ISP SNI filter"
}
```
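The request above is written as an HTTP verb plus a JSON body. A minimal sketch of sending it with `curl` follows, assuming the Cloudflare v4 API base URL with bearer-token auth and that `ZONE_ID`, `NOTES_RECORD_ID`, and `CF_API_TOKEN` are set in the environment. Those variable names and both function names are placeholders, not taken from the original.

```bash
#!/usr/bin/env bash
# Sketch only: variable and function names are illustrative placeholders.

# Emit the JSON body for the proxied CNAME record.
notes_record_payload() {
  cat <<'JSON'
{
  "type": "CNAME",
  "name": "notes.majorshouse.com",
  "content": "majorshouse.com",
  "ttl": 1,
  "proxied": true,
  "comment": "switched A→CNAME proxied to bypass stale IP / ISP SNI filter"
}
JSON
}

# Overwrite the existing record with the payload above.
update_notes_record() {
  if [ -z "${ZONE_ID:-}" ] || [ -z "${NOTES_RECORD_ID:-}" ] || [ -z "${CF_API_TOKEN:-}" ]; then
    echo "ZONE_ID, NOTES_RECORD_ID and CF_API_TOKEN must be set" >&2
    return 1
  fi
  notes_record_payload | curl -sS -X PUT \
    "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records/${NOTES_RECORD_ID}" \
    -H "Authorization: Bearer ${CF_API_TOKEN}" \
    -H "Content-Type: application/json" \
    --data @-
}
```

Calling `update_notes_record` sends the request; the API's JSON response carries a `success` boolean that is worth checking before moving on to verification.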

Or via the dashboard:

1. Cloudflare → `majorshouse.com` zone → DNS → Records
2. Edit the `notes` record: Type `CNAME`, Target `majorshouse.com`, Proxy `Proxied` (orange cloud)
3. Save

External clients now hit Cloudflare edge IPs (`104.21.x.x` / `172.67.x.x`), which terminate TLS at the edge and proxy requests back to majorlab at the apex IP. ACME on majorlab keeps working because Cloudflare passes the HTTP-01 challenge through on port 80. Caddy's `notes.majorshouse.com {}` block needs no change.

Verify (response should show `server: cloudflare` and `via: 1.0 Caddy`):

```bash
curl -sSI https://notes.majorshouse.com/
```
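The same verification can be scripted so it fails loudly when either header is missing. A minimal sketch, assuming the two headers named above are the signals to look for; the function name `fronted_by_cloudflare` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical check: read HTTP response headers on stdin and confirm the
# response came through Cloudflare and was served by Caddy behind it.
fronted_by_cloudflare() {
  local headers
  headers="$(tr -d '\r')"   # strip the CRs from HTTP line endings
  printf '%s\n' "$headers" | grep -qiE '^server:[[:space:]]*cloudflare' || return 1
  printf '%s\n' "$headers" | grep -qiE '^via:.*caddy' || return 1
}

# Usage (commented out so sourcing the file makes no network request):
# curl -sSI https://notes.majorshouse.com/ | fronted_by_cloudflare && echo "proxied OK"
```

The match is case-insensitive because header names and values may be cased differently depending on the HTTP version in use.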

### Why a Cloudflare-proxied CNAME is the durable shape

- **Apex follows the home IP automatically.** Update the apex A record once when the ISP changes; every subdomain inherits it without per-record fixes.
- **TLS handshake is offloaded to CF.** Any ISP-level SNI weirdness (the original `wiki` ban; theoretical future bans) becomes irrelevant: external clients send SNI `notes.majorshouse.com` to Cloudflare, which the ISP doesn't filter.
- **Free.** Cloudflare's free tier covers proxy + TLS termination.

### Audit checklist for any home-hosted `*.majorshouse.com` subdomain

- [ ] DNS record is a **CNAME** to `majorshouse.com.`, not an A record to a literal home IP.
- [ ] Cloudflare proxy (orange cloud, `proxied=true`) enabled on the record — at minimum for any subdomain where TLS reachability matters.
- [ ] Caddy entry on majorlab references the public hostname; `reverse_proxy` stays on the localhost port.
- [ ] HTTPS verified from outside the LAN (phone on cellular is sufficient) within the first hour after the change.
- [ ] If an A record is genuinely required (e.g. it must NOT go through CF), document why in the deploy notes for that service.

### Related

- [[majwiki-setup-and-pipeline]] — full wiki deploy pipeline; the DNS step there should reference this fix
- [[Network-Overview]] — fleet IP table

@ -0,0 +1,116 @@
---
title: iOS Tailscale Clients Report HostName="localhost" — Breaks /etc/hosts Generators
domain: troubleshooting
category: networking
tags:
- tailscale
- ios
- postfix
- etc-hosts
- jq
status: published
created: 2026-04-29
updated: 2026-04-29
---

# iOS Tailscale Clients Report HostName="localhost" — Breaks /etc/hosts Generators

## Problem

A homegrown script that builds an `/etc/hosts` block from `tailscale status --json` silently corrupted the file the moment any iOS device joined the tailnet. After the next run, services bound to `localhost` started failing.

On the affected host (`majordiscord`), Postfix refused to start with:

```
postfix: fatal: parameter inet_interfaces: no local interface found for 100.127.114.10
```

`/etc/hosts` looked fine at the top — `127.0.0.1 localhost` was still present — but inside the Tailscale-managed block:

```
# TAILSCALE_START
100.84.42.102 tttpod
100.110.197.17 majortoot
100.95.55.40 localhost <-- WRONG (this is an iPhone)
100.84.165.52 majormail
...
100.127.114.10 localhost <-- WRONG (this is an iPad)
# TAILSCALE_END
```

When Postfix resolved `localhost` (because `inet_interfaces = localhost` in `main.cf`), the **last matching entry** in `/etc/hosts` won — a Tailscale IP that doesn't exist on this host — and the daemon died on bind.

## Root Cause

The script used `.HostName` from the Tailscale JSON:

```bash
tailscale status --json \
  | jq -r '.Peer[] | "\(.TailscaleIPs[0]) \(.HostName)"' \
  >> "$TEMP_HOSTS"
```

iOS Tailscale clients (iPhone, iPad) **always report `HostName: "localhost"`** in the JSON. iOS doesn't expose the real device name to apps the way macOS/Linux/Windows do, so the Tailscale client falls back to the literal string `localhost`.

Inspect it directly:

```bash
$ tailscale status --json | jq '.Peer[] | select(.OS == "iOS") | {DNSName, HostName, OS}'
{
  "DNSName": "iphone171.tail7f2d9.ts.net.",
  "HostName": "localhost",
  "OS": "iOS"
}
{
  "DNSName": "ipad166.tail7f2d9.ts.net.",
  "HostName": "localhost",
  "OS": "iOS"
}
```

Every iOS device contributes a line `<tailscale-ip> localhost` to `/etc/hosts`, hijacking the `localhost` lookup.

## Fix

Instead of `.HostName`, use `.DNSName` (the unique tailnet DNS name) and take its first dotted component:

```bash
tailscale status --json \
  | jq -r '.Peer[] | "\(.TailscaleIPs[0]) \(.DNSName | rtrimstr(".") | split(".")[0])"' \
  >> "$TEMP_HOSTS"
```

`DNSName` is always set, always unique, and produces clean labels like `iphone171`, `ipad166`, `majorlab`, etc.

After patching the script and re-running it:

```bash
$ bash /root/update_tailscale_hosts.sh
$ systemctl restart postfix
$ systemctl is-active postfix
active
```
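As a further guard, the generated lines can be filtered before they are appended, so a reserved name can never reach `/etc/hosts` even if a field comes back unexpected. This is a sketch under stated assumptions, not the script's actual code: `sanitize_hosts_lines` is a hypothetical name, and the reserved-name list (including the literal `null` that `jq` prints when a field is missing) is an illustrative choice.

```bash
#!/usr/bin/env bash
# Hypothetical guard: filter "IP name" lines before they reach /etc/hosts.
# Drops lines without a name, reserved names, and repeated names.
sanitize_hosts_lines() {
  awk '
    NF < 2 { next }                                            # no hostname at all
    tolower($2) ~ /^(localhost|localhost\.localdomain|null)$/ { next }
    seen[tolower($2)]++ { next }                               # first owner of a name wins
    { print $1, $2 }
  '
}

# Usage in the generator (commented out; needs tailscale and jq):
# tailscale status --json \
#   | jq -r '.Peer[] | "\(.TailscaleIPs[0]) \(.DNSName | rtrimstr(".") | split(".")[0])"' \
#   | sanitize_hosts_lines >> "$TEMP_HOSTS"
```

Fed the corrupted block from the Problem section, the filter keeps the named peers and silently drops both `localhost` lines.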

## Why It's Hard to Spot

- The corruption only triggers when an iOS device is in the tailnet — so the script "worked" for months.
- `/etc/hosts` files are commonly skimmed top-down. The bogus `localhost` line is buried in the Tailscale block, well below the legitimate `127.0.0.1 localhost` line, and looks superficially like a normal Tailscale entry.
- Postfix's error message names the IP, not `localhost`, so the connection to `/etc/hosts` isn't obvious.
- `getent hosts localhost` shows only the *first* match (`127.0.0.1`), not the one Postfix's resolver actually picks for the `inet_interfaces` lookup. `getent ahosts localhost`, which resolves through `getaddrinfo`, should list the extra Tailscale addresses as well.

## Verification Checklist

If you suspect this on any host using a similar generator script:

```bash
# Any non-loopback "localhost" entries are bugs
grep -nE '^[0-9]+\..*[[:space:]]localhost[[:space:]]*$' /etc/hosts | grep -vE '^[0-9]+:127\.'

# Look at iOS peers' HostName field
tailscale status --json | jq '.Peer[] | select(.OS == "iOS") | .HostName'
```
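The first check can also be written so it catches `localhost` appearing as a later alias on a line, not only at the end of it. A minimal sketch, assuming the usual hosts-file layout of an address followed by one or more names; the function name `find_bogus_localhost` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical check: list hosts-file lines that map a non-loopback address
# to "localhost", whether it is the first name or a later alias.
find_bogus_localhost() {
  local hosts_file="${1:-/etc/hosts}"
  awk '
    /^[ \t]*#/ { next }                       # skip comment lines
    { sub(/#.*/, "") }                        # drop trailing comments
    $1 ~ /^127\./ || $1 == "::1" { next }     # loopback entries are fine
    {
      for (i = 2; i <= NF; i++)
        if ($i == "localhost") { print FILENAME ":" FNR ": " $0; break }
    }
  ' "$hosts_file"
}
```

Running `find_bogus_localhost` with no argument scans `/etc/hosts`; any output is a line that needs fixing at the generator, not by hand.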

## Related

- [[majordiscord]] — affected host (incident logged 2026-04-29)
- [[Network Overview]] — Tailscale fleet topology

@ -11,7 +11,7 @@ tags:
- powershell
status: published
created: 2026-04-03
updated: 2026-04-22T09:20
updated: 2026-04-30T05:21
---

# Windows OpenSSH: WSL as Default Shell Breaks Remote Commands

@ -10,7 +10,7 @@ tags:
- majorrig
status: published
created: 2026-04-02
updated: 2026-04-22T09:20
updated: 2026-04-30T05:21
---
# Windows OpenSSH Server (sshd) Stops After Reboot

@ -10,7 +10,7 @@ tags:
- deno
status: published
created: 2026-04-02
updated: 2026-04-22T11:33
updated: 2026-04-30T05:21
---
# yt-dlp YouTube JS Challenge Fix (Fedora)

@ -2,7 +2,7 @@
title: MajorWiki Deployment Status
status: deployed
project: MajorTwin
updated: 2026-04-07T10:48
updated: 2026-04-30T05:30
created: 2026-04-02T16:10
---

@ -79,6 +79,23 @@ git push

Gitea receives the push → fires webhook → majorlab pulls → MkDocs rebuilds → `notes.majorshouse.com` updates automatically.

> [!tip] One-liner wrapper
> On MajorRig, the `~/bin/wiki-commit "msg"` helper runs `git pull --rebase --autostash` → `git add -A` → `git commit` → `git push` in one shot. Sidesteps fast-forward rejections from cowork pushes (e.g. MajorAir pushing in parallel) and the empty-credentials issue with HTTPS.

## 🔒 Pre-Commit Hook (in repo)

`.githooks/pre-commit` (tracked) blocks any commit that adds or renames a `*.md` article without a corresponding entry in `SUMMARY.md`. Bypass with `git commit --no-verify` if you genuinely need to.

**Per-clone setup** (one-time, per workstation that uses the repo):

```bash
cd <wiki-repo>
git config core.hooksPath .githooks
git config pull.rebase true
```

The `hooksPath` line is required because git doesn't run hooks from a tracked directory by default. The `pull.rebase true` setting makes plain `git pull` always rebase locally, matching the `wiki-commit` wrapper's behavior.
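For orientation, the kind of check such a hook performs can be sketched as below. This is not the tracked `.githooks/pre-commit`; the function name `missing_from_summary` is hypothetical, and it assumes `SUMMARY.md` links articles as `(path/to/article.md)` using the same relative paths that git reports.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a SUMMARY.md guard; the tracked hook may differ.

# Read candidate paths on stdin; print each *.md path that the given summary
# file does not link to. Returns non-zero if anything is missing.
missing_from_summary() {
  local summary="$1" path missing=0
  while IFS= read -r path; do
    case "$path" in
      *.md) ;;
      *) continue ;;
    esac
    [ "$path" = "SUMMARY.md" ] && continue
    if ! grep -qF -- "($path)" "$summary"; then
      printf 'not in SUMMARY.md: %s\n' "$path"
      missing=1
    fi
  done
  return "$missing"
}

# Inside a pre-commit hook this could be driven by staged adds and renames:
# git diff --cached --name-only --diff-filter=AR | missing_from_summary SUMMARY.md || exit 1
```

A non-zero exit from the hook is what makes git abort the commit, which is also why `--no-verify` bypasses it.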

## 📋 Wiki Maintenance Protocol

Every time a new article is added, the following **MUST** be updated to maintain index integrity:

@ -1,6 +1,6 @@
---
created: 2026-04-06T09:52
updated: 2026-04-29T22:46
updated: 2026-04-30T05:21
---
# MajorLinux Tech Wiki — Index

@ -1,6 +1,6 @@
---
created: 2026-04-02T16:03
updated: 2026-04-29T22:45
updated: 2026-04-30T11:24
---
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)
@ -43,6 +43,7 @@ updated: 2026-04-29T22:45
* [Fail2ban Custom Jail: Apache 404 Scanner Detection](02-selfhosting/security/fail2ban-apache-404-scanner-jail.md)
* [Fail2ban Custom Jail: Apache PHP Webshell Probe Detection](02-selfhosting/security/fail2ban-apache-php-probe-jail.md)
* [Fail2ban Custom Jail: WordPress Login Brute Force](02-selfhosting/security/fail2ban-wordpress-login-jail.md)
* [wp-fail2ban Plugin Logpath on Debian/Ubuntu (auth.log not syslog)](02-selfhosting/security/wp-fail2ban-logpath-debian-ubuntu.md)
* [SELinux: Fixing Fail2ban grep execmem Denial](02-selfhosting/security/selinux-fail2ban-execmem-fix.md)
* [UFW Firewall Management](02-selfhosting/security/ufw-firewall-management.md)
* [Firewall Hardening with firewalld on Fedora Fleet](02-selfhosting/security/firewalld-fleet-hardening.md)
@ -77,6 +78,7 @@ updated: 2026-04-29T22:45
* [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md)
* [Obsidian Vault Recovery — Loading Cache Hang](05-troubleshooting/obsidian-cache-hang-recovery.md)
* [Qwen2.5-14B OOM on RTX 3080 Ti (12GB)](05-troubleshooting/gpu-display/qwen-14b-oom-3080ti.md)
* [LoRA adapter — GGUF conversion fails with 'config.json not found'](05-troubleshooting/gpu-display/lora-adapter-gguf-conversion-fails.md)
* [yt-dlp YouTube JS Challenge Fix on Fedora](05-troubleshooting/yt-dlp-fedora-js-challenge.md)
* [Gemini CLI Manual Update](05-troubleshooting/gemini-cli-manual-update.md)
* [MajorWiki Setup & Publishing Pipeline](05-troubleshooting/majwiki-setup-and-pipeline.md)
@ -90,6 +92,7 @@ updated: 2026-04-29T22:45
* [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md)
* [Ollama: `ollama run` with Piped Stdin Bypasses Chat Template + SYSTEM Prompt](05-troubleshooting/ollama-chat-template-pipe-stdin-bypass.md)
* [rsync over Tailscale: Hung in TCP Teardown After Transfer Completes](05-troubleshooting/networking/rsync-tailscale-teardown-stall.md)
* [iOS Tailscale Clients Report HostName="localhost" — Breaks /etc/hosts Generators](05-troubleshooting/networking/tailscale-status-json-hostname-localhost-ios.md)
* [macOS: Repeating Alert Tone from Mirrored iPhone Notification](05-troubleshooting/macos-mirrored-notification-alert-loop.md)
* [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)
* [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md)

2
index.md
@ -1,6 +1,6 @@
---
created: 2026-04-06T09:52
updated: 2026-04-29T22:45
updated: 2026-04-30T05:21
---
# MajorLinux Tech Wiki — Index
