vault backup: 2026-03-13 01:31:25
135
05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md
Normal file
@@ -0,0 +1,135 @@
# Docker & Caddy Recovery After Reboot (Fedora + SELinux)

## 🛑 Problem

After a system reboot on **majorlab** (Fedora 43, SELinux Enforcing), Docker containers and all Caddy-proxied services become unreachable. Browsers may show connection errors or 502 Bad Gateway responses.

## 🔍 Diagnosis

Three separate failures occur in sequence:

### 1. Docker fails to start

```bash
systemctl status docker.service
# → Active: inactive (dead)
# → Dependency failed for docker.service

systemctl status docker.socket
# → Active: failed (Result: resources)
# → Failed to create listening socket (/run/docker.sock): Invalid argument
```

**Cause:** `docker.socket` is disabled, so Docker's socket activation fails and `docker.service` never starts. All containers are down.

---

### 2. Caddy fails to bind ports

```bash
journalctl -u caddy -n 20
# → Error: listen tcp :4443: bind: permission denied
# → Error: listen tcp :8448: bind: permission denied
```

**Cause:** SELinux's `http_port_t` type does not include ports `4443` (Tailscale HTTPS) or `8448` (Matrix federation), so Caddy is denied when trying to bind them.

---

### 3. Caddy returns 502 Bad Gateway

Even after Caddy starts, all reverse proxied services return 502.

```bash
journalctl -u caddy | grep "permission denied"
# → dial tcp 127.0.0.1:<port>: connect: permission denied
```

**Cause:** The SELinux boolean `httpd_can_network_connect` is off, preventing Caddy from making outbound connections to upstream services.

---

## ✅ Solution

### Step 1: Re-enable and start Docker

```bash
sudo systemctl enable docker.socket
sudo systemctl start docker.socket
sudo systemctl start docker.service
```

Verify containers are up:

```bash
sudo docker ps -a
```
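
Since the root cause here is a socket unit that is disabled rather than merely stopped, it can help to look at both states side by side. The sketch below uses a hypothetical helper name, `unit_state`; it only wraps `systemctl is-enabled` and `systemctl is-active` and is meant as an illustration, not as part of the original recovery steps.

```bash
# Hypothetical helper: print the enabled and active state of a unit on one line.
unit_state() {
  printf '%s: enabled=%s active=%s\n' "$1" \
    "$(systemctl is-enabled "$1" 2>/dev/null)" \
    "$(systemctl is-active "$1" 2>/dev/null)"
}

# Example: unit_state docker.socket; unit_state docker.service
```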

---

### Step 2: Add missing ports to SELinux http_port_t

```bash
# -m modifies a port that already has a definition; -a adds a new one.
# If one form is rejected for a port, try the other.
sudo semanage port -m -t http_port_t -p tcp 4443
sudo semanage port -a -t http_port_t -p tcp 8448
```
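
`semanage port -a` fails if the port already has a definition, and `-m` fails if it does not, which is presumably why the two commands above differ. A minimal sketch that tries one and falls back to the other is shown below. `ensure_http_port` is a hypothetical name, and the function assumes it is run as root.

```bash
# Hypothetical helper: label a TCP port as http_port_t whether or not it is
# already defined. Run as root (or prefix the semanage calls with sudo).
ensure_http_port() {
  semanage port -a -t http_port_t -p tcp "$1" 2>/dev/null ||
    semanage port -m -t http_port_t -p tcp "$1"
}

# Example: ensure_http_port 4443 && ensure_http_port 8448
```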

Verify:

```bash
# -w avoids matching other types whose names contain "http_port_t"
sudo semanage port -l | grep -w http_port_t
# Should include 4443 and 8448
```
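
A plain `grep http_port_t` also matches other types whose names contain that string (for example `pegasus_http_port_t`), so a stricter check can be useful in scripts. The following sketch parses `semanage port -l` output from standard input. `port_has` is a hypothetical helper, and it does not expand port ranges such as `8000-8010`.

```bash
# Hypothetical helper: succeed if PORT is listed for TYPE/PROTO.
# Usage: sudo semanage port -l | port_has http_port_t tcp 4443
port_has() {
  awk -v type="$1" -v proto="$2" -v port="$3" '
    $1 == type && $2 == proto {
      for (i = 3; i <= NF; i++) {
        p = $i
        sub(/,$/, "", p)          # strip the trailing comma between ports
        if (p == port) found = 1
      }
    }
    END { exit(found ? 0 : 1) }
  '
}
```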

---

### Step 3: Enable httpd_can_network_connect

```bash
sudo setsebool -P httpd_can_network_connect on
```

The `-P` flag makes this persistent across reboots.

---

### Step 4: Start Caddy

```bash
sudo systemctl restart caddy
systemctl is-active caddy
# → active
```

---

## 🔁 Why This Happens

| Issue | Root Cause |
|---|---|
| Docker down | `docker.socket` was disabled (not just stopped); the disabled state survives reboots until the unit is explicitly enabled |
| Port bind denied | SELinux requires non-standard ports to be explicitly added to `http_port_t`; this is not automatic on upgrades or reinstalls |
| 502 on all proxied services | `httpd_can_network_connect` defaults to `off` on Fedora and must be set once per installation |

---

## 🔎 Quick Diagnostic Commands

```bash
# Check Docker
systemctl status docker.socket docker.service
sudo docker ps -a

# Check Caddy
systemctl status caddy
journalctl -u caddy -n 30

# Check SELinux booleans
getsebool httpd_can_network_connect

# Check allowed HTTP ports
sudo semanage port -l | grep -w http_port_t

# Test upstream directly (bypass Caddy)
curl -sv http://localhost:8086
```
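
For a quick pass/fail check after a reboot, the boolean lookup can be scripted. The sketch below assumes `getsebool` prints a line of the form `name --> on`. `sebool_is_on` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed only if the named SELinux boolean is on.
sebool_is_on() {
  [ "$(getsebool "$1" 2>/dev/null | awk '{print $3}')" = "on" ]
}

# Example:
# sebool_is_on httpd_can_network_connect || echo "Caddy cannot reach upstreams"
```

A non-zero exit status here corresponds to the 502 failure described in the diagnosis.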
@@ -6,3 +6,4 @@ Practical fixes for common Linux, networking, and application problems.
- [Obsidian Cache Hang Recovery](obsidian-cache-hang-recovery.md)
- [yt-dlp Fedora JS Challenge](yt-dlp-fedora-js-challenge.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)

@@ -1,129 +1,22 @@
---
title: ISP SNI Filtering Blocking Caddy Reverse Proxy
domain: troubleshooting
category: networking
tags:
- caddy
- tls
- sni
- isp
- google-fiber
- reverse-proxy
- troubleshooting
status: published
created: '2026-03-11'
updated: '2026-03-11'
---

# ISP SNI Filtering & Caddy Troubleshooting

# ISP SNI Filtering Blocking Caddy Reverse Proxy

## 🛑 Problem

When deploying the MajorWiki at `wiki.majorshouse.com`, the site was unreachable over HTTPS. Browsers reported a `TLS_CONNECTION_REFUSED` error.

Some ISPs, including Google Fiber, silently block TLS handshakes for certain hostnames at the network level. The connection reaches your server, TCP completes, but the TLS handshake never finishes. The symptom looks identical to a misconfigured Caddy setup or a missing certificate, which makes it a frustrating thing to debug.

## 🔍 Diagnosis

1. **Direct IP Check:** Accessing the server via IP on port 8092 worked fine.
2. **Tailscale Check:** Accessing via the Tailscale magic DNS worked fine.
3. **SNI Analysis:** Using `openssl s_client -connect <IP>:443 -servername wiki.majorshouse.com` resulted in an immediate reset by peer.
4. **Root Cause:** Google Fiber (the local ISP) appears to be performing SNI-based filtering on hostnames containing the string "wiki".
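
The SNI comparison in step 3 can be repeated for several hostnames against the same IP. This is a rough sketch: `sni_probe` is a hypothetical helper, and it treats any non-zero exit from `openssl s_client` as a failed handshake, which is an approximation.

```bash
# Hypothetical helper: attempt a TLS handshake to IP:443 with a given SNI name.
sni_probe() {
  if openssl s_client -connect "$1:443" -servername "$2" </dev/null >/dev/null 2>&1; then
    echo "$2: handshake ok"
  else
    echo "$2: handshake failed"
  fi
}

# Example (203.0.113.10 is a placeholder address):
# sni_probe 203.0.113.10 wiki.majorshouse.com
# sni_probe 203.0.113.10 notes.majorshouse.com
```

If the same IP completes the handshake for one name and not for another, the difference is tied to the hostname rather than to the server.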

## ✅ Solution

The domain was changed from `wiki.majorshouse.com` to `notes.majorshouse.com`.

## What Happened

Deployed a new Caddy vhost for `wiki.majorshouse.com` on a Google Fiber residential connection. Everything on the server was correct:

- Let's Encrypt cert provisioned successfully
- Caddy validated clean with `caddy validate`
- `curl --resolve wiki.majorshouse.com:443:127.0.0.1 https://wiki.majorshouse.com` returned 200 from loopback
- iptables had ACCEPT rules for ports 80 and 443
- All other Caddy vhosts on the same IP and port worked fine externally

But from any external host, `curl` timed out with no response. `ss -tn` showed SYN-RECV connections piling up on port 443: the server was answering each SYN, but the connections never reached ESTAB, so no TLS handshake ever finished.

## The Debugging Sequence

**Step 1: Ruled out Caddy config issues**

```bash
caddy validate --config /etc/caddy/Caddyfile
curl --resolve wiki.majorshouse.com:443:127.0.0.1 https://wiki.majorshouse.com
```

Both clean. Loopback returned 200.

### Caddy Configuration Update

```caddy
notes.majorshouse.com {
    reverse_proxy :8092
}
```

**Step 2: Ruled out certificate issues**

```bash
cert_dir=/var/lib/caddy/.local/share/caddy/certificates/acme-v02.api.letsencrypt.org-directory/wiki.majorshouse.com
ls "$cert_dir"
openssl x509 -in "$cert_dir/wiki.majorshouse.com.crt" -noout -text | grep -E "Subject:|Not Before|Not After"
```

Valid cert, correct subject, not expired.

**Step 3: Ruled out firewall**

```bash
iptables -L INPUT -n -v | grep -E "80|443"
ss -tlnp | grep ':443'
```

Ports open, Caddy listening on `*:443`.

**Step 4: Ruled out hairpin NAT**

Testing `curl https://wiki.majorshouse.com` from the server itself returned "No route to host": the server can't reach its own public IP. This is normal for residential connections without NAT loopback. It's not the problem.

**Step 5: Confirmed external connectivity on port 443**

```bash
# From an external server (majormail)
curl -sk -o /dev/null -w "%{http_code}" https://git.majorshouse.com # 200
curl -sk -o /dev/null -w "%{http_code}" https://wiki.majorshouse.com # 000
```

Same IP, same port, same Caddy process. `git` works, `wiki` doesn't.

**Step 6: Tested a different subdomain**

Added `notes.majorshouse.com` as a new Caddyfile entry pointing to the same upstream. Cert provisioned via HTTP-01 challenge successfully (proving port 80 is reachable). Then:

```bash
curl -sk -o /dev/null -w "%{http_code}" https://notes.majorshouse.com # 200
curl -sk -o /dev/null -w "%{http_code}" https://wiki.majorshouse.com # 000
```

`notes` worked immediately. `wiki` still timed out.

**Conclusion:** Google Fiber is performing SNI-based filtering and blocking TLS connections where the ClientHello contains `wiki.majorshouse.com` as the server name.

## The Fix

Rename the subdomain. Use anything that doesn't trigger the filter. `notes.majorshouse.com` works fine.

```bash
# Remove the blocked entry
sed -i '/^wiki\.majorshouse\.com/,/^}/d' /etc/caddy/Caddyfile
systemctl reload caddy
```

Update `mkdocs.yml` or whatever service's config references the domain, add DNS for the new subdomain, and done.

## How to Diagnose This Yourself

If your Caddy vhost works on loopback but times out externally:

1. Confirm other vhosts on the same IP and port work externally
2. Test the specific domain from multiple external networks (different ISP, mobile data)
3. Add a second vhost with a different subdomain pointing to the same upstream
4. If the new subdomain works and the original doesn't, the hostname is being filtered

```bash
# Quick external test: run from a server outside your network
curl -sk -o /dev/null -w "%{http_code}" --max-time 10 https://your-domain.com
```

`000` only means that curl received no HTTP response, so check curl's exit status as well (`echo $?`). Exit status 28 (timeout) points to network-level blocking rather than a Caddy or cert issue. Exit status 35 (TLS connect error) means TCP connected but the handshake failed.
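
Because `-s` hides curl's error message, the exit status is the most direct way to tell these cases apart. The sketch below wraps the quick external test in a hypothetical helper, `probe_https`, and maps a few well-known curl exit codes (7, 28, 35) to short descriptions. The wording of the descriptions is my own.

```bash
# Hypothetical helper: run the quick external test and explain curl's exit status.
probe_https() {
  rc=0
  curl -sk -o /dev/null --max-time 10 "$1" || rc=$?
  case $rc in
    0)  echo "$1: got an HTTP response" ;;
    7)  echo "$1: connection refused or unreachable" ;;
    28) echo "$1: timed out" ;;
    35) echo "$1: TLS handshake failed" ;;
    *)  echo "$1: other curl error ($rc)" ;;
  esac
}

# Example: probe_https https://your-domain.com
```

Run it from a host outside your network, as with the plain `curl` test.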

## Gotchas & Notes

- **`curl: (35)` is different from a timeout.** Exit status 35 means TCP connected but the TLS handshake failed, usually because of a missing or invalid cert. A timeout (exit status 28) means the connection never got that far, which points to a network or firewall issue. With `-s`, both print `000` for `%{http_code}`.
- **SYN-RECV in `ss -tn` means TCP is partially open.** If you see SYN-RECV entries for your domain but the connection never moves to ESTAB, something between the client and your server is dropping packets before the handshake completes.
- **ISP SNI filtering is uncommon but real.** Residential ISPs sometimes filter on SNI for terms associated with piracy, proxies, or certain categories of content. "Wiki" may trigger a content-type heuristic.
- **Loopback testing isn't enough.** Always test from an external host before declaring a service working. The server can't test its own public IP on most residential connections.

## See Also

- [[setting-up-caddy-reverse-proxy]]
- [[linux-server-hardening-checklist]]
- [[tailscale-homelab-remote-access]]

Once the hostname was changed to one without the "wiki" keyword, the TLS handshake completed successfully.
186
05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md
Normal file
@@ -0,0 +1,186 @@
# Apache Outage: Fail2ban Self-Ban + Missing iptables Rules

## 🛑 Problem

A web server running Apache2 becomes completely unreachable (`ERR_CONNECTION_TIMED_OUT`) despite Apache running normally. SSH access via Tailscale is unaffected.

---

## 🔍 Diagnosis

### Step 1: Confirm Apache is running

```bash
sudo systemctl status apache2
```

If Apache is `active (running)`, the problem is most likely at the firewall layer, not the application.

---

### Step 2: Test the public IP directly

```bash
curl -I --max-time 5 http://<PUBLIC_IP>
```

A **timeout** means traffic is being dropped by the firewall. A **connection refused** means either Apache is down or a firewall rule is actively rejecting the connection, as Fail2ban's `reject` rules do.

---

### Step 3: Check the iptables INPUT chain

```bash
sudo iptables -L INPUT -n -v
```

Look for ACCEPT rules on ports 80 and 443. If they're missing and the chain policy is `DROP`, HTTP/HTTPS traffic is being silently dropped.

**Example of broken state:**

```
Chain INPUT (policy DROP)
ACCEPT tcp -- lo * ... # loopback only
ACCEPT tcp -- tailscale0 * ... tcp dpt:22
# no rules for port 80 or 443
```

---

### Step 4: Check the nftables ruleset for Fail2ban

```bash
sudo nft list tables
```

Look for `table inet f2b-table`, which is Fail2ban's nftables table. It operates at **priority `filter - 1`**, meaning it is evaluated *before* the main iptables INPUT chain.

```bash
sudo nft list ruleset | grep -A 10 'f2b-table'
```

Fail2ban rejects banned IPs with rules like the one below. A banned admin IP is rejected by such a rule regardless of any ACCEPT rules downstream:

```
tcp dport { 80, 443 } ip saddr @addr-set-wordpress-hard reject with icmp port-unreachable
```
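
To confirm that a specific address is in one of these ban sets, the set can be listed and searched directly. This is a sketch that assumes the table name `f2b-table` and the set naming shown in the rule above. `ip_in_set` is a hypothetical helper and needs root to query nftables.

```bash
# Hypothetical helper: succeed if IP is an element of the given Fail2ban set.
# Usage: ip_in_set addr-set-wordpress-hard 203.0.113.10   (placeholder address)
ip_in_set() {
  nft list set inet f2b-table "$1" | grep -qwF -- "$2"
}
```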

---

### Step 5: Check if your IP is banned

```bash
for jail in $(sudo fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
  echo "=== $jail ==="
  # Split on commas and spaces, then match the address exactly
  sudo fail2ban-client get "$jail" banip | tr ', ' '\n\n' | grep -Fx '<YOUR_IP>'
done
```
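
The jail list in the loop above is extracted from the `Jail list:` line of `fail2ban-client status`. The same parsing is shown here as a small filter that reads that output from standard input. `list_jails` is a hypothetical name, and the sample format in the usage comment is assumed to match your Fail2ban version.

```bash
# Hypothetical helper: print one jail name per line.
# Usage: sudo fail2ban-client status | list_jails
list_jails() {
  sed -n 's/.*Jail list:[[:space:]]*//p' | tr -d ' ' | tr ',' '\n'
}
```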

---

## ✅ Solution

### Fix 1: Add missing iptables ACCEPT rules for HTTP/HTTPS

If ports 80/443 are absent from the INPUT chain, add them (replace `eth0` with the server's public-facing interface):

```bash
sudo iptables -I INPUT -i eth0 -p tcp --dport 80 -j ACCEPT
sudo iptables -I INPUT -i eth0 -p tcp --dport 443 -j ACCEPT
```
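
Running `iptables -I` again adds a duplicate rule each time, so a scripted version of this fix may want to check first. The sketch below uses `iptables -C` (check) before inserting. `ensure_accept` is a hypothetical helper, and the interface name is whatever you pass in.

```bash
# Hypothetical helper: insert an ACCEPT rule for a TCP port only if an
# identical rule is not already present. Run as root.
ensure_accept() {
  iptables -C INPUT -i "$1" -p tcp --dport "$2" -j ACCEPT 2>/dev/null ||
    iptables -I INPUT -i "$1" -p tcp --dport "$2" -j ACCEPT
}

# Example: ensure_accept eth0 80 && ensure_accept eth0 443
```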

Persist the rules:

```bash
sudo netfilter-persistent save
```

If `netfilter-persistent` is not installed:

```bash
sudo apt install -y iptables-persistent
sudo netfilter-persistent save
```

---

### Fix 2: Unban your IP from all Fail2ban jails

```bash
for jail in $(sudo fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
  sudo fail2ban-client set "$jail" unbanip '<YOUR_IP>' 2>/dev/null && echo "Unbanned from $jail"
done
```

---

### Fix 3: Add your IP to Fail2ban's ignore list

Edit `/etc/fail2ban/jail.local`:

```bash
sudo nano /etc/fail2ban/jail.local
```

Add or update the `[DEFAULT]` section:

```ini
[DEFAULT]
ignoreip = 127.0.0.1/8 ::1 <YOUR_IP>
```

Restart Fail2ban:

```bash
sudo systemctl restart fail2ban
```

---

## 🔁 Why This Happens

| Issue | Root Cause |
|---|---|
| Missing port 80/443 rules | iptables INPUT chain left incomplete after a manual firewall rework (e.g., SSH lockdown) |
| Still blocked after adding iptables rules | Fail2ban uses a separate nftables table that is evaluated earlier (priority `filter - 1`), so iptables ACCEPT rules are never reached for banned IPs |
| Admin IP gets banned | Automated WordPress/Apache probes trigger Fail2ban jails against the admin's own IP |

---

## ⚠️ Key Architecture Note

On servers running both iptables and Fail2ban, the evaluation order is:

1. **`inet f2b-table`** (nftables, priority `filter - 1`): Fail2ban ban sets; evaluated first
2. **`ip filter` INPUT chain** (iptables/nftables, policy DROP): explicit ACCEPT rules
3. **UFW chains**: IP-specific rules; evaluated last

A banned IP is stopped at step 1 and never reaches the ACCEPT rules in step 2. Always check Fail2ban *after* confirming iptables looks correct.

---

## 🔎 Quick Diagnostic Commands

```bash
# Check Apache
sudo systemctl status apache2

# Test public connectivity
curl -I --max-time 5 http://<PUBLIC_IP>

# Check iptables INPUT chain
sudo iptables -L INPUT -n -v

# List nftables tables (look for inet f2b-table)
sudo nft list tables

# Check Fail2ban jail status
sudo fail2ban-client status

# Check a specific jail's banned IPs
sudo fail2ban-client status wordpress-hard

# Unban an IP from all jails
for jail in $(sudo fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
  sudo fail2ban-client set "$jail" unbanip '<YOUR_IP>' 2>/dev/null && echo "Unbanned from $jail"
done
```
@@ -135,3 +135,42 @@ This is a YouTube-side experiment. yt-dlp falls back to other clients automatica
yt-dlp --version
pip show yt-dlp
```

### Format Not Available: Strict AVC+M4A Selector

The format selector `bestvideo[vcodec^=avc]+bestaudio[ext=m4a]` will hard-fail if YouTube doesn't serve H.264 (AVC) video for a given video:

```
ERROR: [youtube] Requested format is not available. Use --list-formats for a list of available formats
```

This is separate from the n-challenge issue: the format simply doesn't exist for that video (common with newer uploads that are VP9/AV1-only).

**Fix 1: Relax the selector to mp4 container without enforcing codec:**

```bash
yt-dlp -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/bestvideo+bestaudio' \
  --merge-output-format mp4 \
  -o "/plex/plex/%(title)s.%(ext)s" \
  --write-auto-subs --embed-subs \
  https://youtu.be/VIDEO_ID
```

**Fix 2: Let yt-dlp pick best and re-encode to H.264 via ffmpeg (Plex-safe, slower):**

```bash
yt-dlp -f 'bestvideo+bestaudio' \
  --merge-output-format mp4 \
  --recode-video mp4 \
  -o "/plex/plex/%(title)s.%(ext)s" \
  --write-auto-subs --embed-subs \
  https://youtu.be/VIDEO_ID
```

Use `--recode-video mp4` when Plex direct play is required and the source stream may be VP9/AV1. Requires ffmpeg. Note that yt-dlp skips recoding when the file already has the target extension, so check the resulting video codec (for example with `ffprobe`) before relying on this.

**Inspect available formats first:**

```bash
yt-dlp --list-formats https://youtu.be/VIDEO_ID
```
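
Whether the strict selector can succeed for a given video can be checked from the format list before downloading. This is a rough sketch: `has_avc` is a hypothetical helper that looks for `avc1` codec identifiers in `--list-formats` output read from standard input, and it does not parse the column layout.

```bash
# Hypothetical helper: succeed if any listed format uses an H.264 (avc1) codec.
# Usage: yt-dlp --list-formats https://youtu.be/VIDEO_ID | has_avc
has_avc() {
  grep -q 'avc1\.'
}
```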