docs: add Docker & Caddy SELinux post-reboot recovery runbook

Add troubleshooting article covering the three-part failure mode on
Fedora with SELinux Enforcing: docker.socket disabled, ports 4443/8448
blocked, and httpd_can_network_connect off. Update index and SUMMARY.
2026-03-12 17:58:00 -04:00
parent f256ecc482
commit ca81761cb3
3 changed files with 146 additions and 19 deletions

@@ -0,0 +1,135 @@
# Docker & Caddy Recovery After Reboot (Fedora + SELinux)
## 🛑 Problem
After a system reboot on **majorlab** (Fedora 43, SELinux Enforcing), Docker containers and all Caddy-proxied services become unreachable. Browsers may show connection errors or 502 Bad Gateway responses.
## 🔍 Diagnosis
Three separate failures occur in sequence:
### 1. Docker fails to start
```bash
systemctl status docker.service
# → Active: inactive (dead)
# → Dependency failed for docker.service
systemctl status docker.socket
# → Active: failed (Result: resources)
# → Failed to create listening socket (/run/docker.sock): Invalid argument
```
**Cause:** `docker.socket` is disabled, so Docker's socket activation fails and `docker.service` never starts. All containers are down.
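To confirm that the unit is disabled rather than merely stopped, the state printed by `systemctl is-enabled` can be turned into a short hint. This is a minimal sketch; the helper name `boot_state_hint` is hypothetical and only the `disabled` and `masked` states are handled.

```bash
# Hypothetical helper: turn the state printed by `systemctl is-enabled <unit>`
# into a hint. Prints nothing for states that need no action here.
boot_state_hint() {
  case "$1" in
    disabled) echo "disabled: will not start at boot until enabled" ;;
    masked)   echo "masked: must be unmasked before it can be enabled" ;;
  esac
}
```

On the host it might be called as `boot_state_hint "$(systemctl is-enabled docker.socket)"`.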
---
### 2. Caddy fails to bind ports
```bash
journalctl -u caddy -n 20
# → Error: listen tcp :4443: bind: permission denied
# → Error: listen tcp :8448: bind: permission denied
```
**Cause:** SELinux's `http_port_t` type does not include ports `4443` (Tailscale HTTPS) or `8448` (Matrix federation), so Caddy is denied when trying to bind them.
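One way to check this before touching the policy is to compare the ports Caddy needs against the current `http_port_t` list. The sketch below assumes the usual `semanage port -l` layout (type, protocol, then a comma-separated port list); the function name `missing_http_ports` is hypothetical.

```bash
# Hypothetical helper: print each wanted port that is NOT in http_port_t (tcp).
# stdin: output of `semanage port -l`; arguments: port numbers to look for.
missing_http_ports() {
  awk -v want="$*" '
    $1 == "http_port_t" && $2 == "tcp" {
      for (i = 3; i <= NF; i++) {
        p = $i
        sub(/,$/, "", p)                 # drop the trailing comma
        n = split(p, r, "-")             # handle ranges such as 8000-8010
        lo[++cnt] = r[1] + 0
        hi[cnt] = (n == 2 ? r[2] : r[1]) + 0
      }
    }
    END {
      m = split(want, w, " ")
      for (j = 1; j <= m; j++) {
        found = 0
        for (k = 1; k <= cnt; k++)
          if (w[j] + 0 >= lo[k] && w[j] + 0 <= hi[k]) found = 1
        if (!found) print w[j]
      }
    }'
}
```

Example call: `sudo semanage port -l | missing_http_ports 4443 8448`. Any port it prints still needs to be added.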
---
### 3. Caddy returns 502 Bad Gateway
Even after Caddy starts, all reverse-proxied services return 502.
```bash
journalctl -u caddy | grep "permission denied"
# → dial tcp 127.0.0.1:<port>: connect: permission denied
```
**Cause:** The SELinux boolean `httpd_can_network_connect` is off, preventing Caddy from making outbound connections to upstream services.
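To see which upstreams are affected, the denied addresses can be pulled out of the Caddy journal. This is a rough sketch that assumes the log lines contain the `dial tcp <addr>: connect: permission denied` text shown above; `denied_upstreams` is a hypothetical name.

```bash
# Hypothetical helper: list the distinct upstream addresses that Caddy was
# denied from connecting to. stdin: Caddy log lines.
denied_upstreams() {
  grep -o 'dial tcp [^ ]*: connect: permission denied' |
    awk '{ sub(/:$/, "", $3); print $3 }' |
    sort -u
}
```

Example call: `journalctl -u caddy -b | denied_upstreams`, where `-b` limits the journal to the current boot.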
---
## ✅ Solution
### Step 1 — Re-enable and start Docker
```bash
sudo systemctl enable docker.socket
sudo systemctl start docker.socket
sudo systemctl start docker.service
```
Verify containers are up:
```bash
sudo docker ps -a
```
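When there are many containers, it can help to list only those that did not come back. The sketch below filters `docker ps` output formatted as name and status separated by a space; `stopped_containers` is a hypothetical helper, and it relies on running containers having a status that begins with `Up`.

```bash
# Hypothetical helper: print the names of containers whose status does not
# start with "Up". stdin: lines of "<name> <status...>".
stopped_containers() {
  awk '$2 != "Up" { print $1 }'
}
```

Example call: `sudo docker ps -a --format '{{.Names}} {{.Status}}' | stopped_containers`. An empty result suggests every container is running.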
---
### Step 2 — Add missing ports to SELinux http_port_t
```bash
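# -a adds a new port definition; -m modifies an existing one.
# If a command reports "already defined" or "is not defined", swap -a and -m.
sudo semanage port -m -t http_port_t -p tcp 4443
sudo semanage port -a -t http_port_t -p tcp 8448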
```
Verify:
```bash
sudo semanage port -l | grep http_port_t
# Should include 4443 and 8448
```
---
### Step 3 — Enable httpd_can_network_connect
```bash
sudo setsebool -P httpd_can_network_connect on
```
The `-P` flag writes the value to the policy store so it persists across reboots. Without it, the change lasts only until the next reboot.
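To check that the persistent value was actually saved, the current and stored states can be read from `semanage boolean -l`. This sketch assumes the `(state , default)` column layout that `semanage boolean -l` normally prints; `sebool_states` is a hypothetical name.

```bash
# Hypothetical helper: print "<current> <persistent>" for one SELinux boolean.
# stdin: output of `semanage boolean -l`; $1: boolean name.
sebool_states() {
  awk -v name="$1" '$1 == name {
    line = $0
    sub(/^[^(]*\(/, "", line)   # drop everything up to the opening parenthesis
    sub(/\).*$/, "", line)      # drop the closing parenthesis and description
    gsub(/[ \t]/, "", line)     # "on   ,   on" -> "on,on"
    split(line, s, ",")
    print s[1], s[2]
  }'
}
```

Example call: `sudo semanage boolean -l | sebool_states httpd_can_network_connect`. After Step 3 both values would be expected to read `on`.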
---
### Step 4 — Start Caddy
```bash
sudo systemctl restart caddy
systemctl is-active caddy
# → active
```
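The four steps can also be collected into one script. This is a sketch, not a tested recovery tool: the `run` and `recover` names and the `DRY_RUN` variable are hypothetical, and it defaults to printing the commands rather than executing them.

```bash
# Dry-run by default: print each command instead of running it.
# Set DRY_RUN=0 to execute for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

recover() {
  # Step 1: Docker socket and service
  run sudo systemctl enable --now docker.socket
  run sudo systemctl start docker.service

  # Step 2: label the ports; fall back to -m if a port is already defined
  for port in 4443 8448; do
    run sudo semanage port -a -t http_port_t -p tcp "$port" ||
      run sudo semanage port -m -t http_port_t -p tcp "$port"
  done

  # Step 3: let Caddy connect to its upstreams
  run sudo setsebool -P httpd_can_network_connect on

  # Step 4: restart Caddy
  run sudo systemctl restart caddy
}
```

Calling `recover` previews the commands; `DRY_RUN=0 recover` runs them. The function does not stop at the first failure, so check its output before relying on the result.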
---
## 🔁 Why This Happens
| Issue | Root Cause |
|---|---|
| Docker down | `docker.socket` was disabled, not just stopped; the disabled state persists across reboots until the unit is explicitly re-enabled |
| Port bind denied | SELinux requires non-standard ports to be explicitly added to `http_port_t`; this is not automatic on upgrades or reinstalls |
| 502 on all proxied services | `httpd_can_network_connect` defaults to `off` on Fedora and must be set once per installation |
---
## 🔎 Quick Diagnostic Commands
```bash
# Check Docker
systemctl status docker.socket docker.service
sudo docker ps -a
# Check Caddy
systemctl status caddy
journalctl -u caddy -n 30
# Check SELinux booleans
getsebool httpd_can_network_connect
# Check allowed HTTP ports
sudo semanage port -l | grep http_port_t
# Test upstream directly (bypass Caddy); substitute the port of the service being checked
curl -sv http://localhost:8086
```
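When checking several services, the HTTP status alone is often enough to tell which failure applies. The sketch below maps a status code to a short diagnosis; `classify_http_code` is a hypothetical helper, and it relies on `curl` printing `000` for `%{http_code}` when no HTTP response was received.

```bash
# Hypothetical helper: map an HTTP status code (as printed by
# `curl -w '%{http_code}'`) to a short diagnosis.
classify_http_code() {
  case "$1" in
    000)     echo "no response, service down or connection refused" ;;
    502)     echo "proxy answered, upstream unreachable" ;;
    2??|3??) echo "ok" ;;
    *)       echo "responding with HTTP $1" ;;
  esac
}
```

Example call: `classify_http_code "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8086)"`. A `502` seen through Caddy while the upstream answers directly points back to Step 3.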

@@ -6,3 +6,4 @@ Practical fixes for common Linux, networking, and application problems.
- [Obsidian Cache Hang Recovery](obsidian-cache-hang-recovery.md)
- [yt-dlp Fedora JS Challenge](yt-dlp-fedora-js-challenge.md)
- [MajorWiki Setup & Publishing Pipeline](majwiki-setup-and-pipeline.md)
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)