Compare commits
2 Commits
c7c7c9e5be
...
d88a209e0b
| Author | SHA1 | Date |
|---|---|---|
| | d88a209e0b | |
| | 18d213f213 | |
110
02-selfhosting/security/fail2ban-apache-404-scanner-jail.md
Normal file
@@ -0,0 +1,110 @@
# Fail2ban Custom Jail: Apache 404 Scanner Detection

## The Problem

Automated vulnerability scanners probe web servers by requesting dozens of common config file paths (`.env`, `env.php`, `next.config.js`, `nuxt.config.ts`, and so on) in rapid succession. These all return **404 Not Found**, which is the correct behavior from Apache.

However, the built-in Fail2ban jails (`apache-noscript`, `apache-botsearch`) don't catch these requests because they parse the **error log**, not the **access log**. If Apache doesn't write a corresponding "File does not exist" entry to the error log for every 404, the scanner slips through undetected.

The scan traffic also triggers false alerts in monitoring tools like **Netdata**, which sees the success ratio drop (e.g., `web_log_1m_successful` goes CRITICAL at 2.83%) because 404s aren't counted as successful responses.

## The Solution

Create a custom Fail2ban filter that reads the **access log** and matches 404 responses directly.

### Step 1: Create the filter

Create `/etc/fail2ban/filter.d/apache-404scan.conf`:

```ini
# Fail2Ban filter to catch rapid 404 scanning in Apache access logs
# Targets vulnerability scanners probing for .env, config files, etc.

[Definition]

# Match 404 responses in combined/common access log format
failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 \d+

ignoreregex = ^<HOST> -.*(robots\.txt|favicon\.ico|apple-touch-icon)

datepattern = %%d/%%b/%%Y:%%H:%%M:%%S %%z
```
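
To see what the two patterns accept and reject, they can be exercised outside Fail2ban. The following is a minimal sketch, not a substitute for `fail2ban-regex`: it assumes a simplified IPv4-only stand-in for the `<HOST>` tag (Fail2ban's real expansion is more elaborate), it skips Fail2ban's date handling, and the sample log lines are invented for illustration.

```python
import re

# Assumption: simplified IPv4-only stand-in for Fail2ban's <HOST> tag.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Patterns copied from the filter definition.
FAILREGEX = r'^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 \d+'
IGNOREREGEX = r"^<HOST> -.*(robots\.txt|favicon\.ico|apple-touch-icon)"

fail_re = re.compile(FAILREGEX.replace("<HOST>", HOST))
ignore_re = re.compile(IGNOREREGEX.replace("<HOST>", HOST))


def is_failure(line):
    """Return the client address if the line counts as a failure, else None."""
    if ignore_re.search(line):
        return None  # harmless 404s such as favicon.ico are skipped
    match = fail_re.search(line)
    return match.group("host") if match else None


# Invented sample lines in Apache combined log format.
SAMPLE_LINES = [
    '203.0.113.7 - - [28/Mar/2026:10:15:01 +0000] "GET /.env HTTP/1.1" 404 196 "-" "curl/8.0"',
    '203.0.113.7 - - [28/Mar/2026:10:15:02 +0000] "GET /favicon.ico HTTP/1.1" 404 196 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [28/Mar/2026:10:15:03 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for line in SAMPLE_LINES:
    print(is_failure(line))
```

Only the first line should be reported as a failure: the second is excluded by `ignoreregex`, and the third is not a 404.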

### Step 2: Add the jail

Add to `/etc/fail2ban/jail.local`:

```ini
[apache-404scan]
enabled = true
port = http,https
filter = apache-404scan
logpath = /var/log/apache2/access.log
maxretry = 10
findtime = 1m
bantime = 24h
```

A threshold of **10 hits in 1 minute** is aggressive enough to catch scanners (which fire 30–50+ requests in seconds) while avoiding false positives from a legitimate user hitting a few broken links.
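
The interaction between `maxretry` and `findtime` can be pictured as a sliding window per client. The sketch below is an illustration of that idea only, not Fail2ban's actual implementation; the function name `find_bans` and the two traffic patterns are hypothetical.

```python
from collections import defaultdict, deque

MAXRETRY = 10  # from the jail: maxretry = 10
FINDTIME = 60  # from the jail: findtime = 1m, expressed in seconds


def find_bans(events, maxretry=MAXRETRY, findtime=FINDTIME):
    """events: (timestamp_seconds, ip) pairs in time order.

    Returns the IPs that reach maxretry failures within findtime,
    in the order they cross the threshold.
    """
    recent = defaultdict(deque)  # ip -> timestamps of recent failures
    banned = []
    for ts, ip in events:
        if ip in banned:
            continue
        window = recent[ip]
        window.append(ts)
        # Forget failures that fall outside the findtime window.
        while window and ts - window[0] > findtime:
            window.popleft()
        if len(window) >= maxretry:
            banned.append(ip)
    return banned


# Hypothetical traffic: a scanner firing 40 requests in 8 seconds,
# and a visitor hitting 5 broken links over 2 minutes.
scanner = [(i * 0.2, "203.0.113.7") for i in range(40)]
visitor = [(i * 30, "198.51.100.4") for i in range(5)]
events = sorted(scanner + visitor)

print(find_bans(events))
```

With these inputs only the scanner address crosses the threshold; the visitor never accumulates 10 failures inside any 60-second window.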

### Step 3: Test the regex

```bash
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf
```

The output should report matched lines. In a real-world test against a server under active scanning, this matched **2831 out of 8901** access log lines.

### Step 4: Restart Fail2ban

```bash
systemctl restart fail2ban
fail2ban-client status apache-404scan
```

## Why Default Jails Miss This

| Jail | Log Source | What It Matches | Why It Misses |
|---|---|---|---|
| `apache-noscript` | error log | "script not found or unable to stat" | Only matches script-type files (.php, .asp, .exe, .pl) |
| `apache-botsearch` | error log | "File does not exist" for specific paths | Requires Apache to write error log entries for 404s |
| **`apache-404scan`** | **access log** | **Any 404 response** | **Catches every 404 not excluded by `ignoreregex`** |

The key insight: URL-encoded probes like `/%2f%2eenv%2econfig` that return 404 in the access log may not generate error log entries at all, making them invisible to the default filters.
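
To make the URL-encoded probe concrete, the short sketch below decodes the example path with Python's standard library. It only illustrates what the scanner is asking for; it does not model how Apache itself handles the request.

```python
from urllib.parse import unquote

probe = "/%2f%2eenv%2econfig"  # path taken from the example above
decoded = unquote(probe)       # %2f -> "/", %2e -> "."

print(decoded)  # //.env.config
```

The access-log filter never needs to decode the path: it keys off the 404 status code, so encoded and plain probes are counted alike.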

## Pair With Recidive

If you have the `recidive` jail enabled, repeat offenders get permanently banned:

```ini
[recidive]
enabled = true
bantime = -1
findtime = 86400
maxretry = 3
```

Three bans within 24 hours, from any jail, result in a permanent firewall block. Because `apache-404scan` itself bans for 24 hours, an IP is unlikely to reach three bans in one day from this jail alone; the threshold is usually reached in combination with other jails.

## Quick Diagnostic Commands

```bash
# Test filter against current access log
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf

# Check jail status and banned IPs
fail2ban-client status apache-404scan

# Watch bans in real time
tail -f /var/log/fail2ban.log | grep apache-404scan

# Count 404s in the current access log
grep -c '" 404 ' /var/log/apache2/access.log
```
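
The `grep` count gives a single total. When it helps to know which clients are responsible, a per-address breakdown can be sketched as follows. This is a rough illustration that assumes the client address is the first space-separated field (true for the common and combined formats); the function name and sample lines are hypothetical.

```python
from collections import Counter


def count_404s_by_ip(lines):
    """Count access-log lines with status 404, keyed by client address."""
    counts = Counter()
    for line in lines:
        # Same substring the grep command looks for.
        if '" 404 ' in line:
            client = line.split(" ", 1)[0]  # first field of the log line
            counts[client] += 1
    return counts


# Invented sample lines in Apache combined log format.
sample = [
    '203.0.113.7 - - [28/Mar/2026:10:15:01 +0000] "GET /.env HTTP/1.1" 404 196 "-" "curl/8.0"',
    '203.0.113.7 - - [28/Mar/2026:10:15:01 +0000] "GET /env.php HTTP/1.1" 404 196 "-" "curl/8.0"',
    '203.0.113.7 - - [28/Mar/2026:10:15:02 +0000] "GET /next.config.js HTTP/1.1" 404 196 "-" "curl/8.0"',
    '198.51.100.4 - - [28/Mar/2026:10:15:03 +0000] "GET /old-page HTTP/1.1" 404 196 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [28/Mar/2026:10:15:04 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

top = count_404s_by_ip(sample).most_common(1)
print(top)
```

To run it against the real log, pass an open file instead of the sample list, for example `count_404s_by_ip(open("/var/log/apache2/access.log"))`.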

## Key Notes

- The `ignoreregex` excludes `robots.txt`, `favicon.ico`, and `apple-touch-icon`, which are commonly requested and produce harmless 404s.
- Make sure your Tailscale subnet (`100.64.0.0/10`) is in the `ignoreip` list under `[DEFAULT]` so you don't ban your own monitoring or uptime checks.
- This filter works with both Apache **combined** and **common** log formats.
- This jail complements the existing `apache-dirscan` jail (which catches error-log-based directory enumeration). Use both for full coverage.
72
05-troubleshooting/ansible-ssh-timeout-dnf-upgrade.md
Normal file
@@ -0,0 +1,72 @@
---
title: Ansible SSH Timeout During dnf upgrade on Fedora Hosts
domain: troubleshooting
category: ansible
tags:
- ansible
- ssh
- fedora
- dnf
- timeout
- fleet-management
status: published
created: '2026-03-28'
updated: '2026-03-28'
---

# Ansible SSH Timeout During dnf upgrade on Fedora Hosts

## Symptom

Running `ansible-playbook update.yml` against Fedora/CentOS hosts fails with:

```
fatal: [hostname]: UNREACHABLE! => {"changed": false,
"msg": "Failed to connect to the host via ssh: Shared connection to <IP> closed."}
```

The failure occurs specifically during `ansible.builtin.dnf` tasks that upgrade all packages (`name: '*'`, `state: latest`), because the operation takes long enough for the SSH connection to drop.

## Root Cause

Without explicit SSH keepalive settings in `ansible.cfg`, OpenSSH defaults apply, and by default the client sends no keepalives (`ServerAliveInterval` is 0). Long-running tasks like a full `dnf upgrade` across a fleet can exceed idle timeouts, causing the control connection to close mid-task.

## Fix

Add a `[ssh_connection]` section to `ansible.cfg`:

```ini
[ssh_connection]
ssh_args = -o ServerAliveInterval=30 -o ServerAliveCountMax=10 -o ControlMaster=auto -o ControlPersist=60s
```

| Setting | Purpose |
|---------|---------|
| `ServerAliveInterval=30` | Send a keepalive every 30 seconds |
| `ServerAliveCountMax=10` | Allow 10 missed keepalives before disconnect (~5 min tolerance) |
| `ControlMaster=auto` | Reuse SSH connections across tasks |
| `ControlPersist=60s` | Keep the master connection open 60s after last use |
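
The "~5 min tolerance" figure follows from multiplying the two keepalive settings. The sketch below parses the `ssh_args` string from the fix and does that arithmetic; the helper name `parse_ssh_options` is hypothetical and the parser only understands the `-o Key=Value` form used here.

```python
import shlex

# ssh_args value copied from the [ssh_connection] section above.
SSH_ARGS = (
    "-o ServerAliveInterval=30 -o ServerAliveCountMax=10 "
    "-o ControlMaster=auto -o ControlPersist=60s"
)


def parse_ssh_options(ssh_args):
    """Collect `-o Key=Value` pairs from an ssh argument string."""
    tokens = shlex.split(ssh_args)
    options = {}
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-o" and "=" in value:
            key, _, val = value.partition("=")
            options[key] = val
    return options


opts = parse_ssh_options(SSH_ARGS)

# An unresponsive server is tolerated for roughly interval * count seconds.
tolerance_seconds = int(opts["ServerAliveInterval"]) * int(opts["ServerAliveCountMax"])
print(tolerance_seconds)  # 300
```

Raising either value widens the window; 30 × 10 = 300 seconds matches the five-minute tolerance in the table.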

## Related Fix: do-agent Task Guard

In the same playbook run, a second failure surfaced on hosts where the `ansible.builtin.uri` task to fetch the latest `do-agent` release was **skipped** (non-RedHat hosts or hosts without do-agent installed). The registered variable existed but contained a skipped result with no `.json` attribute, causing:

```
object of type 'dict' has no attribute 'json'
```

Fix: add guards to downstream tasks that reference the URI result:

```yaml
when:
- do_agent_release is defined
- do_agent_release is not skipped
- do_agent_release.json is defined
```
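
The reason all three conditions are needed can be shown with plain dictionaries. The sketch below loosely mimics how the guards evaluate; it is not Ansible's own evaluation logic, and the result shapes (including the `tag_name` field and its value) are assumptions made for illustration.

```python
def should_run(task_vars):
    """Mimic the three `when` conditions applied to do_agent_release."""
    result = task_vars.get("do_agent_release")
    if result is None:                # do_agent_release is defined
        return False
    if result.get("skipped", False):  # do_agent_release is not skipped
        return False
    return "json" in result           # do_agent_release.json is defined


# Assumed shape of a registered result when the uri task was skipped:
# the variable exists but carries no "json" key.
skipped_result = {"changed": False, "skipped": True}

# Assumed shape of a successful uri result (field name is illustrative).
ok_result = {"status": 200, "json": {"tag_name": "example-version"}}

print(should_run({}))                                    # variable never registered
print(should_run({"do_agent_release": skipped_result}))  # task was skipped
print(should_run({"do_agent_release": ok_result}))       # safe to read .json
```

Only the last case passes all three guards, which is the one situation where a downstream task can safely dereference `.json`.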

## Environment

- **Controller:** macOS (MajorAir)
- **Targets:** Fedora 43 (majorlab, majormail, majorhome, majordiscord)
- **Ansible:** community edition via Homebrew
- **Committed:** `d9c6bdb` in MajorAnsible repo
@@ -13,6 +13,10 @@ Practical fixes for common Linux, networking, and application problems.
- [ISP SNI Filtering & Caddy](isp-sni-filtering-caddy.md)
- [yt-dlp YouTube JS Challenge Fix](yt-dlp-fedora-js-challenge.md)

## ⚙️ Ansible & Fleet Management
- [SSH Timeout During dnf upgrade on Fedora Hosts](ansible-ssh-timeout-dnf-upgrade.md)
- [Vault Password File Missing](ansible-vault-password-file-missing.md)

## 📦 Docker & Systems
- [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](docker-caddy-selinux-post-reboot-recovery.md)
- [Gitea Actions Runner: Boot Race Condition Fix](gitea-runner-boot-race-network-target.md)

@@ -26,6 +26,7 @@
* [Netdata n8n Enriched Alert Emails](02-selfhosting/monitoring/netdata-n8n-enriched-alerts.md)
* [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md)
* [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md)
* [Fail2ban Custom Jail: Apache 404 Scanner Detection](02-selfhosting/security/fail2ban-apache-404-scanner-jail.md)
* [Open Source & Alternatives](03-opensource/index.md)
* [SearXNG: Private Self-Hosted Search](03-opensource/alternatives/searxng.md)
* [FreshRSS: Self-Hosted RSS Reader](03-opensource/alternatives/freshrss.md)
@@ -61,3 +62,4 @@
* [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md)
* [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)
* [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md)
* [Ansible: SSH Timeout During dnf upgrade on Fedora Hosts](05-troubleshooting/ansible-ssh-timeout-dnf-upgrade.md)