MajorWiki/02-selfhosting/security/fail2ban-apache-404-scanner-jail.md
majorlinux 23a35e021b wiki: add fail2ban apache 404 scanner jail article
New guide for custom access-log-based fail2ban jail that catches
rapid-fire 404 vulnerability scanners missed by default error-log jails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:22:19 -04:00

# Fail2ban Custom Jail: Apache 404 Scanner Detection
## The Problem
Automated vulnerability scanners probe web servers by requesting dozens of common config file paths — `.env`, `env.php`, `next.config.js`, `nuxt.config.ts`, etc. — in rapid succession. These all return **404 Not Found**, which is correct behavior from Apache.
However, the built-in Fail2ban jails (`apache-noscript`, `apache-botsearch`) don't catch these because they parse the **error log**, not the **access log**. If Apache doesn't write a corresponding "File does not exist" entry to the error log for every 404, the scanner slips through undetected.
The scanning also triggers false alerts in monitoring tools like **Netdata**: 404s aren't counted as successful responses, so the success ratio drops (e.g., `web_log_1m_successful` goes CRITICAL at 2.83%).
## The Solution
Create a custom Fail2ban filter that reads the **access log** and matches 404 responses directly.
### Step 1 — Create the filter
Create `/etc/fail2ban/filter.d/apache-404scan.conf`:
```ini
# Fail2Ban filter to catch rapid 404 scanning in Apache access logs
# Targets vulnerability scanners probing for .env, config files, etc.
[Definition]
# Match 404 responses in combined/common access log format
failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 \d+
ignoreregex = ^<HOST> -.*(robots\.txt|favicon\.ico|apple-touch-icon)
datepattern = %%d/%%b/%%Y:%%H:%%M:%%S %%z
```
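Before installing the filter, it can help to see what the two patterns do to a few lines. The sketch below is only an approximation using `grep -E`: `<HOST>` is replaced by a loose IPv4/IPv6 character class (Fail2ban's own expansion is stricter), `\d` is written as `[0-9]`, and the function names and sample log lines are made up for illustration.

```bash
#!/usr/bin/env bash
# Rough stand-in for the filter above. <HOST> is approximated by a loose
# IPv4/IPv6 character class; this is an assumption, not Fail2ban's real rule.
FAIL_RE='^[0-9A-Fa-f:.]+ -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 [0-9]+'
IGNORE_RE='^[0-9A-Fa-f:.]+ -.*(robots\.txt|favicon\.ico|apple-touch-icon)'

# Print the stdin lines that would count as failures:
# they match FAIL_RE and do not match IGNORE_RE.
match_404scan() {
    grep -E "$FAIL_RE" | grep -Ev "$IGNORE_RE"
}

# Made-up sample lines in combined log format (documentation-range IPs).
sample_log() {
    cat <<'EOF'
203.0.113.7 - - [28/Mar/2026:11:00:01 -0400] "GET /.env HTTP/1.1" 404 196 "-" "curl/8.5.0"
203.0.113.7 - - [28/Mar/2026:11:00:01 -0400] "GET /favicon.ico HTTP/1.1" 404 196 "-" "curl/8.5.0"
198.51.100.4 - - [28/Mar/2026:11:00:02 -0400] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
EOF
}

sample_log | match_404scan
```

Only the `/.env` request should be printed: the `favicon.ico` 404 is dropped by the ignore pattern and the 200 response never matches.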
### Step 2 — Add the jail
Add to `/etc/fail2ban/jail.local`:
```ini
[apache-404scan]
enabled = true
port = http,https
filter = apache-404scan
logpath = /var/log/apache2/access.log
maxretry = 10
findtime = 1m
bantime = 24h
```
**10 hits in 1 minute** is aggressive enough to catch scanners (which fire 30-50+ requests in seconds) while avoiding false positives from a legitimate user hitting a few broken links.
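To check the threshold against an existing log, one rough approach is to bucket 404s by client IP and calendar minute. This is a sketch, not how Fail2ban counts: Fail2ban uses a sliding `findtime` window, so a burst that straddles a minute boundary can be undercounted here. The function name `burst_404` is hypothetical, and the field positions assume the combined or common log format with no spaces in the request path.

```bash
#!/usr/bin/env bash
# Print "count ip minute" for every IP with at least $1 (default 10)
# 404 responses inside a single calendar minute. Reads log lines on stdin.
burst_404() {
    awk -v limit="${1:-10}" '
        $9 == 404 {
            # $4 looks like [28/Mar/2026:11:00:01 ; keep day through minute
            minute = substr($4, 2, 17)
            hits[$1 " " minute]++
        }
        END {
            for (key in hits)
                if (hits[key] >= limit)
                    print hits[key], key
        }
    '
}

# Example, using the logpath from the jail above:
#   burst_404 10 < /var/log/apache2/access.log
```

If ordinary visitors show up in the output, raising `maxretry` or extending `ignoreregex` is probably worth considering before enabling the jail.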
### Step 3 — Test the regex
```bash
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf
```
You should see matches. In a real-world test against a server under active scanning, this matched **2831 out of 8901** access log lines.
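For scale, 2831 of 8901 lines is roughly 31.8% of the log. A minimal sketch for computing a comparable share on another log follows. It counts every 404 by status field and does not apply `ignoreregex`, so it will not exactly equal the `fail2ban-regex` match count; the function name `pct_404` is hypothetical.

```bash
#!/usr/bin/env bash
# Print the percentage of access log lines (stdin) whose status is 404.
# Assumes the status code is the 9th whitespace-separated field.
pct_404() {
    awk '
        $9 == 404 { hits++ }
        END {
            if (NR > 0) printf "%.1f\n", 100 * hits / NR
            else        print "0.0"
        }
    '
}

# The figures quoted above work out the same way:
awk 'BEGIN { printf "%.1f\n", 100 * 2831 / 8901 }'   # prints 31.8
```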
### Step 4 — Reload Fail2ban
```bash
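# Restart the service so the new filter and jail are loaded
systemctl restart fail2ban
# Confirm the jail is active and list any banned IPs
fail2ban-client status apache-404scan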
```
## Why Default Jails Miss This
| Jail | Log Source | What It Matches | Why It Misses |
|---|---|---|---|
| `apache-noscript` | error log | "script not found or unable to stat" | Only matches script-type files (.php, .asp, .exe, .pl) |
| `apache-botsearch` | error log | "File does not exist" for specific paths | Requires Apache to write error log entries for 404s |
| **`apache-404scan`** | **access log** | **Any 404 response** | **Doesn't: counts every 404 not excluded by `ignoreregex`** |
The key insight: URL-encoded probes like `/%2f%2eenv%2econfig` that return 404 in the access log may not generate error log entries at all, making them invisible to the default filters.
## Pair With Recidive
If you have the `recidive` jail enabled, repeat offenders get permanently banned:
```ini
[recidive]
enabled = true
bantime = -1
findtime = 86400
maxretry = 3
```
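Three bans (from any jail) within a day = permanent firewall block. Because a single `apache-404scan` ban already lasts 24 hours, this jail alone cannot ban the same IP three times inside that window; raise `findtime` (for example to `1w`) if repeat bans from this jail by itself should count.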
## Quick Diagnostic Commands
```bash
# Test filter against current access log
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf
# Check jail status and banned IPs
fail2ban-client status apache-404scan
# Watch bans in real time
tail -f /var/log/fail2ban.log | grep apache-404scan
# Count 404s in the current access log
grep -c '" 404 ' /var/log/apache2/access.log
```
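When the 404 count looks high, it is often useful to know which clients are responsible. A small helper along these lines ranks client IPs by number of 404s. It is a sketch under the same log-format assumption as the filter (status code in the 9th whitespace-separated field), and the name `top_404_ips` is hypothetical.

```bash
#!/usr/bin/env bash
# Print the N (default 10) client IPs with the most 404s, busiest first.
# Reads combined/common access log lines on stdin.
top_404_ips() {
    awk '$9 == 404 { print $1 }' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Example:
#   top_404_ips 5 < /var/log/apache2/access.log
```

Each output line is a count followed by an IP, which makes it easy to compare against the banned list from `fail2ban-client status apache-404scan`.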
## Key Notes
- The `ignoreregex` excludes `robots.txt`, `favicon.ico`, and `apple-touch-icon` — these are commonly requested and produce harmless 404s.
- Make sure your Tailscale subnet (`100.64.0.0/10`) is in the `ignoreip` list under `[DEFAULT]` so you don't ban your own monitoring or uptime checks.
- This filter works with both Apache **combined** and **common** log formats.
- Complements the existing `apache-dirscan` jail (which catches error-log-based directory enumeration). Use both for full coverage.