---
title: "Fail2ban Custom Jail: Apache 404 Scanner Detection"
domain: selfhosting
category: security
tags: [fail2ban, apache, security, scanner, firewall]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Fail2ban Custom Jail: Apache 404 Scanner Detection
## The Problem
Automated vulnerability scanners probe web servers by requesting dozens of common config file paths — `.env`, `env.php`, `next.config.js`, `nuxt.config.ts`, etc. — in rapid succession. These all return **404 Not Found**, which is correct behavior from Apache.
However, the built-in Fail2ban jails (`apache-noscript`, `apache-botsearch`) don't catch these because they parse the **error log**, not the **access log**. If Apache doesn't write a corresponding "File does not exist" entry to the error log for every 404, the scanner slips through undetected.
This also triggers false alerts in monitoring tools like **Netdata**, which sees the success ratio drop (e.g., `web_log_1m_successful` goes CRITICAL at 2.83%) because 404s aren't counted as successful responses.
## The Solution
Create a custom Fail2ban filter that reads the **access log** and matches 404 responses directly.
### Step 1 — Create the filter
Create `/etc/fail2ban/filter.d/apache-404scan.conf`:
```ini
# Fail2Ban filter to catch rapid 404 scanning in Apache access logs
# Targets vulnerability scanners probing for .env, config files, etc.
[Definition]
# Match 404 responses in combined/common access log format.
# The size field is "-" rather than a number when no body is sent (e.g. HEAD requests).
failregex = ^<HOST> -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 (?:\d+|-)
ignoreregex = ^<HOST> -.*(robots\.txt|favicon\.ico|apple-touch-icon)
datepattern = %%d/%%b/%%Y:%%H:%%M:%%S %%z
```
### Step 2 — Add the jail
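Add to `/etc/fail2ban/jail.local`. The `logpath` below is the Debian/Ubuntu default; on Fedora/RHEL, Apache logs to `/var/log/httpd/access_log` by default, so adjust it to match your system: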
```ini
[apache-404scan]
enabled = true
port = http,https
filter = apache-404scan
logpath = /var/log/apache2/access.log
maxretry = 10
findtime = 1m
bantime = 24h
backend = polling
```
**10 hits in 1 minute** is aggressive enough to catch scanners (which fire 30 to 50+ requests in seconds) while avoiding false positives from a legitimate user hitting a few broken links.
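To check whether this threshold fits your own traffic before enabling the jail, a rough sketch like the one below lists IPs that produced at least N 404s within a single calendar minute. The function name `burst_404s` is hypothetical. It assumes the default combined/common field layout (status code in the 9th whitespace-separated field), and it buckets by calendar minute, which is coarser than Fail2ban's sliding `findtime` window.

```shell
# Hypothetical helper: list "count ip minute" for IPs with >= limit 404s
# in one calendar minute. Assumes status is field 9 and the timestamp is
# field 4 (default combined/common log layout).
burst_404s() {
  # $1 = access log path, $2 = threshold (defaults to 10, matching maxretry)
  awk -v limit="${2:-10}" '
    $9 == 404 {
      minute = substr($4, 2, 17)   # "[02/Apr/2026:11:00:01" -> "02/Apr/2026:11:00"
      hits[$1 " " minute]++
    }
    END {
      for (key in hits)
        if (hits[key] >= limit) print hits[key], key
    }
  ' "$1" | sort -rn
}

# Usage: burst_404s /var/log/apache2/access.log 10
```

If legitimate clients show up in this list, that suggests raising `maxretry` or extending `ignoreregex` before turning the jail on.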
> **Critical: `backend = polling` is required** if your `jail.local` or `jail.d/` sets `backend = systemd` in `[DEFAULT]` (common on Fedora/RHEL). Without it, fail2ban ignores the `logpath` and reads from journald instead — which Apache doesn't write to. The jail will appear active (`fail2ban-client status` shows it running) but `fail2ban-client get apache-404scan logpath` will return "No file is currently monitored" and zero IPs will ever be banned. This fails silently.
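
One way to find out whether a global `backend` override is in play is to search the Fail2ban config files for it. The sketch below is only a convenience wrapper around `grep`; the function name `find_backend_settings` is hypothetical, and it prints matching lines without the section they belong to, so you still need to open the file to confirm the setting sits under `[DEFAULT]`.

```shell
# Hypothetical helper: print every "backend =" line (file:line:text)
# found in the given Fail2ban config files.
find_backend_settings() {
  grep -HnE '^[[:space:]]*backend[[:space:]]*=' "$@" 2>/dev/null
}

# Usage:
#   find_backend_settings /etc/fail2ban/jail.local /etc/fail2ban/jail.d/*
```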
### Step 3 — Test the regex
```bash
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf
```
You should see matches. In a real-world test against a server under active scanning, this matched **2831 out of 8901** access log lines.
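To see why an individual line does or does not count, the filter logic can be loosely approximated with `grep -E`. This is a sketch, not a replacement for `fail2ban-regex`: `<HOST>` is swapped for a generic "non-space token" pattern, the size field is matched as plain digits, and the function name `would_count` and the sample log lines are made up for illustration.

```shell
# Rough grep -E approximation of the filter. <HOST> is replaced with [^ ]+,
# so this is looser than Fail2ban's real host matching.
fail_re='^[^ ]+ -.*"(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) .+" 404 [0-9]+'
ignore_re='^[^ ]+ -.*(robots\.txt|favicon\.ico|apple-touch-icon)'

# Prints "hit" if the line matches the fail pattern and not the ignore
# pattern, otherwise "skip".
would_count() {
  if printf '%s\n' "$1" | grep -Eq "$fail_re" &&
     ! printf '%s\n' "$1" | grep -Eq "$ignore_re"; then
    echo hit
  else
    echo skip
  fi
}

# Illustrative lines (documentation-range IP, invented requests):
would_count '203.0.113.7 - - [02/Apr/2026:11:00:01 -0400] "GET /.env HTTP/1.1" 404 196'
would_count '203.0.113.7 - - [02/Apr/2026:11:00:02 -0400] "GET /favicon.ico HTTP/1.1" 404 196'
would_count '203.0.113.7 - - [02/Apr/2026:11:00:03 -0400] "GET /index.html HTTP/1.1" 200 5120'
```

The first line counts toward `maxretry`; the second is excluded by the ignore pattern and the third is not a 404.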
### Step 4 — Reload Fail2ban
```bash
systemctl restart fail2ban
fail2ban-client status apache-404scan
```
## Why Default Jails Miss This
| Jail | Log Source | What It Matches | Why It Misses |
|---|---|---|---|
| `apache-noscript` | error log | "script not found or unable to stat" | Only matches script-type files (.php, .asp, .exe, .pl) |
| `apache-botsearch` | error log | "File does not exist" for specific paths | Requires Apache to write error log entries for 404s |
| **`apache-404scan`** | **access log** | **Any 404 response** | **N/A: matches every 404 in the access log (minus the `ignoreregex` exclusions)** |
The key insight: URL-encoded probes like `/%2f%2eenv%2econfig` that return 404 in the access log may not generate error log entries at all, making them invisible to the default filters.
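To gauge how common such probes are on your server, a small sketch like this lists 404 requests whose path contains percent-encoded bytes. The function name `encoded_404s` is hypothetical, and it assumes the default field layout (path in field 7, status in field 9).

```shell
# Hypothetical helper: print "ip path" for 404 requests whose path contains
# at least one percent-encoded byte (e.g. %2f, %2e).
encoded_404s() {
  awk '$9 == 404 && $7 ~ /%[0-9A-Fa-f][0-9A-Fa-f]/ { print $1, $7 }' "$1"
}

# Usage: encoded_404s /var/log/apache2/access.log
```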
## Pair With Recidive
If you have the `recidive` jail enabled, repeat offenders get permanently banned:
```ini
[recidive]
enabled = true
bantime = -1
findtime = 86400
maxretry = 3
```
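Three bans from any jail within one day result in a permanent firewall block. Because this jail's own bans last 24 hours, a single IP can only collect three bans inside that window through several different jails; widen `findtime` (for example, to a week) if you want repeated `apache-404scan` bans alone to trigger it.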
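To see which IPs are accumulating bans, a rough tally of `Ban` entries in the Fail2ban log can help. This sketch assumes the usual log line shape (`... NOTICE [jail] Ban <ip>`), counts across the whole file rather than within `findtime`, and does not exclude bans issued by `recidive` itself; the function name `ban_counts` is hypothetical.

```shell
# Hypothetical helper: print "count ip" for each IP with "[jail] Ban <ip>"
# entries in a Fail2ban log. "Restore Ban" lines are skipped because the
# token before "Ban" must end with "]".
ban_counts() {
  awk '
    {
      for (i = 2; i < NF; i++)
        if ($i == "Ban" && $(i - 1) ~ /\]$/) count[$(i + 1)]++
    }
    END { for (ip in count) print count[ip], ip }
  ' "$1" | sort -rn
}

# Usage: ban_counts /var/log/fail2ban.log
```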
## Quick Diagnostic Commands
```bash
# Test filter against current access log
fail2ban-regex /var/log/apache2/access.log /etc/fail2ban/filter.d/apache-404scan.conf
# Check jail status and banned IPs
fail2ban-client status apache-404scan
# IMPORTANT: verify the jail is actually monitoring the file
fail2ban-client get apache-404scan logpath
# Should show: /var/log/apache2/access.log
# If it shows "No file is currently monitored" — add backend = polling to the jail
# Watch bans in real time
tail -f /var/log/fail2ban.log | grep apache-404scan
# Count 404s in today's log
grep '" 404 ' /var/log/apache2/access.log | wc -l
```
## Key Notes
- The `ignoreregex` excludes `robots.txt`, `favicon.ico`, and `apple-touch-icon` — these are commonly requested and produce harmless 404s.
- Make sure your Tailscale subnet (`100.64.0.0/10`) is in the `ignoreip` list under `[DEFAULT]` so you don't ban your own monitoring or uptime checks.
- This filter works with both Apache **combined** and **common** log formats.
- Complements the existing `apache-dirscan` jail (which catches error-log-based directory enumeration). Use both for full coverage.
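
To judge whether `ignoreregex` needs more entries for your site, it can help to list the paths that most often return 404. A minimal sketch, assuming the default field layout (path in field 7, status in field 9); the function name `top_404_paths` is hypothetical.

```shell
# Hypothetical helper: print "count path" for the most frequently requested
# 404 paths, highest count first.
top_404_paths() {
  # $1 = access log path, $2 = number of rows to show (defaults to 10)
  awk '$9 == 404 { n[$7]++ } END { for (p in n) print n[p], p }' "$1" |
    sort -rn | head -n "${2:-10}"
}

# Usage: top_404_paths /var/log/apache2/access.log 20
```

Paths that legitimate browsers request on their own are candidates for `ignoreregex`; paths like `/.env` are the scanner traffic the jail is meant to catch.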