---
title: "Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS"
domain: troubleshooting
category: networking
tags: [fail2ban, ufw, nftables, vps, performance]
status: published
created: 2026-04-02
updated: 2026-04-02
---

# Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS
## 🛑 Problem

A small VPS (1–2 GB RAM) running Fail2ban with permanent bans (`bantime = -1`) gradually accumulates thousands of UFW DENY rules or nftables entries. Over time this causes:

- High memory usage from Fail2ban (100+ MB RSS)
- A bloated nftables ruleset (30k+ rules): any packet not matched by an earlier rule is checked against the whole list
- Netdata alerts flapping on RAM/swap thresholds
- Degraded packet-processing performance

---

## 🔍 Diagnosis
### Step 1 — Check Fail2ban memory and thread count

```bash
grep -E "VmRSS|VmSwap|Threads" "/proc/$(pgrep -ox fail2ban-server)/status"
```

On a small VPS, Fail2ban RSS over 80 MB is a red flag. Thread count scales with jail count (roughly 2 threads per jail + overhead).

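The 80 MB rule of thumb can be turned into a quick check. The following is a minimal sketch, not part of Fail2ban: `check_rss` is a hypothetical helper name, and it assumes the usual `/proc/<pid>/status` layout in which `VmRSS` is reported in kB.

```bash
# Sketch: flag a process whose resident memory exceeds a limit (in MB).
# Reads /proc/<pid>/status-style text on stdin. check_rss and its 80 MB
# default are illustrative assumptions, not a Fail2ban feature.
check_rss() {
    local limit_mb="${1:-80}"
    awk -v limit="$limit_mb" '
        /^VmRSS:/ {
            rss_mb = $2 / 1024   # the kernel reports VmRSS in kB
            if (rss_mb > limit)
                printf "WARN: RSS %.0f MB exceeds %d MB\n", rss_mb, limit
            else
                printf "OK: RSS %.0f MB\n", rss_mb
        }'
}

# Usage on a live system:
#   check_rss 80 < "/proc/$(pgrep -ox fail2ban-server)/status"
```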
---

### Step 2 — Count nftables/UFW rules

```bash
# Total drop/reject rules in nftables
nft list ruleset | grep -cE "reject|drop"

# UFW rule file size
wc -l /etc/ufw/user.rules
```

A healthy UFW setup has 10–30 rules. Thousands of rules mean that manual `ufw deny` commands or permanent Fail2ban bans have accumulated.

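A raw line count does not show how many of those lines are DENY rules. The sketch below breaks the file down by action. It assumes `user.rules` records each rule with a `### tuple ### <action> ...` comment line, which is how UFW normally writes the file; `count_ufw_actions` is a hypothetical helper name.

```bash
# Sketch: summarise a UFW rules file by action (allow, deny, limit, ...).
# Assumes one "### tuple ### <action> ..." comment line per rule.
# count_ufw_actions is illustrative, not a ufw subcommand.
count_ufw_actions() {
    local rules_file="${1:-/etc/ufw/user.rules}"
    awk '/^### tuple ###/ { count[$4]++ }
         END { for (a in count) printf "%s %d\n", a, count[a] }' "$rules_file" | sort
}

# Usage on a live system (needs read access to /etc/ufw):
#   count_ufw_actions /etc/ufw/user.rules
```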
---

### Step 3 — Identify dead jails

```bash
for jail in $(fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
    total=$(fail2ban-client status "$jail" | grep "Total banned" | awk '{print $NF}')
    echo "$jail: $total total bans"
done
```

Jails with zero total bans are dead weight: they burn threads and regex cycles for nothing.

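To print only the candidates for removal, the same loop can be narrowed to jails whose counter is zero. This is a sketch under assumptions: `list_jails` and `dead_jails` are hypothetical helper names, and the `F2B` variable exists only so the client command can be swapped out when trying the functions without a running Fail2ban.

```bash
# Sketch: print the jails whose "Total banned" counter is 0.
# F2B names the client command; making it overridable is an assumption
# made for testing and is not something Fail2ban itself provides.
F2B="${F2B:-fail2ban-client}"

list_jails() {
    "$F2B" status | sed -n 's/.*Jail list:[[:space:]]*//p' | tr -d ' ' | tr ',' '\n'
}

dead_jails() {
    local jail total
    for jail in $(list_jails); do
        total=$("$F2B" status "$jail" | awk '/Total banned/ {print $NF}')
        if [ "$total" = "0" ]; then
            echo "$jail"
        fi
    done
}

# Usage on a live system (as root):
#   dead_jails
```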
---

### Step 4 — Check ban policy

```bash
grep bantime /etc/fail2ban/jail.local
```

`bantime = -1` means permanent. On a public-facing server, scanner IPs rotate constantly, so permanent bans pile up with no benefit.

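A plain `grep` shows the `bantime` lines but not which section each one belongs to. The sketch below prints the section names that set a permanent ban. `permanent_ban_sections` is a hypothetical helper name, and it reads only the single file it is given: it does not reproduce how Fail2ban merges `.conf`, `.local` and `jail.d/` files.

```bash
# Sketch: list the [sections] of one jail config file that set bantime = -1.
# permanent_ban_sections is illustrative; it does not follow Fail2ban's
# config merging, so check every file that may define bantime.
permanent_ban_sections() {
    awk '
        /^[ \t]*\[/ {
            sub(/^[ \t]*\[/, ""); sub(/\].*$/, "")   # "[sshd]" becomes "sshd"
            section = $0
            next
        }
        /^[ \t]*bantime[ \t]*=[ \t]*-1[ \t]*$/ { print section }
    ' "$1"
}

# Usage on a live system:
#   permanent_ban_sections /etc/fail2ban/jail.local
```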
---

## ✅ Solution
### Fix 1 — Disable dead jails

Edit `/etc/fail2ban/jail.local` and set `enabled = false` for any jail with zero historical bans.

### Fix 2 — Switch to time-limited bans

```ini
[DEFAULT]
bantime = 30d

[recidive]
bantime = 90d
```

30 days is long enough to block active campaigns; repeat offenders get 90 days via the `recidive` jail. Scanner IPs rarely persist beyond a week.

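After reloading Fail2ban, `fail2ban-client get <jail> bantime` should report the effective value in seconds, so it helps to know what `30d` and `90d` work out to. The converter below is a sketch: `to_seconds` is a hypothetical helper name, and it handles only a single `s`, `m`, `h`, `d` or `w` suffix rather than every format Fail2ban accepts.

```bash
# Sketch: convert a single-unit duration such as "30d" into seconds.
# to_seconds is illustrative and covers only the s, m, h, d and w suffixes.
to_seconds() {
    local value="${1%[smhdw]}" unit="${1##*[0-9]}"
    case "$unit" in
        ""|s) echo "$value" ;;
        m) echo $((value * 60)) ;;
        h) echo $((value * 3600)) ;;
        d) echo $((value * 86400)) ;;
        w) echo $((value * 604800)) ;;
        *) echo "unsupported unit: $unit" >&2; return 1 ;;
    esac
}

to_seconds 30d   # prints 2592000
to_seconds 90d   # prints 7776000
```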
### Fix 3 — Flush accumulated bans

```bash
fail2ban-client unban --all
```

### Fix 4 — Reset bloated UFW rules

**Back up first:**

```bash
cp /etc/ufw/user.rules /etc/ufw/user.rules.bak
cp /etc/ufw/user6.rules /etc/ufw/user6.rules.bak
```

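With the backup in place, it can help to pull out the ALLOW entries so that nothing legitimate is forgotten when the rules are re-added. This sketch assumes the backup contains UFW's `### tuple ### allow ...` comment lines; `list_allow_tuples` is a hypothetical helper name, and its output is a reminder list, not a set of ready-to-run `ufw` commands.

```bash
# Sketch: print the ALLOW tuples recorded in a UFW rules backup.
# Assumes the "### tuple ### allow ..." comment format.
# list_allow_tuples is illustrative, not a ufw subcommand.
list_allow_tuples() {
    grep -E '^### tuple ### allow ' "$1" | sed 's/^### tuple ### //'
}

# Usage:
#   list_allow_tuples /etc/ufw/user.rules.bak
```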
**Reset and re-add only legitimate ALLOW rules:**

```bash
ufw --force reset
ufw default deny incoming
ufw default allow outgoing
ufw allow 443/tcp
ufw allow 80/tcp
ufw allow in on tailscale0 to any port 22 comment "SSH via Tailscale"
# Add any other ALLOW rules specific to your server
ufw --force enable
```

**Restart Fail2ban** so it re-creates its nftables chains:

```bash
systemctl restart fail2ban
```

---

## 🔁 Why This Happens

| Cause | Effect |
|---|---|
| `bantime = -1` (permanent) | Banned IP list grows forever; nftables rules never expire |
| Manual `ufw deny from <IP>` | Each adds a persistent rule to `user.rules`; survives reboots |
| Many jails with no hits | Each jail spawns 2+ threads, runs regex against logs continuously |
| Small VPS (1–2 GB RAM) | Fail2ban + nftables overhead becomes a significant fraction of total RAM |

---

## ⚠️ Key Notes

- **Deleting UFW rules one-by-one is impractical** at scale: `ufw delete` with 30k rules takes hours. A full reset and re-add is the only efficient path.
- **`ufw --force reset` also resets `before.rules` and `after.rules`.** UFW backs these up automatically, but verify your custom chains if any exist.
- **After flushing bans, expect a brief spike in 4xx responses** as previously blocked scanners hit Apache again. Fail2ban will re-ban them within minutes.
- **The Netdata `web_log_1m_successful` alert may fire** during this window; it will self-clear once bans repopulate.

---

## 🔎 Quick Diagnostic Commands

```bash
# Fail2ban memory usage
grep -E "VmRSS|VmSwap|Threads" "/proc/$(pgrep -ox fail2ban-server)/status"

# Count nftables rules
nft list ruleset | grep -cE "reject|drop"

# UFW rule count (counts the numbered "[ N]" lines)
ufw status numbered | grep -c '^\['

# List all jails with ban counts
for jail in $(fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
    banned=$(fail2ban-client status "$jail" | grep "Currently banned" | awk '{print $NF}')
    total=$(fail2ban-client status "$jail" | grep "Total banned" | awk '{print $NF}')
    echo "$jail: $banned current / $total total"
done

# Flush all bans
fail2ban-client unban --all
```