MajorWiki/05-troubleshooting/networking/fail2ban-ufw-rule-bloat-cleanup.md
MajorLinux 6592eb4fea wiki: audit fixes — broken links, wikilinks, frontmatter, stale content (66 files)
- Fixed 4 broken markdown links (bad relative paths in See Also sections)
- Corrected n8n port binding to 127.0.0.1:5678 (matches actual deployment)
- Updated SnapRAID article with actual majorhome paths (/majorRAID, disk1-3)
- Converted 67 Obsidian wikilinks to relative markdown links or plain text
- Added YAML frontmatter to 35 articles missing it entirely
- Completed frontmatter on 8 articles with missing fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 11:16:29 -04:00

---
title: "Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS"
domain: troubleshooting
category: networking
tags: [fail2ban, ufw, nftables, vps, performance]
status: published
created: 2026-04-02
updated: 2026-04-02
---
# Fail2ban & UFW Rule Bloat: 30k Rules Slowing Down a VPS
## 🛑 Problem
A small VPS (1-2 GB RAM) running Fail2ban with permanent bans (`bantime = -1`) gradually accumulates thousands of UFW DENY rules or nftables entries. Over time this causes:
- High memory usage from Fail2ban (100+ MB RSS)
- Bloated nftables ruleset (30k+ rules) — every incoming packet must traverse the full list
- Netdata alerts flapping on RAM/swap thresholds
- Degraded packet processing performance
---
## 🔍 Diagnosis
### Step 1 — Check Fail2ban memory and thread count
```bash
grep -E "VmRSS|VmSwap|Threads" /proc/$(pgrep -ox fail2ban-server)/status
```
On a small VPS, Fail2ban RSS over 80 MB is a red flag. Thread count scales with jail count (roughly 2 threads per jail + overhead).
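The 80 MB red flag can be turned into a scriptable check. This is a minimal sketch, assuming the usual `/proc/<pid>/status` layout where `VmRSS` is reported in kB; the function name `rss_over_limit` and its default limit are illustrative and not part of Fail2ban.

```bash
# Hypothetical helper: report whether a process's resident memory exceeds a limit.
# Usage: rss_over_limit STATUS_FILE [LIMIT_MB]
rss_over_limit() {
    local status_file=$1
    local limit_mb=${2:-80}   # 80 MB is the red-flag figure used in this article
    local rss_kb
    # The VmRSS line looks like: "VmRSS:    102400 kB"
    rss_kb=$(awk '/^VmRSS:/ {print $2}' "$status_file")
    if [ -z "$rss_kb" ]; then
        echo "no VmRSS line in $status_file" >&2
        return 2
    fi
    local rss_mb=$(( rss_kb / 1024 ))
    if [ "$rss_mb" -gt "$limit_mb" ]; then
        echo "WARN: RSS ${rss_mb} MB exceeds ${limit_mb} MB"
    else
        echo "OK: RSS ${rss_mb} MB"
    fi
}
```

Example call: `rss_over_limit "/proc/$(pgrep -ox fail2ban-server)/status"`.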
---
### Step 2 — Count nftables/UFW rules
```bash
# Total drop/reject rules in nftables
nft list ruleset | grep -c "reject\|drop"
# UFW rule file size
wc -l /etc/ufw/user.rules
```
A healthy UFW setup has 10-30 rules. Thousands of rules mean that manual `ufw deny` commands or permanent Fail2ban bans have accumulated.
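A raw line count does not show how much of the file is DENY rules. The sketch below breaks the file down by action. It assumes each rule in `user.rules` is preceded by a `### tuple ### <action> ...` comment line, which is how UFW normally writes the file; the function name `ufw_rules_by_action` is made up for this example.

```bash
# Hypothetical helper: summarise a UFW user.rules file by rule action.
# Assumes every rule has a "### tuple ### <action> ..." comment line above it.
# Usage: ufw_rules_by_action /etc/ufw/user.rules
ufw_rules_by_action() {
    # Field 4 of a tuple line is the action (allow, deny, limit, reject, ...)
    awk '/^### tuple ###/ { count[$4]++ }
         END { for (action in count) print action, count[action] }' "$1" | sort
}
```

If `deny` dominates the output by orders of magnitude, the bloat comes from accumulated blocks and not from your service rules.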
---
### Step 3 — Identify dead jails
```bash
for jail in $(fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
    total=$(fail2ban-client status "$jail" | grep "Total banned" | awk '{print $NF}')
    echo "$jail: $total total bans"
done
```
Jails with zero total bans are dead weight — burning threads and regex cycles for nothing.
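With many jails it helps to print only the dead ones. This is a small sketch that filters the loop's output; it assumes the `<jail>: <N> total bans` line format produced above, and `dead_jails` is a hypothetical name.

```bash
# Hypothetical filter: read "<jail>: <N> total bans" lines on stdin and
# print only the names of jails whose total is zero.
dead_jails() {
    # Split on ": "; require a leading digit so empty or malformed totals are skipped
    awk -F': ' '$2 ~ /^[0-9]/ && $2 + 0 == 0 { print $1 }'
}
```

Pipe the loop into it by changing the last line to `done | dead_jails`.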
---
### Step 4 — Check ban policy
```bash
grep bantime /etc/fail2ban/jail.local
```
`bantime = -1` means permanent. On a public-facing server, scanner IPs rotate constantly — permanent bans just pile up with no benefit.
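A plain `grep bantime` also matches commented-out lines and only looks at one file. The sketch below narrows the search to active `bantime = -1` settings across several files. The function name `find_permanent_bans` is illustrative, and the `jail.d` glob in the usage line is an assumption about where your overrides live.

```bash
# Hypothetical helper: list active (non-comment) "bantime = -1" lines
# with file name and line number.
# Usage: find_permanent_bans FILE...
find_permanent_bans() {
    # Anchored at both ends so "# bantime = -1" and "bantime = -10" do not match
    grep -HnE '^[[:space:]]*bantime[[:space:]]*=[[:space:]]*-1[[:space:]]*$' "$@"
}
```

Example call: `find_permanent_bans /etc/fail2ban/jail.local /etc/fail2ban/jail.d/*.conf`. No output means no active permanent ban policy in those files.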
---
## ✅ Solution
### Fix 1 — Disable dead jails
Edit `/etc/fail2ban/jail.local` and set `enabled = false` for any jail with zero historical bans.
### Fix 2 — Switch to time-limited bans
```ini
[DEFAULT]
bantime = 30d
[recidive]
bantime = 90d
```
30 days is long enough to block active campaigns; repeat offenders get 90 days via recidive. Scanner IPs rarely persist beyond a week.
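After reloading, `fail2ban-client get <jail> bantime` can confirm the effective value, which is normally reported in seconds. The sketch below converts the suffixed form to seconds for comparison. It handles only the `s`, `m`, `h`, `d`, and `w` suffixes, and `bantime_to_seconds` is a hypothetical helper, not a Fail2ban command.

```bash
# Hypothetical helper: convert a bantime such as "30d" to seconds.
# Supports plain numbers and the s/m/h/d/w suffixes only.
bantime_to_seconds() {
    local value=$1
    local number=${value%[smhdw]}   # strip one trailing unit letter, if present
    local unit=${value#"$number"}   # whatever was stripped (may be empty)
    case $unit in
        ''|s) echo "$number" ;;
        m) echo $(( number * 60 )) ;;
        h) echo $(( number * 3600 )) ;;
        d) echo $(( number * 86400 )) ;;
        w) echo $(( number * 604800 )) ;;
    esac
}
```

For the values above, `bantime_to_seconds 30d` gives 2592000 and `bantime_to_seconds 90d` gives 7776000.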
### Fix 3 — Flush accumulated bans
```bash
fail2ban-client unban --all
```
### Fix 4 — Reset bloated UFW rules
**Back up first:**
```bash
cp /etc/ufw/user.rules /etc/ufw/user.rules.bak
cp /etc/ufw/user6.rules /etc/ufw/user6.rules.bak
```
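Before resetting, it can help to pull the existing ALLOW rules out of the backup so none are forgotten when re-adding them. This is a sketch under the assumption that `user.rules` records each rule as a `### tuple ### <action> ...` comment line; `list_allow_tuples` is a made-up name.

```bash
# Hypothetical helper: print the tuple description of every allow rule
# found in a UFW rules file (or a backup of one).
# Usage: list_allow_tuples /etc/ufw/user.rules.bak
list_allow_tuples() {
    # Keep only tuple lines whose action starts with "allow", minus the prefix
    sed -n 's/^### tuple ### \(allow.*\)$/\1/p' "$1"
}
```

Each output line describes one rule to translate back into a `ufw allow ...` command after the reset.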
**Reset and re-add only legitimate ALLOW rules:**
```bash
ufw --force reset
ufw default deny incoming
ufw default allow outgoing
ufw allow 443/tcp
ufw allow 80/tcp
ufw allow in on tailscale0 to any port 22 comment "SSH via Tailscale"
# Add any other ALLOW rules specific to your server
ufw --force enable
```
**Restart Fail2ban** so it re-creates its nftables chains:
```bash
systemctl restart fail2ban
```
---
## 🔁 Why This Happens
| Cause | Effect |
|---|---|
| `bantime = -1` (permanent) | Banned IP list grows forever; nftables rules never expire |
| Manual `ufw deny from <IP>` | Each adds a persistent rule to `user.rules`; survives reboots |
| Many jails with no hits | Each jail spawns 2+ threads, runs regex against logs continuously |
| Small VPS (1-2 GB RAM) | Fail2ban + nftables overhead becomes a significant fraction of total RAM |
---
## ⚠️ Key Notes
- **Deleting UFW rules one-by-one is impractical** at scale — `ufw delete` with 30k rules takes hours. A full reset + re-add is the only efficient path.
- **`ufw --force reset` also resets `before.rules` and `after.rules`** — UFW auto-backs these up, but verify your custom chains if any exist.
- **After flushing bans, expect a brief spike in 4xx responses** as scanners that were previously blocked hit Apache again. Fail2ban will re-ban them within minutes.
- **The Netdata `web_log_1m_successful` alert may fire** during this window — it will self-clear once bans repopulate.
---
## 🔎 Quick Diagnostic Commands
```bash
# Fail2ban memory usage
grep -E "VmRSS|VmSwap|Threads" /proc/$(pgrep -ox fail2ban-server)/status
# Count nftables rules
nft list ruleset | grep -c "reject\|drop"
# UFW rule count (numbered rule lines start with "[")
ufw status numbered | grep -c '^\['
# List all jails with ban counts
for jail in $(fail2ban-client status | grep "Jail list" | sed 's/.*://;s/,/ /g'); do
    banned=$(fail2ban-client status "$jail" | grep "Currently banned" | awk '{print $NF}')
    total=$(fail2ban-client status "$jail" | grep "Total banned" | awk '{print $NF}')
    echo "$jail: $banned current / $total total"
done
# Flush all bans
fail2ban-client unban --all
```