Add wiki article: Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
New article covering the conversion from per-ban email alerts to a three-tier model (silent default, sshd/recidive immediate, daily digest). Includes Ansible automation, gotchas with lineinfile regex collisions, and fq-hostname override for clean subjects. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent f9c61fbac3
commit 46ae9ac97e
2 changed files with 174 additions and 1 deletions
173
02-selfhosting/security/fail2ban-digest-mode-fleet.md
Normal file
@ -0,0 +1,173 @@
---
title: "Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts"
domain: selfhosting
category: security
tags: [fail2ban, security, email, ansible, fleet, cron, digest]
status: published
created: 2026-04-22
updated: 2026-04-22
---
# Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts

## The Problem

With `action_mwl` as the default action, Fail2Ban sends an email for every single ban, containing the IP address, a whois lookup, and the relevant log lines. On a fleet of 9 servers running 50+ jails, this floods the inbox with hundreds of emails daily. The vast majority are routine scanner bans (probing `.env`, `wp-login.php`, random PHP paths) that require no human attention.

The signal-to-noise ratio is terrible. Genuinely important events, such as SSH brute-force attempts and recidive escalations, get buried in the noise.

## The Solution — Tiered Alert Model

Three tiers replace the firehose:

| Tier | Jails | Action | Why |
|------|-------|--------|-----|
| **Immediate email** | `sshd`, `recidive` | `action_mwl` | Security-critical: someone is actively targeting auth or is a repeat offender |
| **Silent ban** | Everything else | `action_` (default) | Ban happens, firewall rule applied, no email sent |
| **Daily digest** | All jails | Cron script at 08:00 UTC | One summary email per host with ban counts across all jails |

This reduces email volume from hundreds per day to roughly 10 (one digest per host plus occasional `sshd`/`recidive` alerts).

## jail.local Configuration

### Set silent ban as the default

In `[DEFAULT]`:

```ini
[DEFAULT]
action = %(action_)s
```

This replaces `action_mwl` as the default action for all jails. Bans still happen and the firewall rule is applied, but no email is sent.

### Keep immediate alerts for critical jails

```ini
[sshd]
enabled = true
action = %(action_mwl)s

[recidive]
enabled = true
action = %(action_mwl)s
```

### Clean up email subjects with fq-hostname

By default, fail2ban uses the system FQDN in email subjects. On Tailscale hosts, this produces ugly subjects like `[Fail2Ban] sshd: banned 1.2.3.4 on MajorToot.tail7f2d9.ts.net`. Override it in `[DEFAULT]`:

```ini
[DEFAULT]
fq-hostname = majortoot
```

This sets the `<fq-hostname>` tag used in action templates, producing cleaner subjects: `[Fail2Ban] sshd: banned 1.2.3.4 on majortoot`.

## Daily Digest Script

A shell script at `/usr/local/bin/fail2ban-digest.sh` runs via cron at 08:00 UTC. It does the following:

1. Queries every active jail via `fail2ban-client status <jail>`
2. Collects four metrics per jail: currently banned, total banned, currently failed, total failed
3. Builds a plain-text email body with one line per jail
4. Separates "active" jails (with bans) from "quiet" jails (zero bans) for quick scanning
5. Sets the subject line to include the total banned count (e.g., `[Fail2Ban Digest] majortoot — 47 total bans`)
6. Sends via `sendmail`

The script is rendered from a template and deployed to each host by Ansible.

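The steps listed above can be sketched in Python for illustration. This is not the deployed shell script: the function names, the body layout, the plain colon in the subject, and the choice to treat a jail as "active" when its total banned count is above zero are all assumptions. The sample text follows the layout that `fail2ban-client status <jail>` prints.

```python
import re

# Sample text in the layout printed by `fail2ban-client status <jail>`.
SAMPLE_STATUS = """\
Status for the jail: sshd
|- Filter
|  |- Currently failed:\t2
|  |- Total failed:\t40
|  `- File list:\t/var/log/auth.log
`- Actions
   |- Currently banned:\t1
   |- Total banned:\t7
   `- Banned IP list:\t203.0.113.5
"""

METRICS = ("Currently failed", "Total failed", "Currently banned", "Total banned")


def parse_status(text):
    """Return the four counters from one jail's status output as a dict."""
    counts = {}
    for label in METRICS:
        match = re.search(rf"{label}:\s*(\d+)", text)
        counts[label] = int(match.group(1)) if match else 0
    return counts


def build_digest(host, per_jail):
    """Build (subject, body) from {jail_name: counts}; active jails are listed first."""
    # Assumption: "active" means the jail has recorded at least one ban.
    active = {j: c for j, c in per_jail.items() if c["Total banned"] > 0}
    quiet = sorted(j for j in per_jail if j not in active)
    total = sum(c["Total banned"] for c in per_jail.values())
    lines = ["Active jails:"]
    for jail in sorted(active):
        c = active[jail]
        lines.append(
            f"  {jail}: banned {c['Currently banned']}/{c['Total banned']}, "
            f"failed {c['Currently failed']}/{c['Total failed']}"
        )
    lines.append("Quiet jails: " + (", ".join(quiet) or "none"))
    subject = f"[Fail2Ban Digest] {host}: {total} total bans"
    return subject, "\n".join(lines)
```

Putting the total in the subject means a quiet night is visible from the inbox listing alone, without opening the message.
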
### Cron entry

```
0 8 * * * /usr/local/bin/fail2ban-digest.sh
```

The entry is managed by `ansible.builtin.cron`, so no manual crontab editing is needed.

## Ansible Deployment

The playbook `configure_fail2ban_digest.yml` deploys the full digest model fleet-wide.

### What it does

1. Deploys a Python helper script that performs **section-aware editing** of `jail.local` (see Gotchas below)
2. Sets `action = %(action_)s` in `[DEFAULT]`
3. Sets `action = %(action_mwl)s` in `[sshd]` and `[recidive]`
4. Sets `fq-hostname` per host using an override dict
5. Deploys the digest script from a Jinja2 template
6. Creates the cron job via `ansible.builtin.cron`
7. Restarts fail2ban

### Host-specific overrides

Two dictionaries in the playbook vars handle per-host customization:

```yaml
fail2ban_sender_overrides:
  majormail: "fail2ban@majorshouse.com"
  dcaprod: "fail2ban@dcanalysts.net"

fail2ban_hostname_overrides:
  majortoot: "majortoot"
  teelia: "teelia"
  majormail: "majormail"
```

These feed into the Python editor script and the digest template.

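How such override dictionaries might be consumed can be sketched as a plain lookup with a fallback. The function name `resolve_settings`, the `default_sender` parameter, and the rule that a host without a hostname override keeps fail2ban's own `fq-hostname` are assumptions made for illustration; the playbook's actual logic is not shown in this article.

```python
# Dictionaries copied from the playbook vars.
SENDER_OVERRIDES = {
    "majormail": "fail2ban@majorshouse.com",
    "dcaprod": "fail2ban@dcanalysts.net",
}
HOSTNAME_OVERRIDES = {
    "majortoot": "majortoot",
    "teelia": "teelia",
    "majormail": "majormail",
}


def resolve_settings(inventory_name, default_sender):
    """Return the per-host values an editor script or template could consume.

    fq_hostname is None when no override exists (assumption: the caller then
    leaves fq-hostname untouched). default_sender is a hypothetical fleet-wide
    fallback address.
    """
    return {
        "fq_hostname": HOSTNAME_OVERRIDES.get(inventory_name),
        "sender": SENDER_OVERRIDES.get(inventory_name, default_sender),
    }
```

Keeping the overrides in two small dictionaries means a new host needs no entry at all unless its defaults are wrong.
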
### Why not lineinfile?

The playbook edits `jail.local` with a Python script instead of Ansible's `lineinfile` module. This is deliberate; see the Gotchas section below.

## Gotchas

### lineinfile matches stock action definitions

Using `lineinfile` with a regex that is not anchored tightly to the `action` key, such as `regexp: '^action.*=\s*%\(action_mwl\)s'`, is dangerous. `lineinfile` is not section-aware, and in a full `jail.local` that includes stock action *definitions* (copied from `jail.conf`), such a regex also matches lines like:

```ini
action_mwl = %(action_mwl)s
```

This is the definition of the `action_mwl` macro itself, not the `action =` assignment in `[DEFAULT]`. `lineinfile` replaces the wrong line, corrupting the config. Fail2ban then refuses to start because the `action_mwl` macro is undefined.

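`lineinfile` uses Python regular expressions, so candidate patterns can be checked with `re` before they go into a playbook. The three patterns below are illustrative assumptions, not patterns taken from the playbook. They show a loose pattern that hits both lines, an unescaped pattern whose parentheses are read as a group, and a strict pattern that hits only the assignment.

```python
import re

MACRO_LINE = "action_mwl = %(action_mwl)s"  # macro definition line from this section
ASSIGNMENT = "action = %(action_mwl)s"      # the line that is meant to change

# Loose: any key that merely starts with "action" qualifies.
loose = re.compile(r"^action.*=\s*%\(action_mwl\)s")
# Unescaped: "(action_mwl)" is a regex group, so the literal parentheses never match.
unescaped = re.compile(r"^action\s*=\s*%(action_mwl)s")
# Strict: exact key name and escaped parentheses.
strict = re.compile(r"^action\s*=\s*%\(action_mwl\)s")


def matches(pattern, line):
    """True if the compiled pattern matches anywhere in the line."""
    return pattern.search(line) is not None


for name, pattern in (("loose", loose), ("unescaped", unescaped), ("strict", strict)):
    print(name, matches(pattern, MACRO_LINE), matches(pattern, ASSIGNMENT))
```

Even the strict pattern cannot tell `[DEFAULT]` from `[sshd]`, because a per-line regex has no notion of which section a line belongs to.
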
**Solution:** Use a Python script that parses INI sections and only modifies the `action` key within the correct `[section]`. This is what the Ansible playbook does.
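
A minimal sketch of section-aware editing, assuming a line-based approach rather than the playbook's actual helper (which is not reproduced here). The function name `set_option` and the simplified sample file are hypothetical. The function replaces the key only inside the named section, drops duplicates and indented continuation lines of the old value, and appends the key if the section lacks it.

```python
import re

SECTION_RE = re.compile(r"^\[([^\]]+)\]\s*$")


def set_option(text, section, key, value):
    """Set `key = value` inside [section] only, leaving other sections untouched."""
    key_re = re.compile(rf"^{re.escape(key)}\s*=")
    new_line = f"{key} = {value}"
    out = []
    current = None    # name of the section currently being read
    written = False   # True once new_line has been emitted
    skipping = False  # True while dropping continuation lines of an old value
    for line in text.splitlines():
        if skipping and line[:1] in (" ", "\t") and line.strip():
            continue
        skipping = False
        header = SECTION_RE.match(line)
        if header:
            # Leaving the target section without having seen the key: add it.
            if current == section and not written:
                out.append(new_line)
                written = True
            current = header.group(1)
        elif current == section and key_re.match(line):
            if not written:
                out.append(new_line)
                written = True
            skipping = True
            continue  # drop the old line and any duplicates
        out.append(line)
    if not written:
        if current != section:  # the section never appeared in the file
            out.append(f"[{section}]")
        out.append(new_line)
    return "\n".join(out) + "\n"


# Simplified, hypothetical jail.local with a duplicated action line.
SAMPLE = """[DEFAULT]
action_mwl = %(action_)s
action = %(action_mwl)s
action = %(action_mwl)s

[sshd]
enabled = true
"""

step1 = set_option(SAMPLE, "DEFAULT", "action", "%(action_)s")
step2 = set_option(step1, "sshd", "action", "%(action_mwl)s")
print(step2)
```

Because the key pattern requires `=` directly after the key name (allowing whitespace), the `action_mwl` definition is never mistaken for the `action` assignment.
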

### Duplicate action lines crash fail2ban

If a section ends up with two `action =` lines (e.g., from a botched `lineinfile` run), fail2ban refuses to load:

```
option 'action' in section 'DEFAULT' already exists
```

The Python editor script handles this by replacing existing keys rather than appending.

### fq-hostname scope

Setting `fq-hostname` in `[DEFAULT]` affects all action templates that use the `<fq-hostname>` tag, including both immediate emails and the digest subject. This is the desired behavior, but be aware that it overrides the system hostname globally within fail2ban.

## Verifying the Setup

```bash
# Check that a non-critical jail is silent.
# DEFAULT is a config section, not a jail, so query a real jail:
# replace <jail> with any jail other than sshd or recidive.
fail2ban-client get <jail> actions
# Should NOT list a 'sendmail' or 'mail' action

# Check that sshd still sends email
fail2ban-client get sshd actions
# Should list 'sendmail-whois-lines' or similar

# Trigger a test digest
/usr/local/bin/fail2ban-digest.sh

# Check cron is installed
crontab -l | grep fail2ban-digest
```

## See Also

- [fail2ban-apache-404-scanner-jail](fail2ban-apache-404-scanner-jail.md): custom jail for catching 404 scanners
- [fail2ban-wordpress-login-jail](fail2ban-wordpress-login-jail.md): WordPress brute-force jail
- [ssh-hardening-ansible-fleet](ssh-hardening-ansible-fleet.md): fleet SSH hardening
- [clamav-fleet-deployment](clamav-fleet-deployment.md): another fleet-wide security deployment via Ansible

@ -1,6 +1,6 @@
---
created: 2026-04-02T16:03
-updated: 2026-04-18T18:48
+updated: 2026-04-21T09:17
---
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)