Add wiki article: Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts

New article covering the conversion from per-ban email alerts to a
three-tier model (silent default, sshd/recidive immediate, daily digest).
Includes Ansible automation, gotchas with lineinfile regex collisions,
and fq-hostname override for clean subjects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Marcus Summers 2026-04-22 09:21:49 -04:00
parent f9c61fbac3
commit 46ae9ac97e
2 changed files with 174 additions and 1 deletions


@ -0,0 +1,173 @@
---
title: "Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts"
domain: selfhosting
category: security
tags: [fail2ban, security, email, ansible, fleet, cron, digest]
status: published
created: 2026-04-22
updated: 2026-04-22
---
# Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
## The Problem
With `action_mwl` configured as the action, Fail2Ban sends an email for every single ban, containing the IP address, a whois lookup, and the relevant log lines. On a fleet of 9 servers running 50+ jails, this floods the inbox with hundreds of emails daily. The vast majority are routine scanner bans (probing `.env`, `wp-login.php`, random PHP paths) that require no human attention.
The signal-to-noise ratio is terrible. Genuinely important events — SSH brute-force attempts, recidive escalations — get buried in the noise.
## The Solution — Tiered Alert Model
Three tiers replace the firehose:
| Tier | Jails | Action | Why |
|------|-------|--------|-----|
| **Immediate email** | `sshd`, `recidive` | `action_mwl` | Security-critical — someone is actively targeting auth or is a repeat offender |
| **Silent ban** | Everything else | `action_` (default) | Ban happens, firewall rule applied, no email sent |
| **Daily digest** | All jails | Cron script at 08:00 UTC | One summary email per host with ban counts across all jails |
This reduces email volume from hundreds per day to ~10 (one digest per host + occasional sshd/recidive alerts).
## jail.local Configuration
### Set silent ban as the default
In `[DEFAULT]`:
```ini
[DEFAULT]
action = %(action_)s
```
This replaces `action_mwl` as the action for every jail that does not set its own. Bans still happen and the firewall rule is applied, but no email is sent.
### Keep immediate alerts for critical jails
```ini
[sshd]
enabled = true
action = %(action_mwl)s
[recidive]
enabled = true
action = %(action_mwl)s
```
### Clean up email subjects with fq-hostname
By default, fail2ban uses the system FQDN in email subjects. On Tailscale hosts, this produces ugly subjects like `[Fail2Ban] sshd: banned 1.2.3.4 on MajorToot.tail7f2d9.ts.net`. Override it in `[DEFAULT]`:
```ini
[DEFAULT]
fq-hostname = majortoot
```
This sets the `<fq-hostname>` tag used in action templates, producing cleaner subjects: `[Fail2Ban] sshd: banned 1.2.3.4 on majortoot`.
## Daily Digest Script
A shell script at `/usr/local/bin/fail2ban-digest.sh` runs via cron at 08:00 UTC. It does the following:
1. Queries every active jail via `fail2ban-client status <jail>`
2. Collects four metrics per jail: currently banned, total banned, currently failed, total failed
3. Builds a plain-text email body with one line per jail
4. Separates "active" jails (with bans) from "quiet" jails (zero bans) for quick scanning
5. Sets the subject line to include the total banned count (e.g., `[Fail2Ban Digest] majortoot — 47 total bans`)
6. Sends via `sendmail`
The script is templated and deployed to each host by Ansible.
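Steps 1 through 5 can be sketched in a few lines. The deployed script is a shell script and is not reproduced here; the Python below only illustrates the parsing and grouping logic. The function names, the body layout, and the plain hyphen in the subject are assumptions made for this sketch.

```python
import re
import subprocess

# Counter labels as printed by `fail2ban-client status <jail>`.
METRICS = ("Currently failed", "Total failed", "Currently banned", "Total banned")


def client(*args: str) -> str:
    """Run fail2ban-client (needs root) and return its stdout."""
    result = subprocess.run(
        ["fail2ban-client", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_status(text: str) -> dict[str, int]:
    """Extract the four counters from one jail's status output (0 if absent)."""
    counts = {}
    for label in METRICS:
        match = re.search(rf"{label}:\s*(\d+)", text)
        counts[label] = int(match.group(1)) if match else 0
    return counts


def collect() -> dict[str, dict[str, int]]:
    """Query every active jail and return its counters keyed by jail name."""
    match = re.search(r"Jail list:\s*(.*)", client("status"))
    jails = [j.strip() for j in match.group(1).split(",") if j.strip()] if match else []
    return {jail: parse_status(client("status", jail)) for jail in jails}


def build_digest(host: str, stats: dict[str, dict[str, int]]) -> tuple[str, str]:
    """Return (subject, body) with jails that have bans listed before quiet ones."""
    active = {j: s for j, s in stats.items() if s["Total banned"] > 0}
    quiet = sorted(j for j in stats if j not in active)
    total = sum(s["Total banned"] for s in active.values())
    subject = f"[Fail2Ban Digest] {host} - {total} total bans"
    lines = ["Active jails:"]
    for jail, s in sorted(active.items()):
        lines.append(
            f"  {jail}: banned {s['Currently banned']} now / {s['Total banned']} total, "
            f"failed {s['Currently failed']} now / {s['Total failed']} total"
        )
    lines.append("Quiet jails: " + (", ".join(quiet) or "none"))
    return subject, "\n".join(lines)
```

Calling `build_digest("majortoot", collect())` on a host yields the subject and body; handing them to `sendmail` (step 6) is left out of the sketch.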
### Cron entry
```
0 8 * * * /usr/local/bin/fail2ban-digest.sh
```
Managed by `ansible.builtin.cron` — no manual crontab editing needed.
## Ansible Deployment
The playbook `configure_fail2ban_digest.yml` deploys the full digest model fleet-wide.
### What it does
1. Deploys a Python helper script that performs **section-aware editing** of `jail.local` (see gotchas below)
2. Sets `action = %(action_)s` in `[DEFAULT]`
3. Sets `action = %(action_mwl)s` in `[sshd]` and `[recidive]`
4. Sets `fq-hostname` per host using an override dict
5. Deploys the digest script from a Jinja2 template
6. Creates the cron job via `ansible.builtin.cron`
7. Restarts fail2ban
### Host-specific overrides
Two dictionaries in the playbook vars handle per-host customization:
```yaml
fail2ban_sender_overrides:
  majormail: "fail2ban@majorshouse.com"
  dcaprod: "fail2ban@dcanalysts.net"
fail2ban_hostname_overrides:
  majortoot: "majortoot"
  teelia: "teelia"
  majormail: "majormail"
```
These feed into the Python editor script and the digest template.
### Why not lineinfile?
The playbook uses a Python script for `jail.local` editing instead of Ansible's `lineinfile` module. This is deliberate — see the gotchas section below.
## Gotchas
### lineinfile matches stock action definitions
Using `lineinfile` with a loosely anchored regex such as `regexp: '^action.*='` is dangerous. In a full `jail.local` that includes stock action *definitions* (copied from `jail.conf`), the regex also matches macro definition lines like:
```ini
action_mwl = %(action_)s
```
This is the first line of the stock definition of the `action_mwl` macro itself (as shipped in recent Fail2Ban releases), not the `action =` assignment in `[DEFAULT]`. `lineinfile` replaces only the last line that matches and has no notion of INI sections, so it can overwrite the wrong line and corrupt the config. Fail2ban then refuses to start because the `action_mwl` macro is undefined.
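To see how a line-based pattern behaves on a file like this, candidate regexes can be tried with Python's `re` module, which is the engine `lineinfile` uses. The fragment below is a simplified, hypothetical `jail.local`, and both patterns are illustrative rather than taken from the playbook.

```python
import re

# Simplified, hypothetical jail.local fragment (macro bodies shortened).
JAIL_LOCAL = """\
[DEFAULT]
action_ = %(banaction)s[port="%(port)s"]
action_mwl = %(action_)s
action = %(action_mwl)s

[sshd]
enabled = true
action = %(action_mwl)s
"""


def matching_lines(pattern: str, text: str) -> list[str]:
    """Return every line the pattern matches, searching line by line."""
    rx = re.compile(pattern)
    return [line for line in text.splitlines() if rx.search(line)]


# A loose pattern also hits the macro definitions action_ and action_mwl.
loose = matching_lines(r"^action", JAIL_LOCAL)

# A tighter pattern skips the macros but still cannot tell sections apart:
# it matches the assignment in [DEFAULT] and the one in [sshd].
tight = matching_lines(r"^action\s*=", JAIL_LOCAL)

print(len(loose), len(tight))
```

Even the tighter pattern matches in two sections, and since only the last match is replaced, the edit would land in `[sshd]` rather than `[DEFAULT]`.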
**Solution:** Use a Python script that parses INI sections and only modifies the `action` key within the correct `[section]`. This is what the Ansible playbook does.
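A minimal sketch of what section-aware editing can look like follows. It is not the playbook's actual helper, which is not shown in this article; `set_option` and its behavior (insert right after the section header when the key is missing, drop duplicate copies and their continuation lines) are assumptions for illustration.

```python
import re

SECTION_RE = re.compile(r"^\[(?P<name>[^\]]+)\]\s*$")


def set_option(text: str, section: str, key: str, value: str) -> str:
    """Set `key = value` inside `[section]` only, replacing existing copies."""
    # Anchored on the exact key, so `action` never matches `action_mwl`.
    key_re = re.compile(rf"^{re.escape(key)}\s*[=:]")
    new_line = f"{key} = {value}"
    out: list[str] = []
    current = None      # name of the section being read
    header_at = None    # index in `out` just after the target section header
    written = False     # whether new_line has been emitted
    skipping = False    # inside the continuation lines of a replaced value

    for line in text.splitlines():
        header = SECTION_RE.match(line)
        if header:
            current, skipping = header.group("name"), False
            out.append(line)
            if current == section and header_at is None:
                header_at = len(out)
            continue
        if skipping and line[:1] in (" ", "\t") and line.strip():
            continue  # indented continuation of the old value: drop it
        skipping = False
        if current == section and key_re.match(line):
            if not written:
                out.append(new_line)
                written = True
            skipping = True  # duplicates of the key are dropped as well
            continue
        out.append(line)

    if not written:
        if header_at is None:
            out += [f"[{section}]", new_line]  # section missing: create it
        else:
            out.insert(header_at, new_line)    # key missing: add under header
    return "\n".join(out) + "\n"
```

Running the function twice with the same arguments leaves the text unchanged, which keeps an Ansible task built on it idempotent.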
### Duplicate action lines crash fail2ban
If a section ends up with two `action =` lines (e.g., from a botched `lineinfile` run), fail2ban refuses to load:
```
option 'action' in section 'DEFAULT' already exists
```
The Python editor script handles this by replacing existing keys rather than appending.
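The wording of that error matches Python's `configparser.DuplicateOptionError`, so a strict `configparser` read can serve as a rough pre-flight check before restarting the service. This is a sketch under that assumption; `first_duplicate` is a hypothetical helper, and it does not handle repeated section headers or other parse errors.

```python
import configparser


def first_duplicate(text: str):
    """Return (section, option, line_number) for the first duplicated key, else None."""
    # interpolation=None: %(...)s values are left alone, only structure is checked.
    parser = configparser.ConfigParser(strict=True, interpolation=None)
    try:
        parser.read_string(text)
    except configparser.DuplicateOptionError as err:
        return err.section, err.option, err.lineno
    return None
```

Feeding it the contents of `jail.local` after an edit and failing the play when the result is not `None` would catch the problem before fail2ban is restarted.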
### fq-hostname scope
Setting `fq-hostname` in `[DEFAULT]` affects all action templates that use the `<fq-hostname>` tag — including both immediate emails and the digest subject. This is the desired behavior, but be aware that it overrides the system hostname globally within fail2ban.
## Verifying the Setup
```bash
# Check that a non-critical jail bans silently
# (set JAIL to any jail other than sshd/recidive; DEFAULT is not a jail and cannot be queried)
fail2ban-client get "$JAIL" actions
# Should NOT list a 'sendmail' or 'mail' action
# Check that sshd still sends email
fail2ban-client get sshd actions
# Should list 'sendmail-whois-lines' or similar
# Trigger a test digest
/usr/local/bin/fail2ban-digest.sh
# Check cron is installed (run as the user that owns the cron job)
crontab -l | grep fail2ban-digest
```
## See Also
- [fail2ban-apache-404-scanner-jail](fail2ban-apache-404-scanner-jail.md) — custom jail for catching 404 scanners
- [fail2ban-wordpress-login-jail](fail2ban-wordpress-login-jail.md) — WordPress brute-force jail
- [ssh-hardening-ansible-fleet](ssh-hardening-ansible-fleet.md) — fleet SSH hardening
- [clamav-fleet-deployment](clamav-fleet-deployment.md) — another fleet-wide security deployment via Ansible


@ -1,6 +1,6 @@
---
created: 2026-04-02T16:03
updated: 2026-04-18T18:48
updated: 2026-04-21T09:17
---
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)