Add wiki article: Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
New article covering the conversion from per-ban email alerts to a three-tier model (silent default, sshd/recidive immediate, daily digest). Includes Ansible automation, gotchas with lineinfile regex collisions, and fq-hostname override for clean subjects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent f9c61fbac3
commit 46ae9ac97e
2 changed files with 174 additions and 1 deletions
173
02-selfhosting/security/fail2ban-digest-mode-fleet.md
Normal file
@ -0,0 +1,173 @@
---
title: "Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts"
domain: selfhosting
category: security
tags: [fail2ban, security, email, ansible, fleet, cron, digest]
status: published
created: 2026-04-22
updated: 2026-04-22
---
# Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts

## The Problem

When every jail runs `action_mwl`, Fail2Ban sends an email for every single ban, containing the IP address, a whois lookup, and the relevant log lines. On a fleet of 9 servers running 50+ jails, this floods the inbox with hundreds of emails daily. The vast majority are routine scanner bans (probing `.env`, `wp-login.php`, random PHP paths) that require no human attention.

The signal-to-noise ratio is terrible. Genuinely important events (SSH brute-force attempts, recidive escalations) get buried in the noise.

## The Solution: Tiered Alert Model

Three tiers replace the firehose:

| Tier | Jails | Action | Why |
|------|-------|--------|-----|
| **Immediate email** | `sshd`, `recidive` | `action_mwl` | Security-critical: someone is actively targeting auth or is a repeat offender |
| **Silent ban** | Everything else | `action_` (default) | Ban happens, firewall rule applied, no email sent |
| **Daily digest** | All jails | Cron script at 08:00 UTC | One summary email per host with ban counts across all jails |

This reduces email volume from hundreds per day to roughly 10 (one digest per host, plus occasional sshd/recidive alerts).

## jail.local Configuration

### Set silent ban as the default

In `[DEFAULT]`:

```ini
[DEFAULT]
action = %(action_)s
```

This replaces `action_mwl` as the action for every jail that does not set its own. Bans still happen (the firewall rule is applied), but no email is sent.

### Keep immediate alerts for critical jails

```ini
[sshd]
enabled = true
action = %(action_mwl)s

[recidive]
enabled = true
action = %(action_mwl)s
```
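
The way `[DEFAULT]` and per-jail overrides combine can be checked with a small sketch. It uses Python's standard `configparser`, not fail2ban's own loader (which also handles includes and interpolation), so it only mirrors the inheritance rule. The `nginx-botsearch` jail is an example stand-in for any ordinary jail, and `effective_actions` is a hypothetical helper name.

```python
import configparser

# Sample jail.local content; "nginx-botsearch" stands in for any ordinary jail.
JAIL_LOCAL = """
[DEFAULT]
action = %(action_)s

[sshd]
enabled = true
action = %(action_mwl)s

[recidive]
enabled = true
action = %(action_mwl)s

[nginx-botsearch]
enabled = true
"""


def effective_actions(text):
    """Return {jail: raw action value}, with [DEFAULT] filling in unset keys."""
    # interpolation=None keeps %(action_)s as literal text instead of expanding it.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # sections() excludes DEFAULT, so only real jails are listed.
    return {jail: parser.get(jail, "action") for jail in parser.sections()}


if __name__ == "__main__":
    for jail, action in effective_actions(JAIL_LOCAL).items():
        print(f"{jail:16} {action}")
```

Only `sshd` and `recidive` report `%(action_mwl)s`; the jail without its own `action` line falls back to the silent default.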

### Clean up email subjects with fq-hostname

By default, fail2ban uses the system FQDN in email subjects. On Tailscale hosts, this produces ugly subjects like `[Fail2Ban] sshd: banned 1.2.3.4 on MajorToot.tail7f2d9.ts.net`. Override it in `[DEFAULT]`:

```ini
[DEFAULT]
fq-hostname = majortoot
```

This sets the `<fq-hostname>` tag used in action templates, producing cleaner subjects: `[Fail2Ban] sshd: banned 1.2.3.4 on majortoot`.

## Daily Digest Script

A shell script at `/usr/local/bin/fail2ban-digest.sh` runs via cron at 08:00 UTC. It does the following:

1. Queries every active jail via `fail2ban-client status <jail>`
2. Collects four metrics per jail: currently banned, total banned, currently failed, total failed
3. Builds a plain-text email body with one line per jail
4. Separates "active" jails (with bans) from "quiet" jails (zero bans) for quick scanning
5. Sets the subject line to include the total banned count (e.g., `[Fail2Ban Digest] majortoot — 47 total bans`)
6. Sends via `sendmail`

The script is rendered from a template and deployed to each host by Ansible.
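
The parsing and formatting steps above can be sketched in Python. This is an illustration only, not the deployed shell script: the function names are hypothetical, "active" is assumed to mean a non-zero total banned count, the subject total is assumed to be the sum of total bans across jails, and the sample text follows the usual shape of `fail2ban-client status <jail>` output. Sending through `sendmail` is left out.

```python
import re

# Sample text in the shape printed by `fail2ban-client status <jail>`.
SAMPLE_STATUS = """Status for the jail: sshd
|- Filter
|  |- Currently failed:\t1
|  |- Total failed:\t24
|  `- File list:\t/var/log/auth.log
`- Actions
   |- Currently banned:\t2
   |- Total banned:\t7
   `- Banned IP list:\t192.0.2.10 192.0.2.11
"""

METRIC_RE = re.compile(
    r"(Currently failed|Total failed|Currently banned|Total banned):\s+(\d+)"
)


def parse_status(text):
    """Extract the four counters from one jail's status output."""
    return {name: int(value) for name, value in METRIC_RE.findall(text)}


def build_digest(host, stats_by_jail):
    """Return (subject, body); jails with bans are listed before quiet ones."""

    def line(jail):
        s = stats_by_jail[jail]
        return (
            f"{jail}: banned {s['Currently banned']} now / {s['Total banned']} total, "
            f"failed {s['Currently failed']} now / {s['Total failed']} total"
        )

    # Assumption: a jail is "active" when its total banned count is non-zero.
    active = sorted(j for j, s in stats_by_jail.items() if s["Total banned"] > 0)
    quiet = sorted(j for j, s in stats_by_jail.items() if s["Total banned"] == 0)
    total = sum(s["Total banned"] for s in stats_by_jail.values())
    # \u2014 is the dash used in the subject format quoted in the list above.
    subject = f"[Fail2Ban Digest] {host} \u2014 {total} total bans"
    body = "\n".join(
        ["Active jails:", *map(line, active), "", "Quiet jails:", *map(line, quiet)]
    )
    return subject, body


if __name__ == "__main__":
    subject, body = build_digest("majortoot", {"sshd": parse_status(SAMPLE_STATUS)})
    print(subject)
    print(body)
```

Keeping parsing separate from formatting makes it easy to test the digest against captured status output without a running fail2ban server.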

### Cron entry

```
0 8 * * * /usr/local/bin/fail2ban-digest.sh
```

Managed by `ansible.builtin.cron`, so no manual crontab editing is needed. Cron evaluates the schedule in the host's local time zone, so 08:00 UTC assumes the hosts run on UTC.

## Ansible Deployment

The playbook `configure_fail2ban_digest.yml` deploys the full digest model fleet-wide.

### What it does

1. Deploys a Python helper script that performs **section-aware editing** of `jail.local` (see gotchas below)
2. Sets `action = %(action_)s` in `[DEFAULT]`
3. Sets `action = %(action_mwl)s` in `[sshd]` and `[recidive]`
4. Sets `fq-hostname` per host using an override dict
5. Deploys the digest script from a Jinja2 template
6. Creates the cron job via `ansible.builtin.cron`
7. Restarts fail2ban

### Host-specific overrides

Two dictionaries in the playbook vars handle per-host customization:

```yaml
fail2ban_sender_overrides:
  majormail: "fail2ban@majorshouse.com"
  dcaprod: "fail2ban@dcanalysts.net"

fail2ban_hostname_overrides:
  majortoot: "majortoot"
  teelia: "teelia"
  majormail: "majormail"
```

These feed into the Python editor script and the digest template.
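
A minimal sketch of how such override dicts might be resolved per host. The dict contents are copied from the vars above; the function name and both fallbacks (the system FQDN for the hostname, `root@<hostname>` for the sender) are assumptions made for illustration, since the playbook's actual defaults are not shown here.

```python
# Values copied from the playbook vars shown above.
SENDER_OVERRIDES = {
    "majormail": "fail2ban@majorshouse.com",
    "dcaprod": "fail2ban@dcanalysts.net",
}
HOSTNAME_OVERRIDES = {
    "majortoot": "majortoot",
    "teelia": "teelia",
    "majormail": "majormail",
}


def resolve_host_settings(inventory_name, system_fqdn):
    """Pick per-host values, falling back when a host has no override.

    Both fallbacks are assumptions for illustration: the system FQDN for
    the hostname and root@<hostname> for the sender address.
    """
    hostname = HOSTNAME_OVERRIDES.get(inventory_name, system_fqdn)
    sender = SENDER_OVERRIDES.get(inventory_name, f"root@{hostname}")
    return {"fq_hostname": hostname, "sender": sender}


if __name__ == "__main__":
    print(resolve_host_settings("majortoot", "MajorToot.tail7f2d9.ts.net"))
    print(resolve_host_settings("majormail", "majormail.example.net"))
```

Hosts that appear in neither dict keep their defaults, so only the exceptions need to be listed in the vars.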

### Why not lineinfile?

The playbook uses a Python script for `jail.local` editing instead of Ansible's `lineinfile` module. This is deliberate; see the Gotchas section below.

## Gotchas

### lineinfile matches stock action definitions

Using `lineinfile` with a loosely anchored regex like `regexp: '^action'` is dangerous. In a full `jail.local` that includes stock action *definitions* (copied from `jail.conf`), the regex also matches lines like:

```ini
action_mwl = %(action_)s
```

This is the first line of the stock definition of the `action_mwl` macro itself, not the `action =` assignment in `[DEFAULT]`. `lineinfile` rewrites the last matching line in the file regardless of section, so it can replace the wrong line and corrupt the config. Fail2ban then refuses to start because the `action_mwl` macro is undefined.

Tightening the pattern does not fully solve this. `^action\s*=` skips the macro definitions but still matches `action =` in every section, and an unescaped `%(action_mwl)s` is read as a regex group rather than literal parentheses, so it never matches the literal text.

**Solution:** Use a Python script that parses INI sections and only modifies the `action` key within the correct `[section]`. This is what the Ansible playbook does.
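
A minimal sketch of what section-aware editing can look like. This is not the playbook's actual helper; `set_option` and its behavior (replace the first assignment in the target section, drop later duplicates, append when the key is missing) are assumptions chosen to match the description above.

```python
import re

SECTION_RE = re.compile(r"^\[([^\]]+)\]\s*$")


def set_option(text, section, key, value):
    """Set key = value inside one [section], leaving all other lines untouched."""
    # Exact key followed by "=", so "action" never matches "action_mwl".
    key_re = re.compile(rf"^{re.escape(key)}\s*=")
    out, current = [], None
    done = seen_section = skip_cont = False
    for line in text.splitlines():
        # Drop indented continuation lines that belonged to a replaced value.
        if skip_cont and line[:1] in (" ", "\t") and line.strip():
            continue
        skip_cont = False
        header = SECTION_RE.match(line)
        if header:
            if current == section and not done:
                # Leaving the target section without a match: append the key.
                out.append(f"{key} = {value}")
                done = True
            current = header.group(1)
            seen_section = seen_section or current == section
        elif current == section and key_re.match(line):
            if not done:
                out.append(f"{key} = {value}")
                done = True
            skip_cont = True
            continue  # skip the old line and any duplicate assignments
        out.append(line)
    if not done:
        if not seen_section:
            out.append(f"[{section}]")
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


SAMPLE = (
    "[DEFAULT]\n"
    "action_mwl = %(action_)s\n"
    '             %(mta)s-whois-lines[dest="%(destemail)s"]\n'
    "action = %(action_mwl)s\n"
    "action = %(action_mw)s\n"
    "\n"
    "[sshd]\n"
    "enabled = true\n"
)

if __name__ == "__main__":
    step1 = set_option(SAMPLE, "DEFAULT", "action", "%(action_)s")
    print(set_option(step1, "sshd", "action", "%(action_mwl)s"))
```

Because the function tracks the current section and compares the full key name, the `action_mwl` macro definition is left alone, and running it twice produces the same file.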

### Duplicate action lines crash fail2ban

If a section ends up with two `action =` lines (e.g., from a botched `lineinfile` run), fail2ban refuses to load:

```
option 'action' in section 'DEFAULT' already exists
```

The Python editor script handles this by replacing existing keys rather than appending.
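
The error text above has the shape of the duplicate-option error raised by Python's `configparser`. Assuming fail2ban's loader is equally strict, a pre-flight check along these lines could catch the problem before a restart. The helper name and the sample strings are hypothetical.

```python
import configparser


def find_duplicate_error(text):
    """Return the parser's message if text has a duplicate section or option, else None."""
    # strict=True (the Python 3 default) rejects duplicates; interpolation=None
    # leaves values such as %(action_)s unexpanded.
    parser = configparser.ConfigParser(strict=True, interpolation=None)
    try:
        parser.read_string(text)
    except (configparser.DuplicateOptionError, configparser.DuplicateSectionError) as err:
        return str(err)
    return None


BROKEN = "[DEFAULT]\naction = %(action_)s\naction = %(action_mwl)s\n"
CLEAN = "[DEFAULT]\naction = %(action_)s\n"

if __name__ == "__main__":
    print(find_duplicate_error(BROKEN))
    print(find_duplicate_error(CLEAN))
```

Running a check like this against the edited file and aborting on a non-empty result avoids restarting fail2ban into a broken config.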

### fq-hostname scope

Setting `fq-hostname` in `[DEFAULT]` affects all action templates that use the `<fq-hostname>` tag, which covers the immediate emails. The digest script is a standalone cron job rather than a fail2ban action, so its subject takes the short name from the hostname override that feeds the digest template. This is the desired behavior, but be aware that the `[DEFAULT]` setting overrides the system hostname globally within fail2ban.

## Verifying the Setup

```bash
# Run as root. Check that an ordinary jail is silent.
# [DEFAULT] is not a jail, so query any jail other than sshd/recidive.
fail2ban-client get <jail> actions
# Should NOT list a 'sendmail' or 'mail' action

# Check that sshd still sends email
fail2ban-client get sshd actions
# Should list 'sendmail-whois-lines' or similar

# Trigger a test digest
/usr/local/bin/fail2ban-digest.sh

# Check cron is installed
crontab -l | grep fail2ban-digest
```

## See Also

- [fail2ban-apache-404-scanner-jail](fail2ban-apache-404-scanner-jail.md): custom jail for catching 404 scanners
- [fail2ban-wordpress-login-jail](fail2ban-wordpress-login-jail.md): WordPress brute-force jail
- [ssh-hardening-ansible-fleet](ssh-hardening-ansible-fleet.md): fleet SSH hardening
- [clamav-fleet-deployment](clamav-fleet-deployment.md): another fleet-wide security deployment via Ansible
@ -1,6 +1,6 @@
---
created: 2026-04-02T16:03
-updated: 2026-04-18T18:48
+updated: 2026-04-21T09:17
---
* [Home](index.md)
* [Linux & Sysadmin](01-linux/index.md)