majorwiki/02-selfhosting/security/fail2ban-digest-mode-fleet.md
majorlinux 46ae9ac97e Add wiki article: Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
New article covering the conversion from per-ban email alerts to a
three-tier model (silent default, sshd/recidive immediate, daily digest).
Includes Ansible automation, gotchas with lineinfile regex collisions,
and fq-hostname override for clean subjects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-22 09:21:49 -04:00

title: Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts
domain: selfhosting
category: security
tags: fail2ban, security, email, ansible, fleet, cron, digest
status: published
created: 2026-04-22
updated: 2026-04-22

Fail2Ban Digest Mode — Fleet-Wide Quiet Alerts

The Problem

With action_mwl configured as the default action, Fail2Ban sends an email for every single ban: IP address, whois lookup, and relevant log lines. On a fleet of 9 servers running 50+ jails, this floods the inbox with hundreds of emails daily. The vast majority are routine scanner bans (probing .env, wp-login.php, random PHP paths) that require no human attention.

The signal-to-noise ratio is terrible. Genuinely important events — SSH brute-force attempts, recidive escalations — get buried in the noise.

The Solution — Tiered Alert Model

Three tiers replace the firehose:

| Tier | Jails | Action | Why |
|---|---|---|---|
| Immediate email | sshd, recidive | action_mwl | Security-critical: someone is actively targeting auth or is a repeat offender |
| Silent ban | Everything else | action_ (default) | Ban happens, firewall rule applied, no email sent |
| Daily digest | All jails | Cron script at 08:00 UTC | One summary email per host with ban counts across all jails |

This reduces email volume from hundreds per day to ~10 (one digest per host + occasional sshd/recidive alerts).

jail.local Configuration

Set silent ban as the default

In [DEFAULT]:

[DEFAULT]
action = %(action_)s

This replaces action_mwl as the default for every jail that does not set its own action. Bans still happen and the firewall rule is applied, but no email is sent.

Keep immediate alerts for critical jails

[sshd]
enabled = true
action = %(action_mwl)s

[recidive]
enabled = true
action = %(action_mwl)s

Clean up email subjects with fq-hostname

By default, fail2ban uses the system FQDN in email subjects. On Tailscale hosts, this produces ugly subjects like [Fail2Ban] sshd: banned 1.2.3.4 on MajorToot.tail7f2d9.ts.net. Override it in [DEFAULT]:

[DEFAULT]
fq-hostname = majortoot

This sets the <fq-hostname> tag used in action templates, producing cleaner subjects: [Fail2Ban] sshd: banned 1.2.3.4 on majortoot.

Daily Digest Script

A shell script at /usr/local/bin/fail2ban-digest.sh runs via cron at 08:00 UTC. It does the following:

  1. Queries every active jail via fail2ban-client status <jail>
  2. Collects four metrics per jail: currently banned, total banned, currently failed, total failed
  3. Builds a plain-text email body with one line per jail
  4. Separates "active" jails (with bans) from "quiet" jails (zero bans) for quick scanning
  5. Sets the subject line to include the total banned count (e.g., [Fail2Ban Digest] majortoot — 47 total bans)
  6. Sends via sendmail

The script is templated and deployed to each host by Ansible.
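
The parsing and grouping in steps 1 to 4 can be sketched as follows. The deployed script is a shell script; this Python version is only an illustration of the logic. It assumes the usual `fail2ban-client status <jail>` layout with labelled counters such as "Currently banned:". The function names, the line format, treating a jail as "active" when its total banned count is above zero, and summing "Total banned" for the headline figure are all assumptions, not details taken from the real script.

```python
import re

# The four counters printed by `fail2ban-client status <jail>`.
METRICS = ("Currently failed", "Total failed", "Currently banned", "Total banned")

# Illustrative output for one jail (values are made up).
SAMPLE_SSHD = """Status for the jail: sshd
|- Filter
|  |- Currently failed:\t1
|  |- Total failed:\t12
|  `- File list:\t/var/log/auth.log
`- Actions
   |- Currently banned:\t2
   |- Total banned:\t5
   `- Banned IP list:\t192.0.2.10 192.0.2.11
"""


def parse_jail_status(text):
    """Extract the four counters from one jail's status output (0 if missing)."""
    stats = {}
    for name in METRICS:
        match = re.search(rf"{name}:\s*(\d+)", text)
        stats[name] = int(match.group(1)) if match else 0
    return stats


def build_digest(statuses):
    """statuses maps jail name -> raw status text; returns (total_bans, body)."""
    parsed = {jail: parse_jail_status(text) for jail, text in statuses.items()}
    # Assumption: "active" means at least one ban recorded in "Total banned".
    active = {j: s for j, s in parsed.items() if s["Total banned"] > 0}
    quiet = sorted(j for j in parsed if j not in active)
    total = sum(s["Total banned"] for s in active.values())

    lines = ["Active jails:"]
    for jail in sorted(active):
        s = active[jail]
        lines.append(
            f"  {jail}: banned {s['Currently banned']} now / {s['Total banned']} total, "
            f"failed {s['Currently failed']} now / {s['Total failed']} total"
        )
    lines.append("Quiet jails: " + (", ".join(quiet) or "none"))
    return total, "\n".join(lines)
```

Calling `build_digest({"sshd": SAMPLE_SSHD, "nginx-botsearch": ""})` lists sshd under the active jails and nginx-botsearch (a hypothetical jail name) under the quiet ones. On a real host the status text would come from running `fail2ban-client status <jail>` for each jail.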

Cron entry

0 8 * * * /usr/local/bin/fail2ban-digest.sh

Managed by ansible.builtin.cron — no manual crontab editing needed.

Ansible Deployment

The playbook configure_fail2ban_digest.yml deploys the full digest model fleet-wide.

What it does

  1. Deploys a Python helper script that performs section-aware editing of jail.local (see gotchas below)
  2. Sets action = %(action_)s in [DEFAULT]
  3. Sets action = %(action_mwl)s in [sshd] and [recidive]
  4. Sets fq-hostname per host using an override dict
  5. Deploys the digest script from a Jinja2 template
  6. Creates the cron job via ansible.builtin.cron
  7. Restarts fail2ban

Host-specific overrides

Two dictionaries in the playbook vars handle per-host customization:

fail2ban_sender_overrides:
  majormail: "fail2ban@majorshouse.com"
  dcaprod: "fail2ban@dcanalysts.net"

fail2ban_hostname_overrides:
  majortoot: "majortoot"
  teelia: "teelia"
  majormail: "majormail"

These feed into the Python editor script and the digest template.
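
As a rough illustration of how the hostname overrides might feed the editor, the sketch below turns steps 2 to 4 of the playbook into a list of (section, key, value) edits for one host. The function name and the tuple format are hypothetical, and it assumes that a host without an entry in the dictionary simply gets no fq-hostname edit. The sender overrides are left out because the article does not say which setting they map to.

```python
# Hostname overrides copied from the playbook vars shown above.
HOSTNAME_OVERRIDES = {
    "majortoot": "majortoot",
    "teelia": "teelia",
    "majormail": "majormail",
}


def planned_edits(host, hostname_overrides=HOSTNAME_OVERRIDES):
    """Return the (section, key, value) edits for one host's jail.local."""
    edits = [
        ("DEFAULT", "action", "%(action_)s"),      # silent ban by default
        ("sshd", "action", "%(action_mwl)s"),      # immediate email
        ("recidive", "action", "%(action_mwl)s"),  # immediate email
    ]
    # Assumption: hosts without an override keep fail2ban's own <fq-hostname>.
    if host in hostname_overrides:
        edits.append(("DEFAULT", "fq-hostname", hostname_overrides[host]))
    return edits
```

For example, `planned_edits("teelia")` ends with the fq-hostname edit, while `planned_edits("dcaprod")` contains only the three action edits.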

Why not lineinfile?

The playbook uses a Python script for jail.local editing instead of Ansible's lineinfile module. This is deliberate — see the gotchas section below.

Gotchas

lineinfile matches stock action definitions

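Using lineinfile with a regex like regexp: '^action\s*=\s*%(action_mwl)s' is dangerous. lineinfile works on single lines and has no notion of INI sections, and the unescaped parentheses in that pattern are parsed as a regex group rather than as literal characters, so the pattern does not select the line it appears to. Loosening it to match anything that starts with action makes matters worse: in a full jail.local that includes stock action definitions (copied from jail.conf), such a pattern also matches lines like: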

action_mwl = %(action_mwl)s

This is the stock definition of the action_mwl macro itself, not the action = assignment in [DEFAULT]. If lineinfile rewrites that line, the config is corrupted and fail2ban refuses to start because the action_mwl macro is no longer defined.

Solution: Use a Python script that parses INI sections and only modifies the action key within the correct [section]. This is what the Ansible playbook does.
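
A minimal sketch of what such a section-aware edit could look like is shown below. It is not the playbook's actual helper: the function name `set_option` and the sample text are hypothetical, and the macro line in the sample is a simplified stand-in for a stock definition. The sketch works line by line so that comments and unrelated keys are left alone, replaces the first matching key inside the target section, drops any duplicates (and their indented continuation lines), and appends the key if the section lacks it.

```python
import re

SECTION_RE = re.compile(r"^\[([^\]]+)\]\s*$")

SAMPLE = (
    "[DEFAULT]\n"
    "action_mwl = %(action_)s\n"   # simplified stand-in for a stock macro
    "action = %(action_mwl)s\n"
    "action = %(action_mwl)s\n"    # duplicate left behind by a botched edit
    "\n"
    "[sshd]\n"
    "enabled = true\n"
)


def set_option(text, section, key, value):
    """Set `key = value` inside [section] only, replacing existing entries."""
    key_re = re.compile(rf"^{re.escape(key)}\s*=")
    new_line = f"{key} = {value}"
    out, current, written, skipping = [], None, False, False
    for line in text.splitlines():
        header = SECTION_RE.match(line)
        if header:
            # Leaving the target section without having seen the key: add it.
            if current == section and not written:
                out.append(new_line)
                written = True
            current = header.group(1)
            skipping = False
        elif current == section and key_re.match(line):
            # The first hit is replaced; later duplicates are dropped.
            if not written:
                out.append(new_line)
                written = True
            skipping = True  # also drop continuation lines of the old value
            continue
        elif skipping and line[:1] in (" ", "\t") and line.strip():
            continue
        else:
            skipping = False
        out.append(line)
    if current == section and not written:
        out.append(new_line)
        written = True
    if not written:
        # Section not present at all: create it at the end of the file.
        out.extend([f"[{section}]", new_line])
    return "\n".join(out) + "\n"
```

With the sample above, `set_option(SAMPLE, "DEFAULT", "action", "%(action_)s")` leaves the `action_mwl = ...` macro line untouched and collapses the two `action =` lines into one. A line-based approach is used here instead of `configparser` because writing a file back through `configparser` discards comments.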

Duplicate action lines crash fail2ban

If a section ends up with two action = lines (e.g., from a botched lineinfile run), fail2ban refuses to load:

option 'action' in section 'DEFAULT' already exists

The Python editor script handles this by replacing existing keys rather than appending.

fq-hostname scope

Setting fq-hostname in [DEFAULT] affects all action templates that use the <fq-hostname> tag, so every immediate email picks it up. The digest subject gets the same name through the per-host override that feeds the digest template. This is the desired behavior, but be aware that it overrides the system hostname globally within fail2ban.

Verifying the Setup

# Check that a non-critical jail is silent
# (replace <jail> with any jail other than sshd or recidive)
fail2ban-client get <jail> actions
# Should NOT contain 'sendmail' or 'mail'

# Check that sshd still sends email
fail2ban-client get sshd actions
# Should contain 'sendmail-whois-lines' or similar

# Trigger a test digest
/usr/local/bin/fail2ban-digest.sh

# Check cron is installed
crontab -l | grep fail2ban-digest
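
The per-jail checks above can also be run across every jail at once. The sketch below is one hedged way to do that: `audit` takes a mapping of jail name to action names and reports non-critical jails that would still send mail, plus critical jails that would not. The helper names are hypothetical, the "contains mail" test mirrors the manual check above, and `jail_actions` assumes that the last output line of `fail2ban-client get <jail> actions` is a comma-separated list of action names.

```python
import subprocess

ALERTING = {"sshd", "recidive"}  # jails the article keeps on action_mwl


def mail_actions(action_names):
    """Return the action names that look like mail actions."""
    return [name for name in action_names if "mail" in name]


def audit(actions_by_jail):
    """Return (noisy, muted) lists of jail names.

    noisy: non-critical jails that still have a mail action.
    muted: critical jails that have no mail action.
    """
    noisy = sorted(
        jail for jail, actions in actions_by_jail.items()
        if jail not in ALERTING and mail_actions(actions)
    )
    muted = sorted(
        jail for jail, actions in actions_by_jail.items()
        if jail in ALERTING and not mail_actions(actions)
    )
    return noisy, muted


def jail_actions(jail):
    """Ask fail2ban for a jail's actions (output format is an assumption)."""
    out = subprocess.run(
        ["fail2ban-client", "get", jail, "actions"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [name.strip() for name in out.strip().splitlines()[-1].split(",")]
```

On a correctly converted host, `audit` should return two empty lists when fed the result of `jail_actions` for every active jail.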

See Also