From 110a6d49e5ffceb8e0d9c085a4a0405a278cdbcf Mon Sep 17 00:00:00 2001
From: Marcus Summers
Date: Thu, 4 Jun 2026 16:16:29 -0400
Subject: [PATCH] wiki: add inbound spam filtering guide (spamass-milter + SpamAssassin Bayes)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

New 02-selfhosting/services article: the full Postfix/Dovecot inbound
spam stack on Fedora — spamass-milter tag-only wiring (the -r footgun),
socket permissions (sa-milt group + UMask), site-wide Bayes DB,
Sieve-to-Junk, and sa-learn training (folders, spam/ham balance,
manual-not-cron). From the majormail setup. Also extends
selinux-dovecot-vmail-context with a Permissive-mode variant + a
postfix_cleanup->mysqld_etc companion-denial note. SUMMARY.md nav
updated.
---
 ...stfix-spamassassin-bayes-spam-filtering.md | 171 ++++++++++++++++++
 .../selinux-dovecot-vmail-context.md          |  23 +++
 SUMMARY.md                                    |   1 +
 3 files changed, 195 insertions(+)
 create mode 100644 02-selfhosting/services/postfix-spamassassin-bayes-spam-filtering.md

diff --git a/02-selfhosting/services/postfix-spamassassin-bayes-spam-filtering.md b/02-selfhosting/services/postfix-spamassassin-bayes-spam-filtering.md
new file mode 100644
index 0000000..81ecb79
--- /dev/null
+++ b/02-selfhosting/services/postfix-spamassassin-bayes-spam-filtering.md
@@ -0,0 +1,171 @@
+---
+title: "Inbound Spam Filtering: spamass-milter + SpamAssassin Bayes on Postfix/Dovecot (Fedora)"
+domain: selfhosting
+category: services
+tags: [postfix, dovecot, spamassassin, spamass-milter, bayes, spam, sieve, fedora, email, selinux]
+status: published
+created: 2026-06-04
+updated: 2026-06-04
+---
+# Inbound Spam Filtering: spamass-milter + SpamAssassin Bayes on Postfix/Dovecot
+
+How to add inbound spam scanning to a Postfix/Dovecot virtual-mailbox server on Fedora: SpamAssassin scans every inbound message via `spamass-milter`, spam is **tagged (never rejected)**, Dovecot's Sieve files it into the user's `Junk` folder, and a **site-wide Bayes database** — shared between the scan path and manual `sa-learn` training — learns from your real mail.
+
+This is a "tag and quarantine" design (not "reject at SMTP"), which is the safe default: a misfire lands a message in Junk for review rather than bouncing legitimate mail.
+
+## Architecture
+
+```
+inbound SMTP (25) ─► Postfix smtpd
+                        │  smtpd_milters:
+                        │    1. OpenDKIM (verify/sign)
+                        │    2. spamass-milter ─► spamc ─► spamd (SpamAssassin)
+                        │         adds X-Spam-Flag / X-Spam-Status headers
+                        ▼
+                  Dovecot LMTP delivery ─► global Sieve
+                        if X-Spam-Flag: YES ─► fileinto "Junk"
+                        else ─► INBOX
+
+Bayes DB  /var/lib/spamassassin/bayes/  (site-wide, shared)
+   ├─ spamd     auto-learns at scan time
+   └─ sa-learn  manual/scripted training from Maildir folders
+```
+
+## 1. Install
+
+```bash
+sudo dnf install spamassassin spamass-milter
+sudo systemctl enable --now spamassassin # spamd
+```
+
+On Fedora the `spamass-milter` unit runs as the unprivileged **`sa-milt`** user and creates its socket at `/run/spamass-milter/spamass-milter.sock`. Remember that user — the Bayes DB ownership and the socket permissions both hinge on it.
+
+## 2. Configure spamass-milter — tag-only
+
+Edit `/etc/sysconfig/spamass-milter`:
+
+```sh
+EXTRA_FLAGS="-a -r 999999"
+```
+
+> [!warning] The `-r` flag is a footgun
+> `-r nn` rejects mail scoring ≥ `nn` at SMTP time. **Omitting `-r` does NOT mean "never reject"** — this build still rejects flagged spam at a low default threshold (a GTUBE test will get `550 Blocked by SpamAssassin`). To get pure tag-only behaviour, set the threshold absurdly high (`-r 999999`) so nothing ever reaches it. Do **not** use `-r -1` — that means "reject anything tagged as spam."
+
+- `-a` — skip messages on **authenticated** connections, so your own outbound/submission mail isn't scanned or tagged.
+
+## 3. Socket permissions (so Postfix can connect)
+
+The socket is created `0770 sa-milt:sa-milt` only if you widen the unit's umask; by default it's `0755` and Postfix (running as `postfix`) can't write to it. Two steps:
+
+```bash
+# 1. Let the socket be group-accessible
+sudo install -d /etc/systemd/system/spamass-milter.service.d
+printf '[Service]\nUMask=0007\n' | sudo tee /etc/systemd/system/spamass-milter.service.d/socket-perms.conf
+
+# 2. Put postfix in the sa-milt group, then RESTART postfix (group is read at start)
+sudo usermod -aG sa-milt postfix
+
+sudo systemctl daemon-reload
+sudo systemctl enable --now spamass-milter
+```
+
+Verify: `sudo -u postfix test -w /run/spamass-milter/spamass-milter.sock && echo OK`.
+
+## 4. Wire into Postfix
+
+Append the milter **alongside** OpenDKIM — don't replace it. Inbound (`smtpd`) gets both; local-injected mail (`non_smtpd`) stays DKIM-only.
+
+```bash
+postconf -e 'smtpd_milters = local:/run/opendkim/opendkim.sock unix:/run/spamass-milter/spamass-milter.sock'
+postconf -e 'milter_default_action = accept' # if SA is down, accept the mail — never defer/bounce
+sudo systemctl restart postfix # restart (not reload) to pick up the new group
+```
+
+`milter_default_action = accept` is important: if the milter ever hiccups, mail still flows.
+
+## 5. Site-wide Bayes DB
+
+Put the Bayes DB in one fixed location so the scan path and your training script share it. In `/etc/mail/spamassassin/local.cf`:
+
+```
+use_bayes 1
+bayes_auto_learn 1
+bayes_path /var/lib/spamassassin/bayes/bayes
+bayes_file_mode 0660
+```
+
+Create the directory owned by the **scanning user** (`sa-milt`), under `/var/lib/spamassassin` so it inherits the correct SELinux type (`spamd_var_lib_t`):
+
+```bash
+sudo install -d -m 2770 -o sa-milt -g sa-milt /var/lib/spamassassin/bayes
+sudo restorecon -Rv /var/lib/spamassassin/bayes
+sudo systemctl restart spamassassin
+```
+
+The `2770` setgid + `bayes_file_mode 0660` means whether the DB is written by `spamd` (as `sa-milt`) or by `sa-learn` (as `root`, from a training script), all parties can read and write it.
+
+## 6. File spam into Junk (Dovecot Sieve)
+
+A global Sieve before-script files anything SpamAssassin flagged. `/etc/dovecot/sieve/global/spam-to-junk.sieve`:
+
+```sieve
+require ["fileinto", "mailbox"];
+if anyof (header :contains "X-Spam-Flag" "YES", header :contains "X-Spam-Status" "Yes") {
+    fileinto :create "Junk";
+    stop;
+}
+```
+
+Register it as a global script in `dovecot.conf` (e.g. `sieve_before = /etc/dovecot/sieve/global/spam-to-junk.sieve`) and restart Dovecot.
+
+## 7. Training the Bayes filter
+
+SpamAssassin's Bayes only starts scoring once it has learned **≥ 200 spam AND ≥ 200 ham** (`bayes_min_spam_num` / `bayes_min_ham_num`). Train from your Maildir folders with `sa-learn`. **Run it as `root`** — root can read every user's Maildir *and* write the Bayes DB.
+
+```bash
+# Spam — your Junk folder(s) and any dedicated spam mailbox
+sa-learn --spam /var/vmail/example.com/user/.Junk/{cur,new}
+
+# Ham — Sent + Inbox (known-good)
+sa-learn --ham /var/vmail/example.com/user/{cur,new}
+sa-learn --ham /var/vmail/example.com/user/.Sent/{cur,new}
+
+sa-learn --sync
+sa-learn --dump magic | grep -E 'nspam|nham'
+```
+
+`bayes_path` is read from `local.cf`, so no `--dbpath` is needed.
+
+> [!tip] Keep spam and ham roughly balanced
+> Bayes accuracy drops when one corpus dwarfs the other (aim for within ~3:1). Don't dump a 90,000-message archive of ham against a few hundred spam — it biases everything toward "ham" and spam slips through. Use Sent + recent Inbox for ham, not your entire archive.
+
+> [!warning] Train manually, not from cron — unless your folders are always clean
+> `sa-learn` learns whatever is *in* the folder. If a spam slips into the Inbox, or you haven't yet rescued a false-positive out of Junk, an unattended cron run will mislearn it. Prefer a manual script you run **after** triaging Junk/Inbox. (`sa-learn` is idempotent and re-classifies on re-run, so a mistake is fixable: move the message to the right folder and run again.)
+
+## 8. Test
+
+Send a [GTUBE](https://spamassassin.apache.org/gtube/) probe through port 25 (unauthenticated) and a normal message:
+
+```bash
+# from a host that can reach :25 — GTUBE scores ~1000
+printf 'Subject: gtube\n\nXJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X\n' \
+  | sendmail -f test@example.org user@example.com
+```
+
+Confirm in `/var/log/maillog` that `spamd` scanned it (`result: Y …`), the message was **delivered** (no `milter-reject`), it landed in `.Junk`, and the stored message has `X-Spam-Flag: YES`.
+
+## Gotchas recap
+
+| Symptom | Cause | Fix |
+|---|---|---|
+| Spam gets `550 Blocked by SpamAssassin` (you wanted Junk) | spamass-milter rejects at a default threshold | `-r 999999` for tag-only |
+| Postfix can't reach the milter socket | socket `0755`, postfix not in `sa-milt` group | `UMask=0007` drop-in + `usermod -aG sa-milt postfix` + restart postfix |
+| `sa-learn` trains but `spamd` doesn't use it | per-user vs site Bayes mismatch | set `bayes_path` in `local.cf` (site-wide) |
+| Bayes never scores (`BAYES_*` absent) | below the 200/200 learn floor | train more, keep spam/ham balanced |
+| Your own outbound mail gets tagged | scanning authenticated mail | `-a` flag |
+| AVC denials on the Bayes DB (SELinux) | DB outside `/var/lib/spamassassin` | keep it under that path (`spamd_var_lib_t`) + `restorecon` |
+
+## See also
+
+- [[selinux-dovecot-vmail-context|SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)]]
+- [[linux-server-hardening-checklist|Linux Server Hardening Checklist]] (basic `sa-learn` section)
diff --git a/05-troubleshooting/selinux-dovecot-vmail-context.md b/05-troubleshooting/selinux-dovecot-vmail-context.md
index fdfe379..86389c9 100644
--- a/05-troubleshooting/selinux-dovecot-vmail-context.md
+++ b/05-troubleshooting/selinux-dovecot-vmail-context.md
@@ -98,6 +98,29 @@ ausearch -m avc -ts recent | grep dovecot
 
 No output = no new denials.
 
+## Variant: a Freshly-Rebuilt Box Left in Permissive Mode
+
+If a server was rebuilt or migrated and came up **Permissive** (check `getenforce`), the symptom flips: mail works fine, but `/var/log/audit/audit.log` quietly fills with thousands of `dovecot_t → var_t` denials that *would* break IMAP/LMTP the instant you switch to Enforcing. The mailstore was created or `rsync`'d onto `/var/vmail` with no fcontext rule, so it defaulted to `var_t`.
+
+Apply the relabel above first, then flip to Enforcing **only after** verifying zero new denials:
+
+```bash
+MARK=$(date +%H:%M:%S)
+# ...deliver a test message + do an IMAP login...
+ausearch -m avc -ts "$MARK" | grep -c denied # expect 0
+setenforce 1
+sed -i 's/^SELINUX=permissive/SELINUX=enforcing/' /etc/selinux/config
+```
+
+**Companion denial:** a Postfix virtual-mailbox server that looks up recipients in MySQL also trips `postfix_cleanup_t` reading `/etc/my.cnf*` (`mysqld_etc_t`). Allow it with a small local module:
+
+```bash
+ausearch -m avc -c cleanup | audit2allow -M local_postfix_mysql
+semodule -i local_postfix_mysql.pp
+```
+
+See also [[postfix-spamassassin-bayes-spam-filtering|Inbound Spam Filtering]] — the SpamAssassin Bayes DB belongs under `/var/lib/spamassassin` (`spamd_var_lib_t`) for the same labeling reason.
+
 ## Key Notes
 
 - **One rule is enough** — `"/var/vmail(/.*)?"` with `mail_spool_t` covers every file and directory under `/var/vmail`, including all `tmp/` subdirectories.
diff --git a/SUMMARY.md b/SUMMARY.md
index a762b89..8730d92 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -42,6 +42,7 @@ updated: 2026-05-15T09:00
   * [Mastodon — The `--prune-profiles` Trap and How to Recover](02-selfhosting/services/mastodon-prune-profiles-trap.md)
   * [Mastodon on S3 — Silent Upload Failures (BucketOwnerEnforced/ACLs)](02-selfhosting/services/mastodon-s3-acl-upload-failures.md)
   * [Ghost Email Configuration with Mailgun](02-selfhosting/services/ghost-smtp-mailgun-setup.md)
+  * [Inbound Spam Filtering: spamass-milter + SpamAssassin Bayes](02-selfhosting/services/postfix-spamassassin-bayes-spam-filtering.md)
   * [Claude Code Remote Control — Mobile Access to a Persistent Host Session](02-selfhosting/services/claude-code-remote-control.md)
   * [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md)
   * [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md)