wiki: add inbound spam filtering guide (spamass-milter + SpamAssassin Bayes)
New 02-selfhosting/services article: the full Postfix/Dovecot inbound spam stack on Fedora — spamass-milter tag-only wiring (the -r footgun), socket permissions (sa-milt group + UMask), site-wide Bayes DB, Sieve-to-Junk, and sa-learn training (folders, spam/ham balance, manual-not-cron). From the majormail setup. Also extends selinux-dovecot-vmail-context with a Permissive-mode variant + a postfix_cleanup->mysqld_etc companion-denial note. SUMMARY.md nav updated.
parent
e6a249403c
commit
110a6d49e5
3 changed files with 195 additions and 0 deletions
|
|
@@ -0,0 +1,171 @@
|
|||
---
|
||||
title: "Inbound Spam Filtering: spamass-milter + SpamAssassin Bayes on Postfix/Dovecot (Fedora)"
|
||||
domain: selfhosting
|
||||
category: services
|
||||
tags: [postfix, dovecot, spamassassin, spamass-milter, bayes, spam, sieve, fedora, email, selinux]
|
||||
status: published
|
||||
created: 2026-06-04
|
||||
updated: 2026-06-04
|
||||
---
|
||||
# Inbound Spam Filtering: spamass-milter + SpamAssassin Bayes on Postfix/Dovecot
|
||||
|
||||
How to add inbound spam scanning to a Postfix/Dovecot virtual-mailbox server on Fedora: SpamAssassin scans every inbound message via `spamass-milter`, spam is **tagged (never rejected)**, Dovecot's Sieve files it into the user's `Junk` folder, and a **site-wide Bayes database** — shared between the scan path and manual `sa-learn` training — learns from your real mail.
|
||||
|
||||
This is a "tag and quarantine" design (not "reject at SMTP"), which is the safe default: a misfire lands a message in Junk for review rather than bouncing legitimate mail.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
inbound SMTP (25) ─► Postfix smtpd
|
||||
│ smtpd_milters:
|
||||
│ 1. OpenDKIM (verify/sign)
|
||||
│ 2. spamass-milter ─► spamc ─► spamd (SpamAssassin)
|
||||
│ adds X-Spam-Flag / X-Spam-Status headers
|
||||
▼
|
||||
Dovecot LMTP delivery ─► global Sieve
|
||||
if X-Spam-Flag: YES ─► fileinto "Junk"
|
||||
else ─► INBOX
|
||||
|
||||
Bayes DB /var/lib/spamassassin/bayes/ (site-wide, shared)
|
||||
├─ spamd auto-learns at scan time
|
||||
└─ sa-learn manual/scripted training from Maildir folders
|
||||
```
|
||||
|
||||
## 1. Install
|
||||
|
||||
```bash
|
||||
sudo dnf install spamassassin spamass-milter
|
||||
sudo systemctl enable --now spamassassin # spamd
|
||||
```
|
||||
|
||||
On Fedora the `spamass-milter` unit runs as the unprivileged **`sa-milt`** user and creates its socket at `/run/spamass-milter/spamass-milter.sock`. Remember that user — the Bayes DB ownership and the socket permissions both hinge on it.
|
||||
|
||||
## 2. Configure spamass-milter — tag-only
|
||||
|
||||
Edit `/etc/sysconfig/spamass-milter`:
|
||||
|
||||
```sh
|
||||
EXTRA_FLAGS="-a -r 999999"
|
||||
```
|
||||
|
||||
> [!warning] The `-r` flag is a footgun
|
||||
> `-r nn` rejects mail scoring ≥ `nn` at SMTP time. **Omitting `-r` does NOT mean "never reject"** — this build still rejects flagged spam at a low default threshold (a GTUBE test will get `550 Blocked by SpamAssassin`). To get pure tag-only behaviour, set the threshold absurdly high (`-r 999999`) so nothing ever reaches it. Do **not** use `-r -1` — that means "reject anything tagged as spam."
|
||||
|
||||
- `-a` — skip messages on **authenticated** connections, so your own outbound/submission mail isn't scanned or tagged.
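As a quick sanity check on the flags described above, the following sketch classifies the `-r` value in an `EXTRA_FLAGS` string. `reject_mode` is a hypothetical helper, not part of spamass-milter, and the 1000 cut-off is an assumption chosen only because GTUBE scores about 1000.

```bash
# Hypothetical helper: report what the -r value in a flags string implies.
# Runs in a subshell "( ... )" so its variables do not leak into the caller.
reject_mode() (
    prev="" threshold=""
    for tok in $1; do
        [ "$prev" = "-r" ] && threshold="$tok"
        prev="$tok"
    done
    case "$threshold" in
        "")       echo "default-threshold" ;;   # no -r: the build's default applies
        -1)       echo "reject-all-tagged" ;;   # -r -1 rejects anything tagged as spam
        *[!0-9]*) echo "unparsable" ;;
        *)  # Assumption: anything above GTUBE's ~1000 score counts as tag-only.
            if [ "$threshold" -gt 1000 ]; then echo "tag-only"
            else echo "rejects-at-$threshold"; fi ;;
    esac
)

# Usage on the server (the sysconfig file is shell-sourceable):
#   . /etc/sysconfig/spamass-milter && reject_mode "$EXTRA_FLAGS"
```

Anything other than `tag-only` means some mail can still be refused at SMTP time.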
|
||||
|
||||
## 3. Socket permissions (so Postfix can connect)
|
||||
|
||||
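By default the socket is created `0755 sa-milt:sa-milt`, and Postfix (running as `postfix`) can't connect, because connecting to a Unix socket requires write permission on it. Relaxing the unit's umask makes the socket `0770`, and adding `postfix` to the `sa-milt` group then grants it access. Two steps: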
|
||||
|
||||
```bash
|
||||
# 1. Let the socket be group-accessible
|
||||
sudo install -d /etc/systemd/system/spamass-milter.service.d
|
||||
printf '[Service]\nUMask=0007\n' | sudo tee /etc/systemd/system/spamass-milter.service.d/socket-perms.conf
|
||||
|
||||
# 2. Put postfix in the sa-milt group (Postfix reads its groups at start, so it is restarted in step 4)
|
||||
sudo usermod -aG sa-milt postfix
|
||||
|
||||
sudo systemctl daemon-reload
|
||||
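sudo systemctl enable spamass-milter && sudo systemctl restart spamass-milter  # restart so the UMask drop-in applies even if the unit was already running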
|
||||
```
|
||||
|
||||
Verify: `sudo -u postfix test -w /run/spamass-milter/spamass-milter.sock && echo OK`.
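The two modes mentioned above follow from how a Unix socket's permissions are derived on Linux: `0777 & ~umask`. A minimal sketch of that arithmetic (`socket_mode` is a hypothetical helper name):

```bash
# Mode a freshly bound Unix socket gets under a given umask: 0777 & ~umask.
socket_mode() {
    printf '%04o\n' "$(( 0777 & ~$1 ))"
}

socket_mode 0022   # default umask         -> 0755
socket_mode 0007   # UMask=0007 drop-in    -> 0770

# To compare against the live socket:
#   stat -c '%a %U:%G' /run/spamass-milter/spamass-milter.sock
```

If `stat` still reports `755` after adding the drop-in, the unit has most likely not been restarted since the change.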
|
||||
|
||||
## 4. Wire into Postfix
|
||||
|
||||
Append the milter **alongside** OpenDKIM — don't replace it. Inbound (`smtpd`) gets both; local-injected mail (`non_smtpd`) stays DKIM-only.
|
||||
|
||||
```bash
|
||||
sudo postconf -e 'smtpd_milters = local:/run/opendkim/opendkim.sock unix:/run/spamass-milter/spamass-milter.sock'
|
||||
sudo postconf -e 'milter_default_action = accept'   # if SA is down, accept the mail; never defer/bounce
|
||||
sudo systemctl restart postfix # restart (not reload) to pick up the new group
|
||||
```
|
||||
|
||||
`milter_default_action = accept` is important: if the milter ever hiccups, mail still flows.
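To confirm that both milters ended up in the setting, one option is to inspect the value Postfix actually holds. The sketch below assumes the socket paths used in this guide; `has_milter` is a hypothetical helper, and it does not handle Postfix's `{ ... }` per-milter option syntax.

```bash
# Hypothetical helper: succeed if a smtpd_milters value lists the given
# socket path with either the unix: or local: prefix.
has_milter() (
    path="$2"
    # Entries may be separated by whitespace or commas.
    for entry in $(printf '%s' "$1" | tr ',' ' '); do
        case "$entry" in
            unix:"$path"|local:"$path") exit 0 ;;
        esac
    done
    exit 1
)

# Usage on the server:
#   milters=$(postconf -h smtpd_milters)
#   has_milter "$milters" /run/opendkim/opendkim.sock              || echo "OpenDKIM missing"
#   has_milter "$milters" /run/spamass-milter/spamass-milter.sock  || echo "spamass-milter missing"
```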
|
||||
|
||||
## 5. Site-wide Bayes DB
|
||||
|
||||
Put the Bayes DB in one fixed location so the scan path and your training script share it. In `/etc/mail/spamassassin/local.cf`:
|
||||
|
||||
```
|
||||
use_bayes 1
|
||||
bayes_auto_learn 1
|
||||
bayes_path /var/lib/spamassassin/bayes/bayes
|
||||
bayes_file_mode 0660
|
||||
```
|
||||
|
||||
Create the directory owned by the **scanning user** (`sa-milt`), under `/var/lib/spamassassin` so it inherits the correct SELinux type (`spamd_var_lib_t`):
|
||||
|
||||
```bash
|
||||
sudo install -d -m 2770 -o sa-milt -g sa-milt /var/lib/spamassassin/bayes
|
||||
sudo restorecon -Rv /var/lib/spamassassin/bayes
|
||||
sudo systemctl restart spamassassin
|
||||
```
|
||||
|
||||
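The setgid bit in `2770` makes every file created in the directory inherit the `sa-milt` group, and `bayes_file_mode 0660` keeps those files group-writable. So whether the DB is written by `spamd` (as `sa-milt`) or by `sa-learn` (as `root`, from a training script), both can read and write it.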
|
||||
|
||||
## 6. File spam into Junk (Dovecot Sieve)
|
||||
|
||||
A global Sieve before-script files anything SpamAssassin flagged. `/etc/dovecot/sieve/global/spam-to-junk.sieve`:
|
||||
|
||||
```sieve
|
||||
require ["fileinto", "mailbox"];
|
||||
if anyof (header :contains "X-Spam-Flag" "YES", header :contains "X-Spam-Status" "Yes") {
|
||||
fileinto :create "Junk";
|
||||
stop;
|
||||
}
|
||||
```
|
||||
|
||||
Register it as a global script in `dovecot.conf` (e.g. `sieve_before = /etc/dovecot/sieve/global/spam-to-junk.sieve`) and restart Dovecot.
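Before relying on the rule, it can help to predict where a saved message would land. The sketch below is only a rough shell approximation of the Sieve test above, not the Sieve engine itself: `predict_folder` is a hypothetical helper, it matches case-insensitively like `:contains`, and it ignores folded header continuation lines. Pigeonhole's own `sieve-test` tool runs the real script and is the more faithful check.

```bash
# Hypothetical helper: approximate the spam-to-junk rule for one message file.
predict_folder() {
    awk '
        /^$/ { exit }                                    # blank line ends the headers
        tolower($0) ~ /^x-spam-flag:.*yes/   { hit = 1 }
        tolower($0) ~ /^x-spam-status:.*yes/ { hit = 1 }
        END { print (hit ? "Junk" : "INBOX") }
    ' "$1"
}

# Usage (the file name is a placeholder for any message in a Maildir):
#   predict_folder /path/to/message
```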
|
||||
|
||||
## 7. Training the Bayes filter
|
||||
|
||||
SpamAssassin's Bayes only starts scoring once it has learned **≥ 200 spam AND ≥ 200 ham** (`bayes_min_spam_num` / `bayes_min_ham_num`). Train from your Maildir folders with `sa-learn`. **Run it as `root`** — root can read every user's Maildir *and* write the Bayes DB.
|
||||
|
||||
```bash
|
||||
# Spam — your Junk folder(s) and any dedicated spam mailbox
|
||||
sa-learn --spam /var/vmail/example.com/user/.Junk/{cur,new}
|
||||
|
||||
# Ham — Sent + Inbox (known-good)
|
||||
sa-learn --ham /var/vmail/example.com/user/{cur,new}
|
||||
sa-learn --ham /var/vmail/example.com/user/.Sent/{cur,new}
|
||||
|
||||
sa-learn --sync
|
||||
sa-learn --dump magic | grep -E 'nspam|nham'
|
||||
```
|
||||
|
||||
`bayes_path` is read from `local.cf`, so no `--dbpath` is needed.
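The `nspam`/`nham` counters printed by `sa-learn --dump magic` can be checked against the 200/200 floor and a rough 3:1 balance in one step. This is a sketch: `bayes_status` is a hypothetical helper, and it assumes the usual dump layout in which the count is the third column and the counter name is the last field.

```bash
# Hypothetical helper: summarise Bayes readiness from "sa-learn --dump magic"
# output read on stdin.
bayes_status() {
    awk '
        $NF == "nspam" { spam = $3 }
        $NF == "nham"  { ham  = $3 }
        END {
            spam += 0; ham += 0
            state = (spam >= 200 && ham >= 200) ? "active" : "below-floor"
            lo = (spam < ham) ? spam : ham
            hi = (spam < ham) ? ham  : spam
            balance = (lo > 0 && hi <= 3 * lo) ? "balanced" : "skewed"
            printf "nspam=%d nham=%d bayes=%s corpus=%s\n", spam, ham, state, balance
        }'
}

# Usage on the server:
#   sudo sa-learn --dump magic | bayes_status
```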
|
||||
|
||||
> [!tip] Keep spam and ham roughly balanced
|
||||
> Bayes accuracy drops when one corpus dwarfs the other (aim for within ~3:1). Don't dump a 90,000-message archive of ham against a few hundred spam — it biases everything toward "ham" and spam slips through. Use Sent + recent Inbox for ham, not your entire archive.
|
||||
|
||||
> [!warning] Train manually, not from cron — unless your folders are always clean
|
||||
> `sa-learn` learns whatever is *in* the folder. If a spam message slips into the Inbox, or you haven't yet rescued a false positive from Junk, an unattended cron run will mislearn it. Prefer a manual script you run **after** triaging Junk/Inbox. (`sa-learn` is idempotent and re-classifies on re-run, so a mistake is fixable: move the message to the right folder and run again.)
|
||||
|
||||
## 8. Test
|
||||
|
||||
Send a [GTUBE](https://spamassassin.apache.org/gtube/) probe through port 25 (unauthenticated) and a normal message:
|
||||
|
||||
```bash
|
||||
# Run this from a *different* host whose MTA delivers to your server over :25; injecting locally with sendmail skips smtpd_milters. GTUBE scores ~1000.
|
||||
printf 'Subject: gtube\n\nXJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X\n' \
|
||||
| sendmail -f test@example.org user@example.com
|
||||
```
|
||||
|
||||
Confirm in `/var/log/maillog` that `spamd` scanned it (`result: Y …`), the message was **delivered** (no `milter-reject`), it landed in `.Junk`, and the stored message has `X-Spam-Flag: YES`.
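The log check can be reduced to a single verdict. The sketch below only looks for the two markers named above (`result: Y` and `milter-reject`) in whatever log lines it is given; `gtube_verdict` is a hypothetical helper, and it does not correlate lines by queue ID, so feed it only the lines from the test delivery.

```bash
# Hypothetical helper: classify the outcome of a GTUBE test from log lines on stdin.
gtube_verdict() {
    awk '
        /milter-reject/ { rejected = 1 }   # Postfix refused the message at SMTP time
        /result: Y/     { tagged = 1 }     # spamd scanned it and judged it spam
        END {
            if (rejected)    print "rejected"
            else if (tagged) print "tagged"
            else             print "not-scanned"
        }'
}

# Usage on the server, right after sending the probe:
#   sudo tail -n 50 /var/log/maillog | gtube_verdict
```

`tagged` is the outcome this guide aims for; `rejected` points back at the `-r` setting, and `not-scanned` suggests the message never passed through the milter.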
|
||||
|
||||
## Gotchas recap
|
||||
|
||||
| Symptom | Cause | Fix |
|
||||
|---|---|---|
|
||||
| Spam gets `550 Blocked by SpamAssassin` (you wanted Junk) | spamass-milter rejects at a default threshold | `-r 999999` for tag-only |
|
||||
| Postfix can't reach the milter socket | socket `0755`, postfix not in `sa-milt` group | `UMask=0007` drop-in + `usermod -aG sa-milt postfix` + restart postfix |
|
||||
| `sa-learn` trains but `spamd` doesn't use it | per-user vs site Bayes mismatch | set `bayes_path` in `local.cf` (site-wide) |
|
||||
| Bayes never scores (`BAYES_*` absent) | below the 200/200 learn floor | train more, keep spam/ham balanced |
|
||||
| Your own outbound mail gets tagged | scanning authenticated mail | `-a` flag |
|
||||
| AVC denials on the Bayes DB (SELinux) | DB outside `/var/lib/spamassassin` | keep it under that path (`spamd_var_lib_t`) + `restorecon` |
|
||||
|
||||
## See also
|
||||
|
||||
- [[selinux-dovecot-vmail-context|SELinux: Fixing Dovecot Mail Spool Context (/var/vmail)]]
|
||||
- [[linux-server-hardening-checklist|Linux Server Hardening Checklist]] (basic `sa-learn` section)
|
||||
|
|
@@ -98,6 +98,29 @@ ausearch -m avc -ts recent | grep dovecot
|
|||
|
||||
No output = no new denials.
|
||||
|
||||
## Variant: a Freshly-Rebuilt Box Left in Permissive Mode
|
||||
|
||||
If a server was rebuilt or migrated and came up **Permissive** (check `getenforce`), the symptom flips: mail works fine, but `/var/log/audit/audit.log` quietly fills with thousands of `dovecot_t → var_t` denials that *would* break IMAP/LMTP the instant you switch to Enforcing. The mailstore was created or `rsync`'d onto `/var/vmail` with no fcontext rule, so it defaulted to `var_t`.
|
||||
|
||||
Apply the relabel above first, then flip to Enforcing **only after** verifying zero new denials:
|
||||
|
||||
```bash
|
||||
MARK=$(date +%H:%M:%S)
|
||||
# ...deliver a test message + do an IMAP login...
|
||||
ausearch -m avc -ts "$MARK" | grep -c denied # expect 0
|
||||
setenforce 1
|
||||
sed -i 's/^SELINUX=permissive/SELINUX=enforcing/' /etc/selinux/config
|
||||
```
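The block above runs `setenforce 1` regardless of what the count shows. A guarded variant might look like the sketch below, where `no_new_denials` is a hypothetical helper that succeeds only when its input contains no `denied` lines.

```bash
# Hypothetical helper: succeed only if stdin contains no AVC "denied" lines.
no_new_denials() (
    n=$(grep -c 'denied' || true)   # grep exits 1 on zero matches but still prints 0
    [ "$n" -eq 0 ]
)

# Usage (as root), with MARK set as in the block above:
#   if ausearch -m avc -ts "$MARK" 2>/dev/null | no_new_denials; then
#       setenforce 1
#       sed -i 's/^SELINUX=permissive/SELINUX=enforcing/' /etc/selinux/config
#   else
#       echo "new denials since $MARK; staying Permissive" >&2
#   fi
```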
|
||||
|
||||
**Companion denial:** a Postfix virtual-mailbox server that looks up recipients in MySQL also trips `postfix_cleanup_t` reading `/etc/my.cnf*` (`mysqld_etc_t`). Allow it with a small local module:
|
||||
|
||||
```bash
|
||||
ausearch -m avc -c cleanup | audit2allow -M local_postfix_mysql
|
||||
semodule -i local_postfix_mysql.pp
|
||||
```
|
||||
|
||||
See also [[postfix-spamassassin-bayes-spam-filtering|Inbound Spam Filtering]] — the SpamAssassin Bayes DB belongs under `/var/lib/spamassassin` (`spamd_var_lib_t`) for the same labeling reason.
|
||||
|
||||
## Key Notes
|
||||
|
||||
- **One rule is enough** — `"/var/vmail(/.*)?"` with `mail_spool_t` covers every file and directory under `/var/vmail`, including all `tmp/` subdirectories.
|
||||
|
|
|
|||
|
|
@@ -42,6 +42,7 @@ updated: 2026-05-15T09:00
|
|||
* [Mastodon — The `--prune-profiles` Trap and How to Recover](02-selfhosting/services/mastodon-prune-profiles-trap.md)
|
||||
* [Mastodon on S3 — Silent Upload Failures (BucketOwnerEnforced/ACLs)](02-selfhosting/services/mastodon-s3-acl-upload-failures.md)
|
||||
* [Ghost Email Configuration with Mailgun](02-selfhosting/services/ghost-smtp-mailgun-setup.md)
|
||||
* [Inbound Spam Filtering: spamass-milter + SpamAssassin Bayes](02-selfhosting/services/postfix-spamassassin-bayes-spam-filtering.md)
|
||||
* [Claude Code Remote Control — Mobile Access to a Persistent Host Session](02-selfhosting/services/claude-code-remote-control.md)
|
||||
* [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md)
|
||||
* [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md)
|
||||
|
|
|
|||