wiki: spam filtering — add Pigeonhole 2.4 syntax, REDIRECT-to-junk pattern, weekly timer
Three updates to the inbound spam filtering guide, all driven by the 2026-06-04
majormail-hetzner Phase 6 cutover and follow-up tuning:
1. Section 6 (Dovecot Sieve): warn explicitly that `plugin/sieve_before` was
dropped in Pigeonhole 2.4 and silently does nothing — no startup warning,
spam just keeps landing in INBOX. The 2.4 replacement is a top-level
`sieve_script <name> { type = before; path = …; }` block. Also note the
Fedora-flat-dovecot.conf pitfall (some packagings ship dovecot.conf
without `!include conf.d/*.conf`, so the block has to live in the main
file directly). Added a `sievec` compile step.
2. New §6b: route spam to a separate `junk@` mailbox via Postfix cleanup
`header_checks` REDIRECT. This makes spam invisible to the user's
mailbox entirely — Spark/IDLE-based clients don't push-notify because
the message never reaches the subscribed mailbox at all. Includes the
`regexp:` vs `pcre:` map-type tip (use regexp on stock Fedora to avoid
the postfix-pcre package dependency).
3. New §7a: weekly systemd timer for sa-learn. The §7 warning about
"don't run sa-learn from cron unless folders are clean" is correct as
the safe default — but when you adopt the §6b REDIRECT-to-junk@
pattern, the junk@ mailbox is pure spam by design and a weekly
`--spam`/`--ham`/`--sync`/`--force-expire` chain becomes safe and
useful. Full unit templates included.
Gotchas table gains four entries:
- Pigeonhole 2.4 silent breakage of plugin/sieve_before
- postfix-pcre vs regexp map type confusion
- Why sieve fileinto Junk still pushes a Spark notification
- Why local `sendmail` injection doesn't trigger the REDIRECT (smtpd
milters skip sendmail-injected mail, so X-Spam-Flag isn't added)
All changes match what's now codified in the `majormail` Ansible role
(commit 7a8b9eb in MajorAnsible).
parent 2e58c4625c
commit 5260548caa
1 changed file with 107 additions and 2 deletions
@@ -5,7 +5,7 @@ category: services
tags: [postfix, dovecot, spamassassin, spamass-milter, bayes, spam, sieve, fedora, email, selinux]
status: published
created: 2026-06-04
updated: 2026-06-04
updated: 2026-06-05
---
# Inbound Spam Filtering: spamass-milter + SpamAssassin Bayes on Postfix/Dovecot

@@ -116,7 +116,59 @@ if anyof (header :contains "X-Spam-Flag" "YES", header :contains "X-Spam-Status"
}
```

Register it as a global script in `dovecot.conf` (e.g. `sieve_before = /etc/dovecot/sieve/global/spam-to-junk.sieve`) and restart Dovecot.
Register it as a global before-script in `dovecot.conf` (NOT under `plugin {}` on Pigeonhole 2.4+ — see warning below), then compile and restart Dovecot:

```bash
sievec /etc/dovecot/sieve/global/spam-to-junk.sieve # produces .svbin
systemctl restart dovecot
```

> [!warning] Pigeonhole 2.4 dropped `plugin/sieve_before` — it silently does nothing
> Before Dovecot/Pigeonhole 2.4, the canonical way to register a global before-script was:
>
> ```
> plugin {
>   sieve_before = /etc/dovecot/sieve/global/spam-to-junk.sieve
> }
> ```
>
> On **Dovecot 2.4+**, that setting is gone and **silently ignored**: there is no warning at start-up, the script never runs, and mail tagged with `X-Spam-Flag` lands in INBOX with nothing to file it. The 2.4 replacement is a top-level `sieve_script` block (not inside `plugin {}`):
>
> ```
> sieve_script spam_before {
>   type = before
>   path = /etc/dovecot/sieve/global/spam-to-junk.sieve
> }
> ```
>
> Verify with `doveconf -n | grep -A2 spam_before`. If the block doesn't appear, Dovecot isn't reading the file that contains it: check that `!include conf.d/*.conf` exists in `dovecot.conf`. Some Fedora rebuilds ship a flat `dovecot.conf` without it, in which case the block has to live in `dovecot.conf` directly.

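Both failure modes in the warning above can be spotted with a plain-text scan of the config file. The following is a minimal sketch, not part of the `majormail` role: `check_sieve_before` is a hypothetical helper name, and it only greps the file it is given, so it cannot replace `doveconf -n` as the real verification.

```bash
#!/usr/bin/env bash
# Hypothetical helper: plain-text checks on a single Dovecot config file.
# It does not parse the config; `doveconf -n` remains the real verification.
check_sieve_before() {
    local conf="$1"

    # Legacy setting that the warning describes as ignored on 2.4+.
    if grep -Eq '^[[:space:]]*sieve_before[[:space:]]*=' "$conf"; then
        echo "legacy: sieve_before setting present"
    fi

    # 2.4-style named sieve_script block (the type line is not checked).
    if grep -Eq '^[[:space:]]*sieve_script[[:space:]]+[^[:space:]]+[[:space:]]*\{' "$conf"; then
        echo "ok: sieve_script block present"
    else
        echo "missing: no sieve_script block in this file"
    fi

    # A flat config without this include never reads conf.d/.
    if grep -Eq '^[[:space:]]*!include[[:space:]]+conf\.d/\*\.conf' "$conf"; then
        echo "ok: conf.d is included"
    else
        echo "flat: conf.d/*.conf is not included"
    fi
}
```

Calling `check_sieve_before /etc/dovecot/dovecot.conf` prints one line per finding. A "missing" result combined with "ok: conf.d is included" just means the block may live in a file under `conf.d/`, which this sketch does not follow.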
## 6b. (Optional) Route spam to a separate mailbox — silence iOS push notifications

`fileinto :create "Junk"` moves spam to the user's `.Junk` folder, but the user's IMAP session still sees a new-message event in INBOX (briefly, before sieve moves it) or in Junk (depending on client subscriptions). For clients with IMAP IDLE + push, that's a notification you don't want — e.g. Spark on iPhone/iPad fires APNs on any new message touching a subscribed folder.

To make spam **invisible to the user's mailbox entirely**, REDIRECT the envelope at Postfix `cleanup` (after the milter adds `X-Spam-Flag`, before LMTP delivery) so spam lands in a separate `junk@` mailbox the user doesn't subscribe to:

```bash
# /etc/postfix/cleanup_header_checks
/^X-Spam-Flag:[[:space:]]+YES/ REDIRECT junk@example.com
```

```bash
postconf -e 'header_checks = regexp:/etc/postfix/cleanup_header_checks'
systemctl reload postfix
```

> [!tip] Use `regexp:`, not `pcre:`, on stock Fedora
> `pcre:` requires the `postfix-pcre` package. `regexp:` is built into Postfix and supports POSIX extended regex: use `[[:space:]]+` for whitespace and `\\` for a literal backslash. The patterns in `cleanup_header_checks` are simple enough that `regexp:` is plenty.

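Before reloading Postfix, the pattern can be sanity-checked against sample header lines. The sketch below uses `grep -E -i` as a stand-in for the `regexp:` table (POSIX extended regex, with case-insensitive matching assumed to mirror the table's default); `would_redirect` is a hypothetical name and the output strings are illustrative. On a host with Postfix installed, `postmap -q 'X-Spam-Flag: YES' regexp:/etc/postfix/cleanup_header_checks` queries the real map instead.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the regexp: table lookup, using grep's POSIX ERE.
# Same pattern as cleanup_header_checks, without the surrounding slashes.
PATTERN='^X-Spam-Flag:[[:space:]]+YES'

would_redirect() {
    # -i is an assumption that mirrors the table's case-insensitive default.
    if printf '%s\n' "$1" | grep -Eiq "$PATTERN"; then
        echo "REDIRECT"
    else
        echo "no-match"
    fi
}

would_redirect 'X-Spam-Flag: YES'   # matches the pattern
would_redirect 'X-Spam-Flag: NO'    # does not match
would_redirect 'X-Spam-Flag:YES'    # does not match: no whitespace after the colon
```

The last call shows a limit of the pattern as written: `[[:space:]]+` requires at least one whitespace character after the colon, so a header emitted without one would not be redirected.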
The Sieve from §6 still runs as a safety net for any tagged message that escapes the cleanup REDIRECT (e.g. a message addressed to the junk@ mailbox itself, or aliases not covered by the REDIRECT rule). Defense in depth.

Train Bayes from the `junk@` Maildir instead of (or in addition to) per-user Junk folders:

```bash
sa-learn --spam /var/vmail/example.com/junk/{cur,new}
```

## 7. Training the Bayes filter

@@ -142,6 +194,55 @@ sa-learn --dump magic | grep -E 'nspam|nham'
> [!warning] Train manually, not from cron — unless your folders are always clean
> `sa-learn` learns whatever is *in* the folder. If a spam slips into the Inbox, or you haven't yet rescued a false-positive out of Junk, an unattended cron run will mislearn it. Prefer a manual script you run **after** triaging Junk/Inbox. (`sa-learn` is idempotent and re-classifies on re-run, so a mistake is fixable: move the message to the right folder and run again.)

### 7a. Weekly systemd timer (safe when junk@ is dedicated and INBOX is curated)

The warning above is the safe default. If you use the §6b REDIRECT-to-junk@ pattern, **the junk mailbox is pure spam by design** (only `X-Spam-Flag:YES` envelopes reach it), and your INBOX is curated by hand — the misclassification risk drops to near zero, and a weekly timer becomes both safe and useful. Add `--force-expire` to age out stale tokens so the Bayes corpus doesn't drift.

```ini
# /etc/systemd/system/sa-learn-majormail.service
[Unit]
Description=SpamAssassin Bayes training from majorshouse.com Maildir
After=spamassassin.service
Wants=spamassassin.service

[Service]
Type=oneshot
Nice=10
IOSchedulingClass=idle
ExecStart=/usr/bin/sa-learn --spam --no-sync \
    /var/vmail/example.com/junk/cur \
    /var/vmail/example.com/junk/new
ExecStart=/usr/bin/sa-learn --ham --no-sync \
    /var/vmail/example.com/user/cur \
    /var/vmail/example.com/user/new \
    /var/vmail/example.com/user/.Sent/cur \
    /var/vmail/example.com/user/.Sent/new
ExecStart=/usr/bin/sa-learn --sync
ExecStart=/usr/bin/sa-learn --force-expire
```

```ini
# /etc/systemd/system/sa-learn-majormail.timer
[Unit]
Description=Weekly SpamAssassin Bayes training + expiry

[Timer]
OnCalendar=Sun 04:15
Persistent=true
RandomizedDelaySec=20min

[Install]
WantedBy=timers.target
```

```bash
systemctl daemon-reload
systemctl enable --now sa-learn-majormail.timer
systemctl list-timers sa-learn-majormail.timer
```

`Persistent=true` runs a missed job at the next boot if the host was off at 04:15. `--force-expire` forces an expiry attempt on every run, but an attempt doesn't guarantee that any tokens are removed; on a database well under `bayes_expiry_max_db_size` the step typically expires nothing.

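Because Bayes stays silent below the 200/200 learn floor listed in the gotchas table, it can help to count what the timer will actually feed `sa-learn`. This is a minimal sketch, assuming the Maildir layout from the unit above; `count_maildir` and `check_floor` are hypothetical helper names, and nothing here calls `sa-learn` itself.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: count messages in the Maildir folders that
# the weekly timer trains from. It only counts files; it never runs sa-learn.
FLOOR=200   # learn floor from the gotchas table (200 spam / 200 ham)

count_maildir() {
    # Count regular files in cur/ and new/ under the given Maildir.
    local dir="$1" n=0 sub
    for sub in cur new; do
        if [ -d "$dir/$sub" ]; then
            n=$(( n + $(find "$dir/$sub" -maxdepth 1 -type f | wc -l) ))
        fi
    done
    echo "$n"
}

check_floor() {
    # Print the count for one folder and flag it when under the floor.
    local label="$1" dir="$2" n
    n="$(count_maildir "$dir")"
    if [ "$n" -lt "$FLOOR" ]; then
        echo "$label: $n messages (below floor of $FLOOR)"
    else
        echo "$label: $n messages"
    fi
}
```

Typical use would be `check_floor spam /var/vmail/example.com/junk` and `check_floor ham /var/vmail/example.com/user`. The file count is only a rough proxy: the authoritative numbers are `nspam` and `nham` from `sa-learn --dump magic`.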
## 8. Test

Send a [GTUBE](https://spamassassin.apache.org/gtube/) probe through port 25 (unauthenticated) and a normal message:
@@ -164,6 +265,10 @@ Confirm in `/var/log/maillog` that `spamd` scanned it (`result: Y …`), the mes
| Bayes never scores (`BAYES_*` absent) | below the 200/200 learn floor | train more, keep spam/ham balanced |
| Your own outbound mail gets tagged | scanning authenticated mail | `-a` flag |
| AVC denials on the Bayes DB (SELinux) | DB outside `/var/lib/spamassassin` | keep it under that path (`spamd_var_lib_t`) + `restorecon` |
| `plugin/sieve_before` does nothing — spam keeps reaching INBOX | Pigeonhole 2.4 silently dropped that setting | use the top-level `sieve_script <name> { type = before; path = ...; }` block instead |
| `postfix reload` fails: `unsupported dictionary type: pcre` | `pcre:` map requires `postfix-pcre` package | install it, OR use `regexp:` (built-in POSIX) |
| Sieve `fileinto Junk` still notifies Spark/iOS | client subscribes to Junk; LMTP delivery briefly hits INBOX | REDIRECT envelope at Postfix cleanup (§6b) so the message never reaches the user's mailbox at all |
| Local `sendmail` test doesn't trigger REDIRECT | `sendmail` bypasses smtpd milters → no `X-Spam-Flag` added | inject through SMTP :25 (e.g. swaks) OR pre-set the header in the test message |

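For the last row, pre-setting the header can be as simple as generating the test message from a small function. A minimal sketch under stated assumptions: `make_test_message` is a hypothetical helper, and the subject and body text are placeholders rather than anything the guide prescribes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a minimal test message with X-Spam-Flag pre-set,
# for the local-injection case where no milter adds the header.
make_test_message() {
    local from="$1" to="$2"
    printf 'From: %s\n' "$from"
    printf 'To: %s\n' "$to"
    printf 'Subject: REDIRECT test\n'
    printf 'X-Spam-Flag: YES\n'
    printf '\n'   # blank line separates headers from body
    printf 'Test body for the cleanup REDIRECT rule.\n'
}
```

Piping it into local injection, for example `make_test_message sender@example.com user@example.com | sendmail -t`, should then exercise the REDIRECT rule without going through the milter; the message is expected to show up in the `junk@` Maildir rather than the user's.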
## See also