majorwiki/05-troubleshooting/ansible-check-mode-false-positives.md
majorlinux 4126656c05 wiki: update fail2ban digest + netdata docker health + 3 new articles
- fail2ban-digest-mode-fleet: recidive-only email model, sshd now silent,
  defaults-debian.conf gotcha added
- netdata-docker-health-alarm-tuning: 30m/10m config, tuning history table
- New: wp-fail2ban-logpath-debian-ubuntu, lora-adapter-gguf-conversion-fails,
  tailscale-status-json-hostname-localhost-ios
- Various article updates and nav index refreshes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-02 14:58:07 -04:00

154 lines
5.9 KiB
Markdown

---
title: Ansible Check Mode False Positives in Verify/Assert Tasks
domain: selfhosting
category: troubleshooting
tags:
- ansible
- check-mode
- dry-run
- assert
- handlers
- troubleshooting
status: published
created: 2026-04-18
updated: 2026-04-30T05:21
---
# Ansible Check Mode False Positives in Verify/Assert Tasks
## The Problem
`ansible-playbook --check` (dry-run mode) reports failures on verify and assert tasks that depend on handler-triggered side effects — even when the playbook is correct and would succeed on a real run.
**Symptom:** Running `--check` produces errors like:
```
TASK [Assert hardened settings are active] ***
fatal: [host]: FAILED! => {
"assertion": "'permitrootlogin without-password' in sshd_effective.stdout",
"msg": "One or more SSH hardening settings not effective"
}
```
But a real run (`ansible-playbook` without `--check`) succeeds cleanly.
## Why It Happens
In check mode, Ansible simulates tasks but **does not execute handlers**. This means:
1. A config file task reports `changed` (it would deploy the file)
2. The handler (`reload sshd`, `reload firewalld`, etc.) is **never fired**
3. A subsequent verify task runs `sshd -T` or `ufw status verbose` against the **current live state** (pre-change)
4. The assert compares the current state against the expected post-change state and fails
The verify task is reading reality accurately — the change hasn't happened yet — but the failure is misleading. It suggests the playbook is broken when it's actually correct.
## The Fix
Guard verify and assert tasks that depend on handler side effects with `when: not ansible_check_mode`:
```yaml
- name: Verify effective SSH settings post-reload
ansible.builtin.command:
cmd: sshd -T
register: sshd_effective
changed_when: false
when: not ansible_check_mode # sshd hasn't reloaded in check mode
- name: Assert hardened settings are active
ansible.builtin.assert:
that:
- "'permitrootlogin without-password' in sshd_effective.stdout"
- "'x11forwarding no' in sshd_effective.stdout"
fail_msg: "SSH hardening settings not effective — check for conflicting config"
when: not ansible_check_mode # result would be pre-change state
```
This skips the verify/assert during check mode (where they'd produce false failures) while keeping them active on real runs (where they catch actual misconfigurations).
## When to Apply This Guard
Apply `when: not ansible_check_mode` to any task that:
- Reads the **active/effective state** of a service after a config change (`sshd -T`, `ufw status verbose`, `firewall-cmd --list-all`, `nginx -T`)
- **Asserts** that the post-change state matches expectations
- Depends on a **handler** having fired first (service reload, daemon restart)
Don't apply it to tasks that check pre-existing state (e.g., verifying a file exists before modifying it) — those are valid in check mode.
## Common Patterns
### SSH daemon verify
```yaml
- name: Verify effective sshd settings
ansible.builtin.command: sshd -T
register: sshd_out
changed_when: false
when: not ansible_check_mode
- name: Assert sshd hardening active
ansible.builtin.assert:
that:
- "'maxauthtries 3' in sshd_out.stdout"
when: not ansible_check_mode
```
### UFW status verify
```yaml
- name: Show UFW status
ansible.builtin.command: ufw status verbose
register: ufw_status
changed_when: false
when: not ansible_check_mode
- name: Confirm default deny incoming
ansible.builtin.assert:
that:
- "'Default: deny (incoming)' in ufw_status.stdout"
when: not ansible_check_mode
```
### nginx config verify
```yaml
- name: Test nginx config
ansible.builtin.command: nginx -t
changed_when: false
when: not ansible_check_mode
```
## Related pattern: `command` / `shell` skipped in check mode
The inverse problem shows up when a playbook registers output from `ansible.builtin.command` (or `shell`) and uses that output in a downstream conditional or `set_fact`. In check mode, `command` is **skipped by default** — the registered variable comes back with no `stdout`, and any `in` / containment check against it silently evaluates to False.
Saw this 2026-04-19 while writing `fix_ebtables_usrmerge.yml`. The play queried `update-alternatives --display ebtables` to detect whether a host ran the `nft` or `legacy` backend, then branched on that fact. Under `--check`, the query was skipped, the fact defaulted to `legacy` on every host, and the next task's existence check failed on the nft hosts (`/usr/bin/ebtables-legacy not found`). A real run was fine — but `--check` output looked like the playbook was broken.
**Fix:** force the detection task to run even in check mode, since it's a read-only query with no side effects.
```yaml
- name: Query current ebtables alternative
ansible.builtin.command: update-alternatives --display ebtables
register: alt_query
changed_when: false
failed_when: false
check_mode: false # force execution in --check so downstream conditionals see real data
```
Apply this to any `command`/`shell` task whose output feeds a `when:`, `set_fact`, or similar logic. Only safe when the task is genuinely read-only.
## Trade-off
Guarding with `when: not ansible_check_mode` means check mode won't validate these assertions. The benefit — no false failures — outweighs the gap because:
- Check mode is showing you what *would* change, not whether the result is valid
- Real runs still assert and will catch actual misconfigurations
- The alternative (failing check runs) erodes trust in `--check` output
If you need to verify the effective post-change state in check mode, consider splitting the playbook into a deploy pass and a separate verify-only playbook run without `--check`.
## See Also
- [ssh-hardening-ansible-fleet](../02-selfhosting/security/ssh-hardening-ansible-fleet.md)
- [ufw-firewall-management](../02-selfhosting/security/ufw-firewall-management.md)
- [ansible-getting-started](../01-linux/shell-scripting/ansible-getting-started.md)