majorwiki/05-troubleshooting/ansible-regex-search-set-fact-capture-group.md
Marcus Summers ca123b0312 wiki: add troubleshooting article — Ansible regex_search capture group fails in set_fact
Documents the gotcha hit during the 2026-05-06 update.yml refactor:
the second-positional-argument back-reference form of regex_search
('\1') doesn't reliably select capture groups when used inside
set_fact. The fix is to match the broader substring and use
.split()[0] (or [-1], etc.) to peel off the value, with a default()
bridge for the no-match case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 08:28:21 -04:00

title: Ansible regex_search — capture-group argument doesn't work in set_fact
domain: troubleshooting
category: general
tags: ansible, jinja, regex, set_fact, gotcha
status: published
created: 2026-05-06
updated: 2026-05-06

Ansible regex_search — capture-group argument doesn't work in set_fact

Problem

You want to extract a number from a registered command's stdout — e.g. the package count from a dnf or apt upgrade — and stash it in a fact. The natural-looking regex_search('pattern', '\1') form fails or produces an empty string when used inside set_fact:

- name: Capture package count                # ❌ does not behave as expected
  ansible.builtin.set_fact:
    pkg_count: "{{ apt_upgrade_result.stdout | regex_search('([0-9]+) upgraded', '\\1') }}"

You'll see one of:

  • An empty pkg_count (the filter ran but the back-reference returned nothing in this context)
  • A Jinja error about argument arity if the syntax is slightly off
  • The whole matched substring instead of just the captured group

Root cause

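In set_fact templating, the second-positional-argument form of regex_search (the back-reference '\1' you've seen in tutorials) doesn't reliably hand back a bare captured string. Two documented behaviors of the filter contribute. First, when a back-reference argument is given, regex_search returns a list of the requested groups (for example ['7']) rather than a string, and it returns None when nothing matches, which typically renders as an empty value. Second, the backslash in '\1' passes through YAML quoting and then Jinja's own string parsing before the filter sees it, so how many backslashes survive depends on how the surrounding value is quoted. That is why the same snippet can appear to work in one place (e.g. a vars: block) and misbehave inside set_fact, and why "copy this snippet from the docs" fails intermittently. With no extra argument the filter simply returns the full match as a string, which is the form the fix relies on.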
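
To see what the filter actually hands back in a given context, it can help to print the type of each result. The task below is a minimal diagnostic sketch, not part of update.yml: the sample variable is a hypothetical stand-in for real stdout, and type_debug is the stock Ansible filter that reports a value's type.

```yaml
- name: Inspect what regex_search returns in each form (diagnostic sketch)
  ansible.builtin.debug:
    msg:
      # No extra argument: the full match
      - "full match: {{ sample | regex_search('[0-9]+ upgraded') | type_debug }}"
      # Back-reference argument: the requested capture group(s)
      - "with group arg: {{ sample | regex_search('([0-9]+) upgraded', '\\1') | type_debug }}"
      # Pattern that cannot match: the miss case
      - "no match: {{ sample | regex_search('[0-9]+ removed') | type_debug }}"
  vars:
    sample: "7 upgraded, 0 newly installed"   # hypothetical sample text
```

Per the filter's documentation, the back-reference form should report a list and the bare form a string. If the back-reference line raises an error instead, backslash escaping in that quoting style is the likely suspect.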

Fix — match the broader pattern, then split

Stop fighting the back-reference. Use regex_search to grab a string that contains the value you want, then peel it apart with plain Python string ops:

- name: Capture package count                # ✅ works in set_fact
  ansible.builtin.set_fact:
    pkg_count: "{{ (apt_upgrade_result.stdout | regex_search('[0-9]+ upgraded') | default('0', true)).split()[0] }}"

What this does:

  1. regex_search('[0-9]+ upgraded') returns the matching substring (e.g. "7 upgraded") or None on no match.
  2. default('0', true) turns the None case into the string "0" so the next step always has something to operate on. The second argument matters: without it, default only replaces undefined values, and None counts as defined.
  3. .split()[0] splits on whitespace and keeps the first token, which is the number.

The result ("7") is a string; cast with | int if you need arithmetic.
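
A short sketch of the cast in use, assuming the same apt_upgrade_result variable as above. The default filter is called with true as its second argument so that a None result from regex_search is replaced as well, not only an undefined one. The cast is repeated at the point of comparison because, depending on the templating mode, a fact stored by set_fact may still come back as a string.

```yaml
- name: Capture package count
  ansible.builtin.set_fact:
    pkg_count: "{{ (apt_upgrade_result.stdout | regex_search('[0-9]+ upgraded') | default('0', true)).split()[0] | int }}"

- name: Report only when something was upgraded
  ansible.builtin.debug:
    msg: "{{ pkg_count }} pkg(s) upgraded"
  # Cast again here; do not rely on the stored fact being an integer
  when: pkg_count | int > 0
```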

Where this comes up in MajorAnsible

The update.yml executive-summary task uses this pattern to pull package counts out of apt_upgrade_result.stdout and dnf_upgrade_result.stdout so each host can print one tidy line:

majorhome: 7 pkg(s) upgraded | No reboot needed | 2 active screen(s)
majormail: 14 pkg(s) upgraded | REBOOT REQUIRED | Snapshot taken
majorlab:  0 pkg(s) upgraded | No reboot needed

The summary line is built with a Jinja parts array joined with ' | ' so segments that don't apply (no snapshot, no screens) drop out cleanly without leaving trailing separators.
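
The parts-array approach could look roughly like the sketch below. It is an illustration rather than the actual update.yml task: reboot_required, screen_count, and snapshot_taken are hypothetical variable names, and pkg_count is assumed to have been set by the task shown earlier.

```yaml
- name: Print the per-host summary line (sketch)
  ansible.builtin.debug:
    msg: "{{ inventory_hostname }}: {{ parts | join(' | ') }}"
  vars:
    # Each optional segment contributes either a one-element list or an
    # empty list, so absent segments never leave a dangling separator.
    parts: >-
      {{
        [pkg_count ~ ' pkg(s) upgraded']
        + [('REBOOT REQUIRED' if reboot_required | default(false) | bool else 'No reboot needed')]
        + ([screen_count ~ ' active screen(s)'] if screen_count | default(0) | int > 0 else [])
        + (['Snapshot taken'] if snapshot_taken | default(false) | bool else [])
      }}
```

Concatenating lists and joining once at the end keeps the separator logic in a single place, which is simpler than stitching strings together with conditional ' | ' fragments.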

Quick checks if this still misbehaves

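  • Confirm the source variable. A registered command or shell result carries both result.stdout (a single string) and result.stdout_lines (a list of lines); the regex_search filter wants a string, not a list. Use .stdout (or .stdout_lines | join('\n') if all you have is the list).
  • Escape your backslashes. In double-quoted YAML strings, \d needs to be written \\d. YAML single-quoted and block scalars take backslashes literally, so \d can stay as written there. Character classes such as [0-9] sidestep the question entirely.
  • Always provide a default. regex_search returns None on a miss, which will explode .split()[0]. Because None counts as defined, a plain default('0') passes it through unchanged; write | default('0', true) so that None is replaced too. The bridge is mandatory in production playbooks where some hosts will legitimately have zero upgrades.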
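
The first and last checks can also be expressed as guard tasks that run before the parsing step. A minimal sketch, assuming the registered variable is named apt_upgrade_result as in the examples above; the task names and messages are illustrative only.

```yaml
- name: Verify the source is a string before parsing (sketch)
  ansible.builtin.assert:
    that:
      - apt_upgrade_result.stdout is defined
      - apt_upgrade_result.stdout is string
    fail_msg: "apt_upgrade_result.stdout is missing or not a string"

- name: Note when the expected summary text is absent
  ansible.builtin.debug:
    msg: "No 'N upgraded' text found in apt output; pkg_count falls back to 0"
  # The search test is true when the pattern matches anywhere in the string
  when: apt_upgrade_result.stdout is not search('[0-9]+ upgraded')
```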