| title | domain | category | tags | status | created | updated |
|---|---|---|---|---|---|---|
| Ansible: Ubuntu Reboot Detection Misses Kernel Upgrades | troubleshooting | ansible | ansible, ubuntu, kernel, reboot, needrestart, apt | published | 2026-05-19 | 2026-05-19 |

Ansible: Ubuntu Reboot Detection Misses Kernel Upgrades

Problem

When update.yml runs across the Ubuntu fleet after a kernel package has been upgraded, the executive summary reports No reboot needed even though a reboot is genuinely required. Running uname -r on the host confirms that it is still on the old kernel.

Example: majortoot had linux-image-6.8.0-117-generic installed on May 16 after a Tailscale update triggered needrestart, but the playbook kept reporting clean.

Root Cause

The standard check for Ubuntu reboot state is:

- name: Check if a reboot is required for Ubuntu servers
  ansible.builtin.stat:
    path: /var/run/reboot-required
  register: ubuntu_reboot_flag

/var/run/reboot-required is written by update-notifier-common's notify-reboot-required script, called by /etc/kernel/postinst.d/update-notifier when a kernel package is installed via apt.

The problem is needrestart. It runs after every apt invocation via a DPkg::Post-Invoke hook (apt-pinvoke -m u). In unattended mode (-m u), needrestart detects the pending kernel upgrade and calls announce_ver() in NeedRestart::UI::Ubuntu — but that function only prints to stdout. It does not call _write_reboot_file(). Only announce_ucode() (microcode upgrades) calls _write_reboot_file().

So the sequence is:

  1. apt installs kernel → notify-reboot-required creates /run/reboot-required
  2. Some later apt run (e.g. Ansible installs Tailscale) → needrestart -m u runs → detects kernel mismatch → calls announce_ver() → prints to stdout (suppressed in Ansible) → does not recreate the sentinel file
  3. Next Ansible run: stat check finds no file → reports No reboot needed

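On current Ubuntu releases /var/run is a symlink to /run, so the stat task and the steps above refer to the same file. The /run filesystem is a tmpfs and is cleared on reboot, but the sentinel file can disappear between reboots any time needrestart runs without recreating it.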

Fix — Dual Check in update.yml

Add a parallel kernel comparison task after the existing stat check:

- name: Check running kernel vs installed kernel (Ubuntu)
  ansible.builtin.shell: |
    RUNNING=$(uname -r)
    INSTALLED=$(dpkg -l 'linux-image-[0-9]*-generic' 2>/dev/null \
      | awk '/^ii/{print $2}' \
      | sed 's/linux-image-//' \
      | sort -V | tail -1)
    if [ -n "$INSTALLED" ] && [ "$RUNNING" != "$INSTALLED" ]; then
      echo "KERNEL_MISMATCH"
    fi
  register: kernel_mismatch_check
  changed_when: false
  when: ansible_facts['os_family'] == "Debian"
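
The same comparison can be tried outside Ansible. The following is a minimal sketch, not part of update.yml: the function names newest_installed_kernel and kernel_state are hypothetical, the sample lines are made-up dpkg -l output (the `.x` version suffixes are placeholders), and sort -V is assumed to be available as in GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of update.yml.

# Read `dpkg -l`-style lines on stdin and print the highest installed
# ("ii") linux-image-<version>-generic version, e.g. 6.8.0-57-generic.
newest_installed_kernel() {
  awk '/^ii/{print $2}' \
    | sed -n 's/^linux-image-\([0-9].*-generic\)$/\1/p' \
    | sort -V | tail -n 1
}

# Print KERNEL_MISMATCH when an installed version is known and differs
# from the running release; print nothing otherwise (same as the task).
kernel_state() {
  running=$1
  installed=$2
  if [ -n "$installed" ] && [ "$running" != "$installed" ]; then
    echo "KERNEL_MISMATCH"
  fi
}

# Made-up sample input: two installed images plus one removed ("rc") entry.
sample='ii  linux-image-6.8.0-57-generic   6.8.0-57.59  amd64  Signed kernel image generic
ii  linux-image-6.8.0-117-generic  6.8.0-117.x  amd64  Signed kernel image generic
rc  linux-image-6.8.0-31-generic   6.8.0-31.x   amd64  Signed kernel image generic'

installed=$(printf '%s\n' "$sample" | newest_installed_kernel)
kernel_state "6.8.0-57-generic" "$installed"
```

On a real host the two inputs would come from uname -r and from dpkg -l 'linux-image-[0-9]*-generic' 2>/dev/null piped into newest_installed_kernel.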

Then update the host_summary Jinja2 template to OR both conditions:

{%- if ansible_facts['os_family'] == 'Debian' and (
    (ubuntu_reboot_flag is defined and ubuntu_reboot_flag.stat is defined and ubuntu_reboot_flag.stat.exists)
    or
    (kernel_mismatch_check is defined and 'KERNEL_MISMATCH' in (kernel_mismatch_check.stdout | default('')))
  ) -%}
  {%- set _ = parts.append('REBOOT REQUIRED') -%}
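
The OR logic in the template can be restated in shell for a quick manual check on a host. This is a sketch under assumptions: reboot_verdict is a hypothetical helper, its first argument stands in for the sentinel path, and its second for the stdout of the kernel comparison task.

```shell
#!/bin/sh
# Hypothetical helper mirroring the template's OR condition.
# $1: path of the sentinel file (normally /run/reboot-required)
# $2: output of the kernel comparison (contains KERNEL_MISMATCH or is empty)
reboot_verdict() {
  flag_path=$1
  kernel_output=$2
  if [ -e "$flag_path" ]; then
    echo "REBOOT REQUIRED"
    return 0
  fi
  case "$kernel_output" in
    *KERNEL_MISMATCH*) echo "REBOOT REQUIRED" ;;
    *)                 echo "No reboot needed" ;;
  esac
}

# Sentinel missing but kernels differ: the second signal still triggers.
reboot_verdict /nonexistent/reboot-required "KERNEL_MISMATCH"
```

Either signal alone is enough to report a pending reboot, which is what keeps the summary correct once the sentinel file has gone missing.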

Common Mistake — Comparing the Wrong dpkg Field

An initial version of this fix used $3 (the package version) and cut:

# WRONG — version field never matches uname -r
INSTALLED=$(dpkg -l 'linux-image-*-generic' | awk '/^ii/{print $3}' | sort -V | tail -1 | cut -d- -f1-4)

| Field | Example value |
|---|---|
| dpkg $3 (version) after cut | 6.8.0-57.59 |
| uname -r | 6.8.0-57-generic |

These formats never match, so with that version of the check every Ubuntu host permanently reports KERNEL_MISMATCH. Always use the name column ($2), strip the linux-image- prefix, and compare the result directly to uname -r.
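
The difference between the two columns can be reproduced with plain strings. A minimal sketch using the example values from the table; the variable names are only illustrative:

```shell
#!/bin/sh
running='6.8.0-57-generic'   # what uname -r prints on such a host

# Version column ($3): the string contains a single "-", so cut -f1-4
# returns it unchanged.
from_version=$(printf '%s\n' '6.8.0-57.59' | cut -d- -f1-4)

# Name column ($2) with the linux-image- prefix stripped.
from_name=$(printf '%s\n' 'linux-image-6.8.0-57-generic' | sed 's/^linux-image-//')

if [ "$from_version" = "$running" ]; then
  echo "version column matches"
else
  echo "version column differs"
fi

if [ "$from_name" = "$running" ]; then
  echo "name column matches"
else
  echo "name column differs"
fi
```

Only the string derived from the package name has the same shape as the running kernel release.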

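Also use linux-image-[0-9]*-generic (not *-generic) so that only versioned kernel image packages enter the sort. The looser glob also accepts names such as linux-image-unsigned-<version>-generic, and the linux-image-generic meta-package carries no kernel version to compare against uname -r.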
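
The effect of the two patterns can be checked with shell case matching, which follows glob rules similar to the ones dpkg -l applies to package names. A minimal sketch; glob_match is a hypothetical helper and the package names are examples:

```shell
#!/bin/sh
# Hypothetical helper: print "match" if name $2 matches glob pattern $1.
glob_match() {
  case "$2" in
    $1) echo "match" ;;
    *)  echo "no match" ;;
  esac
}

# Anchored pattern: a digit must follow "linux-image-".
glob_match 'linux-image-[0-9]*-generic' 'linux-image-6.8.0-57-generic'
glob_match 'linux-image-[0-9]*-generic' 'linux-image-generic'
glob_match 'linux-image-[0-9]*-generic' 'linux-image-unsigned-6.8.0-57-generic'

# Looser pattern: anything may sit between the prefix and "-generic".
glob_match 'linux-image-*-generic' 'linux-image-unsigned-6.8.0-57-generic'
```

With the anchored pattern, every name that survives the filter reduces to a plain kernel release once the linux-image- prefix is stripped.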

Verification

Run against a known-pending host before and after reboot:

ansible-playbook update.yml --limit majortoot

Before reboot: majortoot: 0 pkg(s) upgraded | REBOOT REQUIRED
After reboot: majortoot: 0 pkg(s) upgraded | No reboot needed