---
title: "Ansible: Ubuntu Reboot Detection Misses Kernel Upgrades"
domain: troubleshooting
category: ansible
tags: [ansible, ubuntu, kernel, reboot, needrestart, apt]
status: published
created: 2026-05-19
updated: 2026-05-19
---

# Ansible: Ubuntu Reboot Detection Misses Kernel Upgrades

## Problem

`update.yml` runs across the Ubuntu fleet and upgrades a kernel package, but the executive summary still reports `No reboot needed` even though a reboot is genuinely required. Running `uname -r` on the host confirms it is still on the old kernel.

Example: majortoot had `linux-image-6.8.0-117-generic` installed on May 16 after a Tailscale update triggered `needrestart`, but the playbook kept reporting the host as clean.

## Root Cause

The standard check for Ubuntu reboot state is:

```yaml
- name: Check if a reboot is required for Ubuntu servers
  ansible.builtin.stat:
    path: /var/run/reboot-required
  register: ubuntu_reboot_flag
```

`/var/run/reboot-required` is written by `update-notifier-common`'s `notify-reboot-required` script, called by `/etc/kernel/postinst.d/update-notifier` when a kernel package is installed via `apt`. On Ubuntu, `/var/run` is a symlink to `/run`, so this is the same file as `/run/reboot-required`.

The problem is `needrestart`. It runs after every `apt` invocation via a `DPkg::Post-Invoke` hook (`apt-pinvoke -m u`). In **unattended mode** (`-m u`), needrestart detects the pending kernel upgrade and calls `announce_ver()` in `NeedRestart::UI::Ubuntu`, but that function only prints to stdout. It does **not** call `_write_reboot_file()`. Only `announce_ucode()` (microcode upgrades) calls `_write_reboot_file()`.

So the sequence is:

1. `apt` installs the kernel → `notify-reboot-required` creates `/run/reboot-required` ✅
2. Some later `apt` run (e.g. Ansible installs Tailscale) → `needrestart -m u` runs → detects the kernel mismatch → calls `announce_ver()` → prints to stdout (suppressed in Ansible) → **does not** recreate the sentinel file
3. Next Ansible run: the stat check finds no file → reports `No reboot needed` ❌

`/run` is a tmpfs, so the sentinel file is expected to vanish on reboot. The problem is that it can also be missing between reboots, and once it is gone, needrestart's unattended run does not recreate it even though it detects the pending kernel.

## Fix: Dual Check in update.yml

Add a parallel kernel comparison task after the existing stat check:

```yaml
- name: Check running kernel vs installed kernel (Ubuntu)
  ansible.builtin.shell: |
    RUNNING=$(uname -r)
    INSTALLED=$(dpkg -l 'linux-image-[0-9]*-generic' 2>/dev/null \
      | awk '/^ii/{print $2}' \
      | sed 's/linux-image-//' \
      | sort -V | tail -1)
    if [ -n "$INSTALLED" ] && [ "$RUNNING" != "$INSTALLED" ]; then
      echo "KERNEL_MISMATCH"
    fi
  register: kernel_mismatch_check
  changed_when: false
  when: ansible_facts['os_family'] == "Debian"
```

Then update the `host_summary` Jinja2 template to OR both conditions:

```jinja2
{%- if ansible_facts['os_family'] == 'Debian' and (
      (ubuntu_reboot_flag is defined and ubuntu_reboot_flag.stat is defined and ubuntu_reboot_flag.stat.exists)
      or
      (kernel_mismatch_check is defined and 'KERNEL_MISMATCH' in (kernel_mismatch_check.stdout | default('')))
    ) -%}
  {%- set _ = parts.append('REBOOT REQUIRED') -%}
```

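For spot-checking a single host outside Ansible, the same OR logic can be sketched in plain shell. This is a minimal sketch, not part of `update.yml`: the function names `needs_reboot` and `report_reboot_state` are hypothetical, and the kernel lookup reuses the pipeline from the task above.

```shell
# Hypothetical helper: needs_reboot SENTINEL RUNNING INSTALLED
# Returns 0 (reboot required) if the sentinel file exists OR the running
# kernel differs from the newest installed one; returns 1 otherwise.
needs_reboot() {
  local sentinel=$1 running=$2 installed=$3
  # Check 1: the sentinel file written by notify-reboot-required.
  if [ -e "$sentinel" ]; then
    return 0
  fi
  # Check 2: kernel mismatch. An empty INSTALLED means "unknown", not "mismatch".
  if [ -n "$installed" ] && [ "$running" != "$installed" ]; then
    return 0
  fi
  return 1
}

# Hypothetical wrapper that gathers real values on the current host.
report_reboot_state() {
  local installed
  installed=$(dpkg -l 'linux-image-[0-9]*-generic' 2>/dev/null \
    | awk '/^ii/{print $2}' \
    | sed 's/^linux-image-//' \
    | sort -V | tail -n 1)
  if needs_reboot /run/reboot-required "$(uname -r)" "$installed"; then
    echo "REBOOT REQUIRED"
  else
    echo "No reboot needed"
  fi
}
```

Calling `report_reboot_state` on a host prints one of the two summary strings. Keeping the decision in `needs_reboot`, which takes all three inputs as arguments, means it can be exercised with a temporary file and made-up kernel strings.
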
## Common Mistake: Comparing the Wrong dpkg Field

An initial version of this fix used `$3` (the package version) and `cut`:

```bash
# WRONG: the version field never matches uname -r
INSTALLED=$(dpkg -l 'linux-image-*-generic' | awk '/^ii/{print $3}' | sort -V | tail -1 | cut -d- -f1-4)
```

| Field | Example value |
|-------|---------------|
| `dpkg -l` column `$3` (version), after `cut` | `6.8.0-57.59` |
| `uname -r` | `6.8.0-57-generic` |

These formats never match, so with the `$3` version every Ubuntu host permanently reports `KERNEL_MISMATCH`. Always use the **name column (`$2`)**, strip the `linux-image-` prefix, and compare the result directly to `uname -r`.

Also use `linux-image-[0-9]*-generic` rather than a looser `*-generic` pattern, so that only versioned kernel image packages enter the sort and unversioned names such as the `linux-image-generic` meta-package stay out of it.

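The difference between the two columns can be checked without touching a real host. The sketch below feeds canned `dpkg -l` rows into both extraction styles. The rows are illustrative values, not output from a real machine, and they include a `linux-image-generic` row of the kind a looser pattern could let through. The function names are hypothetical, and the name filter is done in `awk` here only because canned input bypasses the `dpkg` glob.

```shell
# Canned `dpkg -l` rows (illustrative values only).
sample_dpkg_rows() {
  cat <<'EOF'
ii  linux-image-6.8.0-57-generic  6.8.0-57.59  amd64  Signed kernel image generic
ii  linux-image-6.8.0-60-generic  6.8.0-60.63  amd64  Signed kernel image generic
ii  linux-image-generic           6.8.0-60.63  amd64  Generic Linux kernel image
EOF
}

# Name column ($2): keeps only versioned image packages and yields a string
# in the same format as `uname -r`.
newest_by_name() {
  awk '/^ii/ && $2 ~ /^linux-image-[0-9].*-generic$/ {print $2}' \
    | sed 's/^linux-image-//' \
    | sort -V | tail -n 1
}

# Version column ($3), as in the WRONG snippet: yields a package version,
# which is never in `uname -r` format.
newest_by_version() {
  awk '/^ii/{print $3}' | sort -V | tail -n 1 | cut -d- -f1-4
}

sample_dpkg_rows | newest_by_name      # prints 6.8.0-60-generic
sample_dpkg_rows | newest_by_version   # prints 6.8.0-60.63
```
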
## Verification

Run the playbook against a host with a known pending reboot, before and after rebooting it:

```bash
ansible-playbook update.yml --limit majortoot
```

- Before reboot: `majortoot: 0 pkg(s) upgraded | REBOOT REQUIRED`
- After reboot: `majortoot: 0 pkg(s) upgraded | No reboot needed`

## Related

- [[ansible-regex-search-set-fact-capture-group]]: companion Jinja2 gotcha in the same `host_summary` task
- [[ansible-unattended-upgrades-fleet]]: managing the Ubuntu auto-upgrade stack
- [[ansible-check-mode-false-positives]]: another Ansible reporting quirk