---
title: "Nextcloud AIO Container Unhealthy for 20 Hours After Nightly Update"
domain: troubleshooting
category: docker
tags: [nextcloud, docker, healthcheck, netdata, php-fpm, aio]
status: published
created: 2026-03-28
updated: 2026-03-28
---

# Nextcloud AIO Container Unhealthy for 20 Hours After Nightly Update

## Symptom

Netdata alert `docker_nextcloud_unhealthy` fired on majorlab and stayed in Warning for 20 hours. The `nextcloud-aio-nextcloud` container was running but its Docker healthcheck kept failing. No user-facing errors were visible in `nextcloud.log`.

## Investigation

### Timeline (2026-03-27, all UTC)

| Time | Event |
|---|---|
| 04:00 | Nightly backup script started, mastercontainer update kicked off |
| 04:03 | `nextcloud-aio-nextcloud` container recreated |
| 04:05 | Backup finished |
| 07:25 | Mastercontainer logged "Initial startup of Nextcloud All-in-One complete!" (3h20m after the backup finished) |
| 10:22 | First entry in `nextcloud.log` (deprecation warnings only — no errors) |
| 04:00 (Mar 28) | Next nightly backup replaced the container; new container came up healthy in ~25 minutes |
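
The gaps in the timeline can be recomputed from the timestamps. The snippet below is a minimal sketch that assumes GNU `date`; `elapsed` is a hypothetical helper name, not part of AIO or Docker.

```bash
#!/usr/bin/env bash
# Print the elapsed time between two UTC timestamps as "<hours>h<minutes>m".
# Assumes GNU date (needed for -d). "elapsed" is a hypothetical helper name.
elapsed() {
  local start end secs
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  secs=$(( end - start ))
  printf '%dh%02dm\n' $(( secs / 3600 )) $(( secs % 3600 / 60 ))
}

# Backup finished -> mastercontainer reported startup complete
elapsed "2026-03-27 04:05" "2026-03-27 07:25"
# Container recreated -> first nextcloud.log entry
elapsed "2026-03-27 04:03" "2026-03-27 10:22"
```

The first call prints `3h20m` and the second prints `6h19m`, matching the delay and log gap described in this section.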

### Key findings

- **No image update**: the container image was dated Feb 26, so a version change did not cause this.
- **No app-level errors**: `nextcloud.log` contained only `files_rightclick` deprecation warnings (level 3). No level 2/4 entries.
- **PHP-FPM never stabilized**: the healthcheck (`/healthcheck.sh`) runs `nc -z 127.0.0.1 9000` to confirm that PHP-FPM is listening. The container was running, but that port check kept failing.
- **6-hour log gap**: `nextcloud.log` has no entries between container start (04:03) and the first entry (10:22), a gap of 6h19m. This suggests the AIO init scripts (occ upgrade, app updates, cron jobs) ran for hours before the app became even partially responsive.
- **`RestartCount` was 0**: the container never restarted on its own. Docker restart policies react to a container exiting, not to a failing healthcheck, so it stayed unhealthy for the full 20 hours.
- **Disk space fine**: 40% used on `/`.
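
Most of these findings come from `docker inspect` fields. The following is a sketch of how they might be collected in one place; `container_summary` and `image_created` are hypothetical helper names, and the exact commands used during the original investigation are not recorded here.

```bash
#!/usr/bin/env bash
# Hypothetical read-only helpers; they only query Docker and change nothing.

# One-line summary: health status, restart count, start time, image name.
container_summary() {
  local name="$1"
  docker inspect --format \
    'health={{.State.Health.Status}} restarts={{.RestartCount}} started={{.State.StartedAt}} image={{.Config.Image}}' \
    "$name"
}

# Creation date of the image the container is running.
image_created() {
  local name="$1" image_id
  image_id=$(docker inspect --format '{{.Image}}' "$name") || return 1
  docker image inspect --format '{{.Created}}' "$image_id"
}
```

Usage would look like `container_summary nextcloud-aio-nextcloud` followed by `image_created nextcloud-aio-nextcloud`, with `df -h /` covering the disk check.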

### Healthcheck details

```bash
#!/bin/bash
# /healthcheck.sh inside nextcloud-aio-nextcloud
nc -z "$POSTGRES_HOST" "$POSTGRES_PORT" || exit 0  # postgres down = pass (graceful)
nc -z 127.0.0.1 9000 || exit 1                      # PHP-FPM down = fail
```

If PostgreSQL is unreachable, the check passes (exits 0). The only failure path is PHP-FPM not listening on port 9000.
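
To see which path the check takes on a live container, the script can be run by hand and compared with the probe history Docker keeps. This is a sketch under the assumption that `/healthcheck.sh` exists at that path in the container, as shown above; `run_healthcheck` and `health_history` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for inspecting the healthcheck by hand.

# Run the container's healthcheck script and report its exit code
# (0 = check passed, 1 = PHP-FPM port check failed).
run_healthcheck() {
  local name="$1" rc=0
  docker exec "$name" /healthcheck.sh || rc=$?
  echo "healthcheck exit code: $rc"
}

# Dump Docker's recorded health state (status, failing streak, recent probe log) as JSON.
health_history() {
  docker inspect --format '{{json .State.Health}}' "$1"
}
```

For example, `run_healthcheck nextcloud-aio-nextcloud` reporting exit code 1 while PostgreSQL is reachable would point at PHP-FPM not listening on port 9000.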

## Root Cause

The AIO nightly update cycle recreated the container, but the startup/migration process hung or ran extremely long, preventing PHP-FPM from fully initializing. The container sat in this state for 20 hours with no self-recovery mechanism until the next nightly cycle replaced it.

The exact migration or occ command that stalled could not be confirmed — the old container's entrypoint logs were lost when the Mar 28 backup cycle replaced it.

## Fix

Two changes deployed on 2026-03-28:

### 1. Dedicated Netdata alarm with lenient window

Split `nextcloud-aio-nextcloud` into its own Netdata alarm (`docker_nextcloud_unhealthy`) with a 10-minute lookup and 10-minute delay, separate from the general container alarm. See [Tuning Netdata Docker Health Alarms](../../02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md).

### 2. Watchdog cron for auto-restart

Deployed `/etc/cron.d/nextcloud-health-watchdog` on majorlab:

```bash
*/15 * * * * root docker inspect --format={{.State.Health.Status}} nextcloud-aio-nextcloud 2>/dev/null | grep -q unhealthy && [ "$(docker inspect --format={{.State.StartedAt}} nextcloud-aio-nextcloud | xargs -I{} date -d {} +\%s)" -lt "$(date -d "1 hour ago" +\%s)" ] && docker restart nextcloud-aio-nextcloud && logger -t nextcloud-watchdog "Restarted unhealthy nextcloud-aio-nextcloud"
```

- Checks every 15 minutes
- Only restarts if the container has been running >1 hour (avoids interfering with normal startup)
- Logs to syslog: `journalctl -t nextcloud-watchdog`

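With a 15-minute check interval and a 1-hour grace period, a container that fails to become healthy after startup is restarted within about 60 to 75 minutes, instead of staying unhealthy until the next nightly cycle.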
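
The one-liner is dense, so here is the same decision written out as a readable sketch. It is not the deployed file: `should_restart` and `watchdog` are hypothetical function names, and GNU `date` is assumed for parsing `StartedAt`.

```bash
#!/usr/bin/env bash
# Readable sketch of the watchdog logic; function names are hypothetical.

# Succeed (exit 0) only when the container is unhealthy AND has been up
# longer than the grace period.
# Arguments: status, start epoch, current epoch, grace seconds (default 3600).
should_restart() {
  local status="$1" started="$2" now="$3" grace="${4:-3600}"
  [ "$status" = "unhealthy" ] && [ $(( now - started )) -gt "$grace" ]
}

# Look up the container state, then restart and log if should_restart agrees.
watchdog() {
  local name="${1:-nextcloud-aio-nextcloud}" status started
  status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null) || return 0
  started=$(date -d "$(docker inspect --format '{{.State.StartedAt}}' "$name")" +%s) || return 0
  if should_restart "$status" "$started" "$(date +%s)"; then
    docker restart "$name" && logger -t nextcloud-watchdog "Restarted unhealthy $name"
  fi
}
```

Saved as a script with a final `watchdog "$@"` line, this could be called from the cron entry instead of the inline pipeline, which would also remove the need to escape `%` for cron. Keeping the decision in `should_restart` makes the grace period easy to check without touching Docker.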

## See Also

- [Tuning Netdata Docker Health Alarms](../../02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md)
- [Debugging Broken Docker Containers](../../02-selfhosting/docker/debugging-broken-docker-containers.md)
- [Docker Healthchecks](../../02-selfhosting/docker/docker-healthchecks.md)