# Netdata → n8n Enriched Alert Emails
**Status:** Live across all MajorsHouse fleet servers as of 2026-03-21

Replaces Netdata's plain-text alert emails with rich HTML emails that include a plain-English explanation, a suggested remediation command, and a direct link to the relevant MajorWiki article.

---
## How It Works
```
Netdata alarm fires
→ custom_sender() in health_alarm_notify.conf
→ POST JSON payload to n8n webhook
→ Code node enriches with suggestion + wiki link
→ Send Email node sends HTML email via SMTP
→ Respond node returns 200 OK
```
---
## n8n Workflow
**Name:** Netdata Enriched Alerts
**URL:** https://n8n.majorshouse.com
**Webhook endpoint:** `POST https://n8n.majorshouse.com/webhook/netdata-alert`
**Workflow ID:** `a1b2c3d4-aaaa-bbbb-cccc-000000000001`
### Nodes
1. **Netdata Webhook** — receives POST from Netdata's `custom_sender()`
2. **Enrich Alert** — Code node; matches alarm/chart/family to enrichment table, builds HTML email body in `$json.emailBody`
3. **Send Enriched Email** — sends via SMTP port 465 (SMTP account 2), from `netdata@majorshouse.com` to `marcus@majorshouse.com`
4. **Respond OK** — returns `ok` with HTTP 200 to Netdata
### Enrichment Keys
The Code node matches each key as a case-insensitive substring of the `alarm`, `chart`, or `family` field:

| Key | Title | Wiki Article | Notes |
|-----|-------|-------------|-------|
| `disk_space` | Disk Space Alert | snapraid-mergerfs-setup | |
| `ram` | Memory Alert | managing-linux-services-systemd-ansible | |
| `cpu` | CPU Alert | managing-linux-services-systemd-ansible | |
| `load` | Load Average Alert | managing-linux-services-systemd-ansible | |
| `net` | Network Alert | tailscale-homelab-remote-access | |
| `docker` | Docker Container Alert | debugging-broken-docker-containers | |
| `web_log` | Web Log Alert | tuning-netdata-web-log-alerts | Hostname-aware suggestion (see below) |
| `health` | Docker Health Alarm | netdata-docker-health-alarm-tuning | |
| `mdstat` | RAID Array Alert | mdadm-usb-hub-disconnect-recovery | |
| `systemd` | Systemd Service Alert | docker-caddy-selinux-post-reboot-recovery | |
| _(no match)_ | Server Alert | netdata-new-server-setup | |
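
The lookup in the table can be approximated in shell for quick local checks of which key an alert would hit. This is only a sketch, not the Code node's actual JavaScript: the function name `enrichment_key` is hypothetical, and it assumes the keys are tried in table order with the first match winning, which the table itself does not state.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the enrichment key that an alert would match.
# Assumption: keys are tried in table order and the first hit wins.
enrichment_key() {
    local alarm="$1" chart="$2" family="$3"
    local haystack key
    # Lowercase all three fields so the substring test is case-insensitive.
    haystack=$(printf '%s %s %s' "$alarm" "$chart" "$family" | tr '[:upper:]' '[:lower:]')
    for key in disk_space ram cpu load net docker web_log health mdstat systemd; do
        case "$haystack" in
            *"$key"*) printf '%s\n' "$key"; return 0 ;;
        esac
    done
    # No key matched: the workflow falls back to the generic "Server Alert".
    printf 'no match\n'
}

enrichment_key "disk_space_usage" "disk_space._" "/"
enrichment_key "10min_cpu_usage" "system.cpu" "cpu"
enrichment_key "some_other_alarm" "example.chart" "misc"
```

Because the match is a substring test, short keys such as `net` or `ram` also match longer names that merely contain them (for example, any chart name that includes `netdata`), so the order of the keys matters in a sketch like this.
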
> [!info] web_log hostname-aware suggestion (updated 2026-03-24)
> The `web_log` suggestion branches on `hostname` in the Code node:
> - **`majorlab`** → Check `docker logs caddy` (Caddy reverse proxy)
> - **`teelia`, `majorlinux`, `dca`** → Check Apache logs + Fail2ban jail status
> - **other** → Generic web server log guidance
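
The hostname branch described in the callout can be sketched as a `case` statement. The function name `web_log_suggestion` and the exact suggestion strings are placeholders, and the sketch assumes an exact hostname match, which may differ from how the Code node compares hostnames.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the web_log hostname branch.
# Assumption: hostnames are compared exactly; the real Code node may differ.
web_log_suggestion() {
    case "$1" in
        majorlab)
            echo "Check the Caddy reverse proxy: docker logs caddy" ;;
        teelia|majorlinux|dca)
            echo "Check Apache logs and Fail2ban jail status" ;;
        *)
            echo "Check the web server logs" ;;
    esac
}

web_log_suggestion majorlab
web_log_suggestion dca
web_log_suggestion majorhome
```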
---
## Netdata Configuration
### Config File Locations
| Server | Path |
|--------|------|
| majorhome, majormail, majordiscord, tttpod, teelia | `/etc/netdata/health_alarm_notify.conf` |
| majorlinux, majortoot, dca | `/usr/lib/netdata/conf.d/health_alarm_notify.conf` |
### Required Settings
```bash
DEFAULT_RECIPIENT_CUSTOM="n8n"
role_recipients_custom[sysadmin]="${DEFAULT_RECIPIENT_CUSTOM}"
```
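
A quick way to confirm that both settings are present and not commented out is to grep the config file. The helper below is a sketch; the name `has_custom_settings` is hypothetical, and the path in the usage line is just one of the two locations listed above.

```bash
#!/usr/bin/env bash
# Hypothetical check: succeed only if both required lines are present
# and not commented out in the given config file.
has_custom_settings() {
    local conf="$1"
    grep -Eq '^[[:space:]]*DEFAULT_RECIPIENT_CUSTOM="n8n"' "$conf" &&
    grep -Eq '^[[:space:]]*role_recipients_custom\[sysadmin\]=' "$conf"
}

conf=/etc/netdata/health_alarm_notify.conf
if [ -r "$conf" ] && has_custom_settings "$conf"; then
    echo "custom settings present in $conf"
else
    echo "custom settings missing or file unreadable: $conf"
fi
```
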
### custom_sender() Function
```bash
custom_sender() {
    # Recipient list from role_recipients_custom; unused because the webhook URL is fixed.
    local to="${1}"
    local payload

    # Build the JSON body with jq so every value is escaped correctly.
    payload=$(jq -n \
        --arg hostname "${host}" \
        --arg alarm "${name}" \
        --arg chart "${chart}" \
        --arg family "${family}" \
        --arg status "${status}" \
        --arg old_status "${old_status}" \
        --arg value "${value_string}" \
        --arg units "${units}" \
        --arg info "${info}" \
        --arg alert_url "${goto_url}" \
        --arg severity "${severity}" \
        --arg raised_for "${raised_for}" \
        --arg total_warnings "${total_warnings}" \
        --arg total_critical "${total_critical}" \
        '{hostname:$hostname,alarm:$alarm,chart:$chart,family:$family,status:$status,old_status:$old_status,value:$value,units:$units,info:$info,alert_url:$alert_url,severity:$severity,raised_for:$raised_for,total_warnings:$total_warnings,total_critical:$total_critical}')

    # docurl is alarm-notify.sh's curl wrapper; capture only the HTTP status code.
    local httpcode
    httpcode=$(docurl -s -o /dev/null -w "%{http_code}" \
        -X POST \
        -H "Content-Type: application/json" \
        -d "${payload}" \
        "https://n8n.majorshouse.com/webhook/netdata-alert")

    if [ "${httpcode}" = "200" ]; then
        info "sent enriched notification to n8n for ${status} of ${host}.${name}"
        sent=$((sent + 1))
    else
        error "failed to send notification to n8n, HTTP code: ${httpcode}"
    fi
}
```
!!! note "jq required"
    The `custom_sender()` function requires `jq` to be installed. Verify with `which jq` on each server.

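
The same check can be scripted with `command -v`, which is a POSIX shell builtin and so works even on hosts where `which` is not installed. The helper name `have_cmd` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd jq; then
    echo "jq: found"
else
    echo "jq: MISSING, install it before enabling custom_sender()"
fi
```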
---
## Deploying to a New Server
```bash
# 1. Find the config file
find /etc/netdata /usr/lib/netdata -name health_alarm_notify.conf 2>/dev/null

# 2. Edit it: add the two required settings and the custom_sender() function above

# 3. Test connectivity from the server
curl -s -o /dev/null -w "%{http_code}" \
    -X POST https://n8n.majorshouse.com/webhook/netdata-alert \
    -H "Content-Type: application/json" \
    -d '{"hostname":"test","alarm":"disk_space._","status":"WARNING"}'
# Expected: 200

# 4. Restart Netdata
systemctl restart netdata

# 5. Send a test alarm
/usr/libexec/netdata/plugins.d/alarm-notify.sh test custom
```
---
## Troubleshooting
**Emails not arriving — check n8n execution log:**
Go to https://n8n.majorshouse.com → open "Netdata Enriched Alerts" → Executions tab. Look for `error` status entries.
**Email body empty:**
The Send Email node's HTML field must be `={{ $json.emailBody }}`. If the workflow is patched through inline SSH commands, the shell can expand `$json` to an empty string and silently strip it from the expression. Apply such patches from a Python script file instead.
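
One way the stripping can happen is ordinary shell quoting: inside double quotes the shell treats `$json` as a variable reference. The snippet below is a minimal illustration with hypothetical variable names; it assumes the shell is not running with `set -u`, which would turn the unset variable into an error instead.

```bash
#!/usr/bin/env bash
unset json  # make sure no variable named "json" is set in this shell

# Double quotes: the shell expands $json (unset, so empty) before the string is used.
double_quoted="={{ $json.emailBody }}"

# Single quotes: the text is kept literally.
single_quoted='={{ $json.emailBody }}'

printf '%s\n' "$double_quoted"   # prints: ={{ .emailBody }}
printf '%s\n' "$single_quoted"   # prints: ={{ $json.emailBody }}
```

Passing the expression through `ssh host "..."` adds a second layer of quoting, which makes it easy to lose `$json` even when the inner string looks correctly quoted.
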
**`000` curl response from a server:**
`000` means curl received no HTTP response at all. In this setup the cause is usually a timeout rather than a DNS or connection failure. Re-test with `--max-time 30`.
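
curl's exit status can help tell these cases apart, since `%{http_code}` is `000` for all of them. The sketch below maps a few well-known curl exit codes to a short diagnosis; the helper name `explain_curl_exit` is hypothetical, and the commented usage needs network access to the webhook.

```bash
#!/usr/bin/env bash
# Hypothetical helper: translate a curl exit status into a short diagnosis.
explain_curl_exit() {
    case "$1" in
        0)  echo "curl completed; check the HTTP code" ;;
        6)  echo "could not resolve host (DNS)" ;;
        7)  echo "failed to connect" ;;
        28) echo "operation timed out" ;;
        *)  echo "other curl error: exit $1" ;;
    esac
}

# Usage (left commented out because it contacts the live webhook):
# curl -s -o /dev/null -w '%{http_code}\n' --max-time 30 \
#     -X POST -H 'Content-Type: application/json' \
#     -d '{"hostname":"test","alarm":"disk_space._","status":"WARNING"}' \
#     https://n8n.majorshouse.com/webhook/netdata-alert
# explain_curl_exit "$?"
```
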
**`custom_sender()` syntax error in Netdata logs:**
Bash heredocs don't work inside sourced config files. Use `jq -n --arg ...` as shown above — no heredocs.
**n8n `N8N_TRUST_PROXY` must be set:**
Without `N8N_TRUST_PROXY=true` in the Docker environment, Caddy's `X-Forwarded-For` header causes n8n's rate limiter to abort requests before parsing the body. Set it in `/opt/n8n/compose.yml`.