wiki: add ClamAV safe scheduling article; update Netdata new server setup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 03:36:49 -04:00
parent 2830338f6a
commit 335c4b57f2
3 changed files with 125 additions and 7 deletions


@@ -2,17 +2,31 @@
title: "Deploying Netdata to a New Server"
domain: selfhosting
category: monitoring
tags: [netdata, monitoring, email, notifications, netdata-cloud, ubuntu, debian]
tags: [netdata, monitoring, email, notifications, netdata-cloud, ubuntu, debian, n8n]
status: published
created: 2026-03-18
updated: 2026-03-18
updated: 2026-03-22
---
# Deploying Netdata to a New Server
This covers the full Netdata setup for a new server in the fleet: install, email notification config, and Netdata Cloud claim. Applies to Ubuntu/Debian servers.
This covers the full Netdata setup for a new server in the fleet: install, email notification config, n8n webhook integration, and Netdata Cloud claim. Applies to Ubuntu/Debian servers.
## 1. Install
## 1. Install Prerequisites
Install `jq` before anything else. It is required by the `custom_sender()` function in `health_alarm_notify.conf` to build the JSON payload sent to the n8n webhook. **If `jq` is missing, the webhook will fire with an empty body and n8n alert emails will have no information in them.**
```bash
apt install -y jq
```
Verify:
```bash
jq --version
```
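The same check can be scripted so that a provisioning run stops early when a prerequisite is missing. This is a minimal sketch; `require_cmd` is a hypothetical helper name and not part of Netdata:

```bash
# Hypothetical preflight helper: succeed only if every named command is on PATH.
require_cmd() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing prerequisite: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: abort a setup script before touching Netdata if jq is absent.
# require_cmd jq || exit 1
```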
## 2. Install Netdata
Use the official kickstart script:
@@ -28,7 +42,7 @@ systemctl is-active netdata
curl -s http://localhost:19999/api/v1/info | python3 -c "import sys,json; d=json.load(sys.stdin); print('Netdata', d['version'])"
```
## 2. Configure Email Notifications
## 3. Configure Email Notifications
Copy the default config and set the three required values:
@@ -64,7 +78,23 @@ You should see three `# OK` lines (WARNING → CRITICAL → CLEAR test cycle) an
> [!note] Delivery via local Postfix
> Email is relayed through the server's local Postfix instance. Ensure Postfix is installed and `/usr/sbin/sendmail` resolves.
## 3. Claim to Netdata Cloud
## 4. Configure n8n Webhook Notifications
Copy `health_alarm_notify.conf` from an existing server (e.g. majormail) whose copy already contains the `custom_sender()` function. That function sends enriched JSON payloads to the n8n webhook at `https://n8n.majorshouse.com/webhook/netdata-alert`.
> [!warning] jq required
> The `custom_sender()` function uses `jq` to build the JSON payload. If `jq` is not installed, `payload` will be empty, curl will send `Content-Length: 0`, and n8n will produce alert emails with `Host: unknown`, blank alert/value fields, and `Status: UNKNOWN`. Always install `jq` first (Step 1).
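To illustrate why a missing `jq` leads to an empty body, here is a rough sketch of a payload builder that fails loudly instead of silently. The function name `build_payload` and its three fields are assumptions based on the fields this article checks for in n8n; the real `custom_sender()` on the fleet may be structured differently.

```bash
# Hypothetical payload builder; field names (hostname, alarm, status) are
# assumed from the n8n verification step, not copied from the real config.
build_payload() {
    if ! command -v jq >/dev/null 2>&1; then
        echo "jq not installed; refusing to send an empty payload" >&2
        return 1
    fi
    # --arg passes each value as a JSON string, so quotes and newlines
    # in alarm names are escaped correctly.
    jq -c -n \
        --arg hostname "$1" \
        --arg alarm "$2" \
        --arg status "$3" \
        '{hostname: $hostname, alarm: $alarm, status: $status}'
}

# Example:
# payload=$(build_payload "$(hostname)" "10min_cpu_usage" "WARNING") || exit 1
```

Checking the exit status before calling `curl` turns the silent "empty email" failure into an error at send time.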
After deploying the config, run a test to confirm the webhook fires correctly:
```bash
systemctl restart netdata
/usr/libexec/netdata/plugins.d/alarm-notify.sh test 2>&1 | grep -E '(custom|n8n|OK|FAILED)'
```
Verify in n8n that the latest execution shows a non-empty body with `hostname`, `alarm`, and `status` fields populated.
## 5. Claim to Netdata Cloud
Get the claim command from **Netdata Cloud → Space Settings → Nodes → Add Nodes**. It will look like:
@@ -84,7 +114,7 @@ cat /var/lib/netdata/cloud.d/claimed_id
A UUID will be present if claimed successfully. The node should appear in Netdata Cloud within ~60 seconds.
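For scripted checks, the "UUID is present" test can be made explicit. A minimal sketch, where `is_uuid` is a hypothetical helper that only checks the canonical 8-4-4-4-12 hex layout:

```bash
# Hypothetical helper: succeed if the argument looks like a canonical UUID.
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eqi '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Example:
# is_uuid "$(cat /var/lib/netdata/cloud.d/claimed_id 2>/dev/null)" \
#     && echo "claimed" || echo "not claimed"
```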
## 4. Verify Alerts
## 6. Verify Alerts
Check that no unexpected alerts are active after setup:
@@ -111,6 +141,20 @@ for host in majorlab majorhome majormail majordiscord majortoot majorlinux tttpo
done
```
## Fleet-wide jq Audit
To check that all servers with `custom_sender` have `jq` installed:
```bash
for host in majorlab majorhome majormail majordiscord majortoot majorlinux tttpod dca teelia; do
echo -n "=== $host: "
ssh -o ConnectTimeout=5 root@$host \
'has_cs=$(grep -l "custom_sender\|n8n.majorshouse.com" /etc/netdata/health_alarm_notify.conf 2>/dev/null | wc -l); has_jq=$(command -v jq >/dev/null 2>&1 && echo yes || echo NO); echo "custom_sender=$has_cs jq=$has_jq"'
done
```
Any server showing `custom_sender=1 jq=NO` needs `apt install -y jq` immediately.
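With a larger fleet, the audit output can be filtered down to just the hosts that need attention. A small sketch, assuming the output format `=== host: custom_sender=1 jq=NO` produced by the loop above; `hosts_missing_jq` is a hypothetical name:

```bash
# Hypothetical filter: read audit output on stdin and print the hosts that
# have custom_sender configured but no jq.
hosts_missing_jq() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i == "custom_sender=1" && $(i + 1) == "jq=NO") {
                h = $(i - 1)          # the "host:" field just before the result
                sub(/:$/, "", h)      # drop the trailing colon
                print h
            }
        }
    }'
}

# Example: save the audit output to a file, then
# hosts_missing_jq < audit.txt
```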
## Related
- [Tuning Netdata Web Log Alerts](tuning-netdata-web-log-alerts.md)


@@ -0,0 +1,73 @@
# ClamAV Safe Scheduling on Live Servers
Running `clamscan` unthrottled on a live server will peg CPU until completion. On a small VPS (1 vCPU), a full recursive scan can sustain 70-100% CPU for an hour or more, degrading or taking down hosted services.
## The Problem
A common out-of-the-box ClamAV cron setup looks like this:
```cron
0 1 * * 0 clamscan --infected --recursive / --exclude=/sys
```
This runs at Linux's default scheduling priority (`nice 0`) with normal I/O priority. On a live server it will:
- Monopolize the CPU for the scan duration
- Cause high I/O wait, degrading web serving, databases, and other services
- Trigger monitoring alerts (e.g., Netdata `10min_cpu_usage`)
## The Fix
Throttle the scan with `nice` and `ionice`:
```cron
0 1 * * 0 nice -n 19 ionice -c 3 clamscan --infected --recursive / --exclude=/sys
```
| Flag | Meaning |
|------|---------|
| `nice -n 19` | Lowest CPU scheduling priority (range: -20 to 19) |
| `ionice -c 3` | Idle I/O class — only uses disk when no other process needs it |
The scan will take longer, but because it yields CPU and disk to any other process that wants them, its impact on hosted services should be minimal.
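For ad-hoc scans run by hand, the same throttling can be wrapped in a shell function. This is a sketch under the assumption that `ionice` may be missing or not permitted on some hosts; `run_throttled` is a hypothetical name:

```bash
# Hypothetical wrapper: run a command at the lowest CPU priority and, where
# ionice is available and permitted, in the idle I/O class.
run_throttled() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        nice -n 19 ionice -c 3 "$@"
    else
        # Fall back to CPU throttling only.
        nice -n 19 "$@"
    fi
}

# Example:
# run_throttled clamscan --infected --recursive /home
```

The wrapper returns the exit status of the wrapped command, so it can be used in scripts that act on the scan result.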
## Applying the Fix
Edit root's crontab:
```bash
crontab -e
```
Or apply non-interactively:
```bash
crontab -l | sed '/ionice/!s|clamscan|nice -n 19 ionice -c 3 clamscan|' | crontab -
```
Verify:
```bash
crontab -l | grep clam
```
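Before piping a rewritten crontab back in, it can be worth previewing the change. The sketch below wraps the substitution in a filter that skips lines already containing `ionice`, so running it twice does not stack the prefix; `throttle_clamscan_lines` is a hypothetical name:

```bash
# Hypothetical filter: prefix clamscan with nice/ionice on crontab lines read
# from stdin, leaving lines that are already throttled untouched.
throttle_clamscan_lines() {
    sed '/ionice/!s|clamscan|nice -n 19 ionice -c 3 clamscan|'
}

# Preview without modifying anything:
# crontab -l | throttle_clamscan_lines
```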
## Diagnosing a Runaway Scan
If CPU is already pegged, identify and kill the process:
```bash
ps aux --sort=-%cpu | head -15
# Look for clamscan
kill <PID>
```
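If the scan should be allowed to finish, an alternative to killing it is to lower the priority of the running process. A rough sketch using `renice` and `ionice -p`; `throttle_pids` is a hypothetical helper, and the `ionice` step is treated as best-effort:

```bash
# Hypothetical helper: lower CPU and I/O priority of already-running processes.
throttle_pids() {
    local pid
    for pid in "$@"; do
        renice -n 19 -p "$pid" >/dev/null || return 1
        # ionice may be unavailable or ineffective on some systems; do not fail on it.
        ionice -c 3 -p "$pid" 2>/dev/null || true
    done
}

# Example:
# throttle_pids $(pgrep -x clamscan)
```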
## Notes
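- `ionice -c 3` (Idle) requires Linux kernel ≥ 2.6.13 and an I/O scheduler that honors I/O priorities (CFQ on older kernels, BFQ on current ones). On systems whose disks use `mq-deadline` or `none`, the idle class may have little or no effect; check `/sys/block/<device>/queue/scheduler`.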
- On multi-core servers, consider also using `cpulimit` for a hard cap: `cpulimit -l 30 -- clamscan ...`
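- Always keep `--exclude=/sys` (and optionally `--exclude=/proc`, `--exclude=/dev`) to avoid scanning virtual filesystems. These options take regular expressions; `--exclude-dir` (e.g. `--exclude-dir=^/sys`) is the directory-specific form.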
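Because the idle I/O class depends on the active I/O scheduler, it can help to check which one a disk is using. The kernel marks the active scheduler with square brackets in `/sys/block/<device>/queue/scheduler`. A small sketch that extracts it; `active_scheduler` is a hypothetical name and `sda` in the example is an assumed device:

```bash
# Hypothetical helper: print the active scheduler (the bracketed entry) from
# the contents of a queue/scheduler file read on stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Example:
# active_scheduler < /sys/block/sda/queue/scheduler
```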
## Related
- [ClamAV Documentation](https://docs.clamav.net/)
- [[02-selfhosting/security/linux-server-hardening-checklist|Linux Server Hardening Checklist]]


@@ -53,3 +53,4 @@
* [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md)
* [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md)
* [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md)
* [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)