wiki: add SELinux AVC chart, enriched alerts, new server setup, and pending articles; update indexes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-03-27 03:34:33 -04:00
parent 38fe720e63
commit fb2e3f6168
18 changed files with 881 additions and 15 deletions

View File

View File

@@ -154,7 +154,7 @@ alias majorlab='ssh root@100.86.14.126'
alias majormail='ssh root@100.84.165.52'
alias teelia='ssh root@100.120.32.69'
alias tttpod='ssh root@100.84.42.102'
-alias majorrig='ssh -p 2222 majorlinux@100.98.47.29'
+alias majorrig='ssh majorlinux@100.98.47.29' # port 2222 retired 2026-03-25, fleet uses port 22
# DNF5
alias update='sudo dnf upgrade --refresh'

View File

@@ -0,0 +1,157 @@
---
title: "Docker Healthchecks"
domain: selfhosting
category: docker
tags: [docker, healthcheck, monitoring, uptime-kuma, compose]
status: published
created: 2026-03-23
updated: 2026-03-23
---
# Docker Healthchecks
A Docker healthcheck tells the daemon (and any monitoring tool) whether a container is actually working — not just running. Without one, a container shows as `Up` even if the app inside is crashed, deadlocked, or waiting on a dependency.
## Why It Matters
Tools like Uptime Kuma report containers without healthchecks as:
> Container has not reported health and is currently running. As it is running, it is considered UP. Consider adding a health check for better service visibility.
A healthcheck upgrades that to a real `(healthy)` or `(unhealthy)` status, making monitoring meaningful.
## Basic Syntax (docker-compose)
```yaml
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
| Field | Description |
|---|---|
| `test` | Command to run. Exit 0 = healthy, non-zero = unhealthy. |
| `interval` | How often to run the check. |
| `timeout` | How long to wait before marking as failed. |
| `retries` | Failures before marking `unhealthy`. |
| `start_period` | Grace period on startup before failures count. |
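As a rough worked example of how these fields interact (a simplified model, not Docker's exact scheduler): once a running app breaks, the daemon needs `retries` consecutive failed checks before flipping to `unhealthy`, so detection takes on the order of `interval × retries` plus one `timeout`:

```python
# Rough worst-case seconds from "app breaks" to "(unhealthy)" for a
# container that is already past start_period. Simplified model: one
# check per interval, each failing check may take up to `timeout`
# before being declared failed.
def time_to_unhealthy(interval: int, retries: int, timeout: int) -> int:
    return interval * retries + timeout

# With the settings above (interval 30s, retries 3, timeout 10s):
print(time_to_unhealthy(interval=30, retries=3, timeout=10))  # 100
```

Tighter intervals detect failures faster at the cost of more frequent check executions inside the container.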
## Common Patterns
### HTTP service (wget — available in Alpine)
```yaml
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:2368/"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
### HTTP service (curl)
```yaml
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
### MySQL / MariaDB
```yaml
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-psecret"]
interval: 10s
timeout: 5s
retries: 3
start_period: 20s
```
### PostgreSQL
```yaml
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
timeout: 5s
retries: 5
```
### Redis
```yaml
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 3
```
### TCP port check (no curl/wget available)
```yaml
healthcheck:
test: ["CMD-SHELL", "nc -z localhost 8080 || exit 1"]
interval: 30s
timeout: 5s
retries: 3
```
## Using Healthchecks with `depends_on`
Healthchecks enable proper startup ordering. Instead of a fixed sleep, a dependent container waits until its dependency is actually ready:
```yaml
services:
app:
depends_on:
db:
condition: service_healthy
db:
image: mysql:8.0
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
interval: 10s
timeout: 5s
retries: 3
start_period: 20s
```
This prevents the classic race condition where the app starts before the database is ready to accept connections.
## Checking Health Status
```bash
# See health status in container list
docker ps
# Get detailed health info including last check output
docker inspect --format='{{json .State.Health}}' <container> | jq
```
## Ghost Example
Ghost (Alpine-based) uses `wget` rather than `curl`:
```yaml
healthcheck:
test: ["CMD", "wget", "-q", "--spider", "http://localhost:2368/ghost/api/v4/admin/site/"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
```
## Gotchas & Notes
- **Alpine images** don't have `curl` by default — use `wget` or install curl in the image.
- **`start_period`** is critical for slow-starting apps (databases, JVM services). Failures during this window don't count toward `retries`.
- **`CMD` vs `CMD-SHELL`** — use `CMD` for direct exec (no shell needed), `CMD-SHELL` when you need pipes, `&&`, or shell builtins.
- **Uptime Kuma** will pick up Docker healthcheck status automatically when monitoring via the Docker socket — no extra config needed.
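To illustrate the `CMD-SHELL` case: shell form lets one check combine conditions with `&&` or pipes. This is a hypothetical fragment (the port and PID file path are made up), not a config for any service above:

```yaml
healthcheck:
  # CMD-SHELL runs through /bin/sh, so && and pipes work.
  # Healthy only if the port answers AND the PID file exists.
  test: ["CMD-SHELL", "nc -z localhost 8080 && test -f /run/app.pid"]
  interval: 30s
  timeout: 5s
  retries: 3
```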
## See Also
- [[debugging-broken-docker-containers]]
- [[netdata-docker-health-alarm-tuning]]

View File

@@ -24,6 +24,7 @@ Guides for running your own services at home, including Docker, reverse proxies,
- [Tuning Netdata Web Log Alerts](monitoring/tuning-netdata-web-log-alerts.md)
- [Tuning Netdata Docker Health Alarms](monitoring/netdata-docker-health-alarm-tuning.md)
- [Deploying Netdata to a New Server](monitoring/netdata-new-server-setup.md)
## Security

View File

@@ -5,7 +5,7 @@ category: monitoring
tags: [netdata, docker, nextcloud, alarms, health, monitoring]
status: published
created: 2026-03-18
-updated: 2026-03-18
+updated: 2026-03-22
---
# Tuning Netdata Docker Health Alarms to Prevent Update Flapping
@@ -40,7 +40,7 @@ component: Docker
every: 30s
lookup: average -5m of unhealthy
warn: $this > 0
-delay: down 5m multiplier 1.5 max 30m
+delay: up 3m down 5m multiplier 1.5 max 30m
summary: Docker container ${label:container_name} health
info: ${label:container_name} docker container health status is unhealthy
to: sysadmin
@@ -49,10 +49,38 @@ component: Docker
| Setting | Default | Tuned | Effect |
|---|---|---|---|
| `every` | 10s | 30s | Check less frequently |
-| `lookup` | average -10s | average -5m | Must be unhealthy for sustained 5 minutes |
+| `lookup` | average -10s | average -5m | Smooths transient unhealthy samples over 5 minutes |
-| `delay` | none | down 5m (max 30m) | Grace period after recovery before clearing |
+| `delay: up 3m` | none | 3m | Won't fire until unhealthy condition persists for 3 continuous minutes |
+| `delay: down 5m` | none | 5m (max 30m) | Grace period after recovery before clearing |
-A typical Nextcloud AIO update cycle (30-90 seconds of container restarts) won't sustain 5 minutes of unhealthy status, so no alert fires. A genuinely broken container will still be caught.
+The `up` delay is the critical addition. Nextcloud AIO's `nextcloud-aio-nextcloud` container checks both PostgreSQL (port 5432) and PHP-FPM (port 9000). PHP-FPM takes ~90 seconds to warm up after a restart, causing 2-3 failing health checks before the container becomes healthy. With `delay: up 3m`, Netdata waits for 3 continuous minutes of unhealthy status before firing — absorbing the ~90 second startup window with margin to spare. A genuinely broken container will still trigger the alert.
## Also: Suppress `docker_container_down` for Normally-Exiting Containers
Nextcloud AIO runs `borgbackup` (scheduled backups) and `watchtower` (auto-updates) as containers that exit with code 0 after completing their work. The stock `docker_container_down` alarm fires on any exited container, generating false alerts after every nightly cycle.
Add a second override to the same file using `chart labels` to exclude them:
```ini
# Suppress docker_container_down for Nextcloud AIO containers that exit normally
# (borgbackup runs on schedule then exits; watchtower does updates then exits)
template: docker_container_down
on: docker.container_running_state
class: Errors
type: Containers
component: Docker
units: status
every: 30s
lookup: average -5m of down
chart labels: container_name=!nextcloud-aio-borgbackup !nextcloud-aio-watchtower *
warn: $this > 0
delay: up 3m down 5m multiplier 1.5 max 30m
summary: Docker container ${label:container_name} down
info: ${label:container_name} docker container is down
to: sysadmin
```
The `chart labels` line uses Netdata's simple pattern syntax — `!` prefix excludes a container, `*` matches everything else. All other exited containers still alert normally.
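The first-match-wins behavior of that pattern list can be sketched as follows. This is an illustrative Python model of the semantics, not Netdata's actual implementation:

```python
from fnmatch import fnmatch

def matches(name: str, patterns: str) -> bool:
    """Model of Netdata simple patterns: space-separated globs checked
    left to right; first hit wins; a leading '!' makes it an exclusion."""
    for pat in patterns.split():
        negate = pat.startswith("!")
        if negate:
            pat = pat[1:]
        if fnmatch(name, pat):
            return not negate
    return False  # nothing matched

labels = "!nextcloud-aio-borgbackup !nextcloud-aio-watchtower *"
print(matches("nextcloud-aio-borgbackup", labels))  # False (excluded)
print(matches("ghost", labels))                     # True (caught by *)
```

Because evaluation stops at the first match, the `*` catch-all must come last or it would swallow the exclusions.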
## Applying the Config
@@ -74,7 +102,7 @@ In the Netdata UI, navigate to **Alerts → Manage Alerts** and search for `dock
## Notes
-- This only overrides the `docker_container_unhealthy` alarm. The `docker_container_down` alarm (for exited containers) is left at its default — it already has a `delay: down 1m` and is disabled by default (`chart labels: container_name=!*`).
+- Both `docker_container_unhealthy` and `docker_container_down` are overridden in this config. Any container not explicitly excluded in the `chart labels` filter will still alert normally.
- If you want per-container silencing instead of a blanket delay, use the `host labels` or `chart labels` filter to scope the alarm to specific containers.
- Config volume path on majorlab: `/var/lib/docker/volumes/netdata_netdataconfig/_data/` - Config volume path on majorlab: `/var/lib/docker/volumes/netdata_netdataconfig/_data/`

View File

@@ -0,0 +1,159 @@
# Netdata → n8n Enriched Alert Emails
**Status:** Live across all MajorsHouse fleet servers as of 2026-03-21
Replaces Netdata's plain-text alert emails with rich HTML emails that include a plain-English explanation, a suggested remediation command, and a direct link to the relevant MajorWiki article.
---
## How It Works
```
Netdata alarm fires
→ custom_sender() in health_alarm_notify.conf
→ POST JSON payload to n8n webhook
→ Code node enriches with suggestion + wiki link
→ Send Email node sends HTML email via SMTP
→ Respond node returns 200 OK
```
---
## n8n Workflow
**Name:** Netdata Enriched Alerts
**URL:** https://n8n.majorshouse.com
**Webhook endpoint:** `POST https://n8n.majorshouse.com/webhook/netdata-alert`
**Workflow ID:** `a1b2c3d4-aaaa-bbbb-cccc-000000000001`
### Nodes
1. **Netdata Webhook** — receives POST from Netdata's `custom_sender()`
2. **Enrich Alert** — Code node; matches alarm/chart/family to enrichment table, builds HTML email body in `$json.emailBody`
3. **Send Enriched Email** — sends via SMTP port 465 (SMTP account 2), from `netdata@majorshouse.com` to `marcus@majorshouse.com`
4. **Respond OK** — returns `ok` with HTTP 200 to Netdata
### Enrichment Keys
The Code node matches on `alarm`, `chart`, or `family` field (case-insensitive substring):
| Key | Title | Wiki Article | Notes |
|-----|-------|-------------|-------|
| `disk_space` | Disk Space Alert | snapraid-mergerfs-setup | |
| `ram` | Memory Alert | managing-linux-services-systemd-ansible | |
| `cpu` | CPU Alert | managing-linux-services-systemd-ansible | |
| `load` | Load Average Alert | managing-linux-services-systemd-ansible | |
| `net` | Network Alert | tailscale-homelab-remote-access | |
| `docker` | Docker Container Alert | debugging-broken-docker-containers | |
| `web_log` | Web Log Alert | tuning-netdata-web-log-alerts | Hostname-aware suggestion (see below) |
| `health` | Docker Health Alarm | netdata-docker-health-alarm-tuning | |
| `mdstat` | RAID Array Alert | mdadm-usb-hub-disconnect-recovery | |
| `systemd` | Systemd Service Alert | docker-caddy-selinux-post-reboot-recovery | |
| _(no match)_ | Server Alert | netdata-new-server-setup | |
> [!info] web_log hostname-aware suggestion (updated 2026-03-24)
> The `web_log` suggestion branches on `hostname` in the Code node:
> - **`majorlab`** → Check `docker logs caddy` (Caddy reverse proxy)
> - **`teelia`, `majorlinux`, `dca`** → Check Apache logs + Fail2ban jail status
> - **other** → Generic web server log guidance
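The lookup itself is simple: first enrichment key found as a case-insensitive substring of the alarm, chart, or family wins, with a generic fallback. A simplified Python model (the real Code node is JavaScript; only a few table rows are shown here):

```python
# Simplified model of the n8n Code node's matching. Order matters:
# keys are tried in table order and the first substring hit wins.
ENRICHMENT = {
    "disk_space": ("Disk Space Alert", "snapraid-mergerfs-setup"),
    "docker":     ("Docker Container Alert", "debugging-broken-docker-containers"),
    "web_log":    ("Web Log Alert", "tuning-netdata-web-log-alerts"),
}
FALLBACK = ("Server Alert", "netdata-new-server-setup")

def enrich(alert: dict) -> tuple:
    haystack = " ".join(
        str(alert.get(k, "")) for k in ("alarm", "chart", "family")
    ).lower()
    for key, entry in ENRICHMENT.items():
        if key in haystack:
            return entry
    return FALLBACK

print(enrich({"alarm": "disk_space._", "chart": "disk_space./"}))
# ('Disk Space Alert', 'snapraid-mergerfs-setup')
```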
---
## Netdata Configuration
### Config File Locations
| Server | Path |
|--------|------|
| majorhome, majormail, majordiscord, tttpod, teelia | `/etc/netdata/health_alarm_notify.conf` |
| majorlinux, majortoot, dca | `/usr/lib/netdata/conf.d/health_alarm_notify.conf` |
### Required Settings
```bash
DEFAULT_RECIPIENT_CUSTOM="n8n"
role_recipients_custom[sysadmin]="${DEFAULT_RECIPIENT_CUSTOM}"
```
### custom_sender() Function
```bash
custom_sender() {
local to="${1}"
local payload
payload=$(jq -n \
--arg hostname "${host}" \
--arg alarm "${name}" \
--arg chart "${chart}" \
--arg family "${family}" \
--arg status "${status}" \
--arg old_status "${old_status}" \
--arg value "${value_string}" \
--arg units "${units}" \
--arg info "${info}" \
--arg alert_url "${goto_url}" \
--arg severity "${severity}" \
--arg raised_for "${raised_for}" \
--arg total_warnings "${total_warnings}" \
--arg total_critical "${total_critical}" \
'{hostname:$hostname,alarm:$alarm,chart:$chart,family:$family,status:$status,old_status:$old_status,value:$value,units:$units,info:$info,alert_url:$alert_url,severity:$severity,raised_for:$raised_for,total_warnings:$total_warnings,total_critical:$total_critical}')
local httpcode
httpcode=$(docurl -s -o /dev/null -w "%{http_code}" \
-X POST \
-H "Content-Type: application/json" \
-d "${payload}" \
"https://n8n.majorshouse.com/webhook/netdata-alert")
if [ "${httpcode}" = "200" ]; then
info "sent enriched notification to n8n for ${status} of ${host}.${name}"
sent=$((sent + 1))
else
error "failed to send notification to n8n, HTTP code: ${httpcode}"
fi
}
```
> [!note] jq required
> The `custom_sender()` function requires `jq` to be installed. Verify with `which jq` on each server.
---
## Deploying to a New Server
```bash
# 1. Find the config file
find /etc/netdata /usr/lib/netdata -name health_alarm_notify.conf 2>/dev/null
# 2. Edit it — add the two lines and the custom_sender() function above
# 3. Test connectivity from the server
curl -s -o /dev/null -w "%{http_code}" \
-X POST https://n8n.majorshouse.com/webhook/netdata-alert \
-H "Content-Type: application/json" \
-d '{"hostname":"test","alarm":"disk_space._","status":"WARNING"}'
# Expected: 200
# 4. Restart Netdata
systemctl restart netdata
# 5. Send a test alarm
/usr/libexec/netdata/plugins.d/alarm-notify.sh test custom
```
---
## Troubleshooting
**Emails not arriving — check n8n execution log:**
Go to https://n8n.majorshouse.com → open "Netdata Enriched Alerts" → Executions tab. Look for `error` status entries.
**Email body empty:**
The Send Email node's HTML field must be `={{ $json.emailBody }}`. Shell variable expansion can silently strip `$json` if the workflow is patched via inline SSH commands — always use a Python script file.
**`000` curl response from a server:**
Usually a timeout, not a DNS or connection failure. Re-test with `--max-time 30`.
**`custom_sender()` syntax error in Netdata logs:**
Bash heredocs don't work inside sourced config files. Use `jq -n --arg ...` as shown above — no heredocs.
**n8n `N8N_TRUST_PROXY` must be set:**
Without `N8N_TRUST_PROXY=true` in the Docker environment, Caddy's `X-Forwarded-For` header causes n8n's rate limiter to abort requests before parsing the body. Set in `/opt/n8n/compose.yml`.

View File

@@ -0,0 +1,161 @@
---
title: "Deploying Netdata to a New Server"
domain: selfhosting
category: monitoring
tags: [netdata, monitoring, email, notifications, netdata-cloud, ubuntu, debian, n8n]
status: published
created: 2026-03-18
updated: 2026-03-22
---
# Deploying Netdata to a New Server
This covers the full Netdata setup for a new server in the fleet: install, email notification config, n8n webhook integration, and Netdata Cloud claim. Applies to Ubuntu/Debian servers.
## 1. Install Prerequisites
Install `jq` before anything else. It is required by the `custom_sender()` function in `health_alarm_notify.conf` to build the JSON payload sent to the n8n webhook. **If `jq` is missing, the webhook will fire with an empty body and n8n alert emails will have no information in them.**
```bash
apt install -y jq
```
Verify:
```bash
jq --version
```
## 2. Install Netdata
Use the official kickstart script:
```bash
wget -O /tmp/netdata-install.sh https://get.netdata.cloud/kickstart.sh
sh /tmp/netdata-install.sh --non-interactive --stable-channel --disable-telemetry
```
Verify it's running:
```bash
systemctl is-active netdata
curl -s http://localhost:19999/api/v1/info | python3 -c "import sys,json; d=json.load(sys.stdin); print('Netdata', d['version'])"
```
## 3. Configure Email Notifications
Copy the default config and set the three required values:
```bash
cp /usr/lib/netdata/conf.d/health_alarm_notify.conf /etc/netdata/health_alarm_notify.conf
```
Edit `/etc/netdata/health_alarm_notify.conf`:
```ini
EMAIL_SENDER="netdata@majorshouse.com"
SEND_EMAIL="YES"
DEFAULT_RECIPIENT_EMAIL="marcus@majorshouse.com"
```
Or apply with `sed` in one shot:
```bash
sed -i 's/^#\?EMAIL_SENDER=.*/EMAIL_SENDER="netdata@majorshouse.com"/' /etc/netdata/health_alarm_notify.conf
sed -i 's/^#\?SEND_EMAIL=.*/SEND_EMAIL="YES"/' /etc/netdata/health_alarm_notify.conf
sed -i 's/^#\?DEFAULT_RECIPIENT_EMAIL=.*/DEFAULT_RECIPIENT_EMAIL="marcus@majorshouse.com"/' /etc/netdata/health_alarm_notify.conf
```
Restart and test:
```bash
systemctl restart netdata
/usr/libexec/netdata/plugins.d/alarm-notify.sh test 2>&1 | grep -E '(OK|FAILED|email)'
```
You should see three `# OK` lines (WARNING → CRITICAL → CLEAR test cycle) and confirmation that email was sent to `marcus@majorshouse.com`.
> [!note] Delivery via local Postfix
> Email is relayed through the server's local Postfix instance. Ensure Postfix is installed and `/usr/sbin/sendmail` resolves.
## 4. Configure n8n Webhook Notifications
Copy the `health_alarm_notify.conf` from an existing server (e.g. majormail) which contains the `custom_sender()` function. This sends enriched JSON payloads to the n8n webhook at `https://n8n.majorshouse.com/webhook/netdata-alert`.
> [!warning] jq required
> The `custom_sender()` function uses `jq` to build the JSON payload. If `jq` is not installed, `payload` will be empty, curl will send `Content-Length: 0`, and n8n will produce alert emails with `Host: unknown`, blank alert/value fields, and `Status: UNKNOWN`. Always install `jq` first (Step 1).
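This failure mode is easy to reproduce: command substitution of a missing binary yields an empty string rather than aborting the function, so the script carries on with a zero-length payload. A sketch (the binary name is deliberately fake):

```shell
# If the binary doesn't exist, $(...) captures nothing; without
# `set -e` the script continues with an empty variable.
payload=$(definitely-not-installed-jq -n '{}' 2>/dev/null)
echo "payload length: ${#payload}"   # payload length: 0
```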
After deploying the config, run a test to confirm the webhook fires correctly:
```bash
systemctl restart netdata
/usr/libexec/netdata/plugins.d/alarm-notify.sh test 2>&1 | grep -E '(custom|n8n|OK|FAILED)'
```
Verify in n8n that the latest execution shows a non-empty body with `hostname`, `alarm`, and `status` fields populated.
## 5. Claim to Netdata Cloud
Get the claim command from **Netdata Cloud → Space Settings → Nodes → Add Nodes**. It will look like:
```bash
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh
sh /tmp/netdata-kickstart.sh --stable-channel \
--claim-token <token> \
--claim-rooms <room-id> \
--claim-url https://app.netdata.cloud
```
Verify the claim was accepted:
```bash
cat /var/lib/netdata/cloud.d/claimed_id
```
A UUID will be present if claimed successfully. The node should appear in Netdata Cloud within ~60 seconds.
## 6. Verify Alerts
Check that no unexpected alerts are active after setup:
```bash
curl -s 'http://localhost:19999/api/v1/alarms?active' | python3 -c "
import sys, json
d = json.load(sys.stdin)
active = [v for v in d.get('alarms', {}).values() if v.get('status') not in ('CLEAR', 'UNINITIALIZED', 'UNDEFINED')]
print(f'{len(active)} active alert(s)')
for v in active:
print(f' [{v[\"status\"]}] {v[\"name\"]} on {v[\"chart\"]}')
"
```
## Fleet-wide Alert Check
To audit all servers at once (requires Tailscale SSH access):
```bash
for host in majorlab majorhome majormail majordiscord majortoot majorlinux tttpod dca teelia; do
echo "=== $host ==="
ssh root@$host "curl -s 'http://localhost:19999/api/v1/alarms?active' | python3 -c \
\"import sys,json; d=json.load(sys.stdin); active=[v for v in d.get('alarms',{}).values() if v.get('status') not in ('CLEAR','UNINITIALIZED','UNDEFINED')]; print(str(len(active))+' active')\""
done
```
## Fleet-wide jq Audit
To check that all servers with `custom_sender` have `jq` installed:
```bash
for host in majorlab majorhome majormail majordiscord majortoot majorlinux tttpod dca teelia; do
echo -n "=== $host: "
ssh -o ConnectTimeout=5 root@$host \
'has_cs=$(grep -l "custom_sender\|n8n.majorshouse.com" /etc/netdata/health_alarm_notify.conf 2>/dev/null | wc -l); has_jq=$(command -v jq >/dev/null 2>&1 && echo yes || echo NO); echo "custom_sender=$has_cs jq=$has_jq"'
done
```
Any server showing `custom_sender=1 jq=NO` needs `apt install -y jq` immediately.
## Related
- [Tuning Netdata Web Log Alerts](tuning-netdata-web-log-alerts.md)
- [Tuning Netdata Docker Health Alarms](netdata-docker-health-alarm-tuning.md)

View File

@@ -0,0 +1,137 @@
---
title: "Netdata SELinux AVC Denial Monitoring"
domain: selfhosting
category: monitoring
tags: [netdata, selinux, fedora, monitoring, ausearch, charts.d]
status: published
created: 2026-03-27
updated: 2026-03-27
---
# Netdata SELinux AVC Denial Monitoring
A custom `charts.d` plugin that tracks SELinux AVC denials over time via Netdata. Deployed on all Fedora boxes in the fleet where SELinux is Enforcing.
## What It Does
The plugin runs `ausearch -m avc` every 60 seconds and reports the count of AVC denial events from the last 10 minutes. This gives a real-time chart in Netdata Cloud showing SELinux denial spikes — useful for catching misconfigurations after service changes or package updates.
## Where It's Deployed
| Host | OS | SELinux | Chart Installed |
|------|----|---------|-----------------|
| majorhome | Fedora 43 | Enforcing | Yes |
| majorlab | Fedora 43 | Enforcing | Yes |
| majormail | Fedora 43 | Enforcing | Yes |
| majordiscord | Fedora 43 | Enforcing | Yes |
Ubuntu hosts (dca, teelia, tttpod, majortoot, majorlinux) do not run SELinux and do not have this chart.
## Installation
### 1. Create the Chart Plugin
Create `/etc/netdata/charts.d/selinux.chart.sh`:
```bash
cat > /etc/netdata/charts.d/selinux.chart.sh << 'EOF'
# SELinux AVC denial counter for Netdata charts.d
selinux_update_every=60
selinux_priority=90000
selinux_check() {
which ausearch >/dev/null 2>&1 || return 1
return 0
}
selinux_create() {
cat <<CHART
CHART selinux.avc_denials '' 'SELinux AVC Denials (last 10 min)' 'denials' selinux '' line 90000 $selinux_update_every ''
DIMENSION denials '' absolute 1 1
CHART
return 0
}
selinux_update() {
local count
count=$(sudo /usr/bin/ausearch -m avc -if /var/log/audit/audit.log -ts recent 2>/dev/null | grep -c "type=AVC")
echo "BEGIN selinux.avc_denials $1"
echo "SET denials = ${count}"
echo "END"
return 0
}
EOF
```
### 2. Grant Netdata Sudo Access to ausearch
`ausearch` requires root to read the audit log. Add a sudoers entry for the `netdata` user:
```bash
echo 'netdata ALL=(root) NOPASSWD: /usr/bin/ausearch -m avc -if /var/log/audit/audit.log -ts recent' > /etc/sudoers.d/netdata-selinux
chmod 440 /etc/sudoers.d/netdata-selinux
visudo -c
```
The `visudo -c` validates syntax. If it reports errors, fix the file before proceeding — a broken sudoers file can lock out sudo entirely.
### 3. Restart Netdata
```bash
systemctl restart netdata
```
### 4. Verify
Check that the chart is collecting data:
```bash
curl -s 'http://localhost:19999/api/v1/chart?chart=selinux.avc_denials' | python3 -c "
import sys, json
d = json.load(sys.stdin)
print(f'Chart: {d[\"id\"]}')
print(f'Update every: {d[\"update_every\"]}s')
print(f'Type: {d[\"chart_type\"]}')
"
```
If the chart doesn't appear, check that `charts.d` is enabled in `/etc/netdata/netdata.conf` and that the plugin file is readable by the `netdata` user.
## Known Side Effect: pam_systemd Log Noise
Because the `netdata` user calls `sudo ausearch` every 60 seconds, `pam_systemd` logs a warning each time:
```
pam_systemd(sudo:session): Failed to check if /run/user/0/bus exists, ignoring: Permission denied
```
This is cosmetic. The `sudo` command succeeds — `pam_systemd` just can't find a D-Bus user session for the `netdata` service account, which is expected. The message volume scales with the collection interval (1,440/day at 60-second intervals).
**To suppress it**, the `system-auth` PAM config on Fedora already marks `pam_systemd.so` as `-session optional` (the `-` prefix means "don't fail if the module errors"). The messages are informational log noise, not actual failures. No PAM changes are needed.
If the log volume is a concern for log analysis or monitoring, filter it with an rsyslog rule:
```ini
# /etc/rsyslog.d/suppress-pam-systemd.conf
:msg, contains, "pam_systemd(sudo:session): Failed to check" stop
```
Or in Netdata's log alert config, exclude the pattern from any log-based alerts.
## Fleet Audit
To verify the chart is deployed and functioning on all Fedora hosts:
```bash
for host in majorhome majorlab majormail majordiscord; do
echo -n "=== $host: "
ssh root@$host "curl -s 'http://localhost:19999/api/v1/chart?chart=selinux.avc_denials' 2>/dev/null | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d[\"id\"], \"every\", str(d[\"update_every\"])+\"s\")' 2>/dev/null || echo 'NOT FOUND'"
done
```
## Related
- [Deploying Netdata to a New Server](netdata-new-server-setup.md)
- [Tuning Netdata Web Log Alerts](tuning-netdata-web-log-alerts.md)
- [Tuning Netdata Docker Health Alarms](netdata-docker-health-alarm-tuning.md)
- [SELinux: Fixing Dovecot Mail Spool Context](/05-troubleshooting/selinux-dovecot-vmail-context.md)

View File

@@ -0,0 +1,59 @@
# Ansible: Vault Password File Not Found
## Error
```
[WARNING]: Error getting vault password file (default): The vault password file /Users/majorlinux/.ansible/vault_pass was not found
[ERROR]: The vault password file /Users/majorlinux/.ansible/vault_pass was not found
```
## Cause
Ansible is configured to look for a vault password file at `~/.ansible/vault_pass`, but the file does not exist. This is typically set in `ansible.cfg` via the `vault_password_file` directive.
## Solutions
### Option 1: Remove the vault config (if you're not using Vault)
Check your `ansible.cfg` for this line and remove it if Vault is not needed:
```ini
[defaults]
vault_password_file = ~/.ansible/vault_pass
```
### Option 2: Create the vault password file
```bash
echo 'your_vault_password' > ~/.ansible/vault_pass
chmod 600 ~/.ansible/vault_pass
```
> **Security note:** Keep permissions tight (`600`) so only your user can read the file. The actual vault password is stored in Bitwarden under the "Ansible Vault Password" entry.
### Option 3: Pass the password at runtime (no file needed)
```bash
ansible-playbook test.yml --ask-vault-pass
```
## Diagnosing the Source of the Config
To find which config file is setting `vault_password_file`, run:
```bash
ansible-config dump --only-changed
```
This shows all non-default config values and their source files. Config is loaded in this order of precedence:
1. `ANSIBLE_CONFIG` environment variable
2. `./ansible.cfg` (current directory)
3. `~/.ansible.cfg`
4. `/etc/ansible/ansible.cfg`
## Related
- [Ansible Getting Started](../01-linux/shell-scripting/ansible-getting-started.md)
- Vault password is stored in Bitwarden under **"Ansible Vault Password"**
- Ansible playbooks live at `~/MajorAnsible` on MajorAir/MajorMac

View File

@@ -9,6 +9,7 @@ Practical fixes for common Linux, networking, and application problems.
- [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md) - [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](networking/fail2ban-self-ban-apache-outage.md)
- [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](networking/fail2ban-imap-self-ban-mail-client.md) - [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](networking/fail2ban-imap-self-ban-mail-client.md)
- [firewalld: Mail Ports Wiped After Reload](networking/firewalld-mail-ports-reset.md) - [firewalld: Mail Ports Wiped After Reload](networking/firewalld-mail-ports-reset.md)
- [Tailscale SSH: Unexpected Re-Authentication Prompt](networking/tailscale-ssh-reauth-prompt.md)
- [ISP SNI Filtering & Caddy](isp-sni-filtering-caddy.md) - [ISP SNI Filtering & Caddy](isp-sni-filtering-caddy.md)
- [yt-dlp YouTube JS Challenge Fix](yt-dlp-fedora-js-challenge.md) - [yt-dlp YouTube JS Challenge Fix](yt-dlp-fedora-js-challenge.md)

View File

@@ -0,0 +1,66 @@
# Tailscale SSH: Unexpected Re-Authentication Prompt
If a Tailscale SSH connection unexpectedly presents a browser authentication URL mid-session, the first instinct is to check the ACL policy. However, this is often a one-off Tailscale hiccup rather than a misconfiguration.
## Symptoms
- SSH connection to a fleet node displays a Tailscale auth URL:
```
To authenticate, visit: https://login.tailscale.com/a/xxxxxxxx
```
- The prompt appears even though the node worked fine previously
- Other nodes in the fleet connect without prompting
## What Causes It
Tailscale SSH supports two ACL `action` values:
| Action | Behavior |
|---|---|
| `accept` | Trusts Tailscale identity — no additional auth required |
| `check` | Requires periodic browser-based re-authentication |
If `action: "check"` is set, every session (or after token expiry) will prompt for browser auth. However, even with `action: "accept"`, a one-off prompt can appear due to a Tailscale daemon glitch or key refresh event.
## How to Diagnose
### 1. Verify the ACL policy
In the Tailscale admin console (or via `tailscale debug acl`), inspect the SSH rules. For a trusted homelab fleet, the rule should use `accept`:
```json
{
"src": ["autogroup:member"],
"dst": ["autogroup:self"],
"users": ["autogroup:nonroot", "root"],
"action": "accept",
}
```
If `action` is `check`, that is the root cause — change it to `accept` for trusted source/destination pairs.
### 2. Confirm it was a one-off
If the ACL already shows `accept`, the prompt was transient. Test with:
```bash
ssh <hostname> "echo ok"
```
No auth prompt + `ok` output = resolved. Note that this test is only meaningful if the previous session's auth token has expired, or you test from a different device that hasn't recently authenticated.
## Fix
**If ACL shows `check`:** Change to `accept` in the Tailscale admin console under Access Controls. Takes effect immediately — no server changes needed.
**If ACL already shows `accept`:** No action required. The prompt was a one-off Tailscale event (daemon restart, key refresh, etc.). Monitor for recurrence.
## Notes
- ~~Port 2222 on **MajorRig** previously existed as a hard bypass for Tailscale SSH browser auth. This workaround was retired on 2026-03-25 after the Tailscale SSH authentication issue was resolved. The entire fleet now uses port 22 uniformly.~~
- The `autogroup:self` destination means the rule applies when connecting from your own devices to your own devices — appropriate for a personal homelab fleet.
## Related
- [[Network Overview]] — Tailscale fleet inventory and SSH access model
- [[SSH-Aliases]] — Fleet SSH access shortcuts

View File

@@ -48,7 +48,7 @@ The Windows OpenSSH Server is installed as a Windows Feature (`Add-WindowsCapabi
- **This is a Windows-side issue** — WSL2 itself is unaffected. The service must be started and configured from Windows, not from within WSL2. - **This is a Windows-side issue** — WSL2 itself is unaffected. The service must be started and configured from Windows, not from within WSL2.
- **Elevated PowerShell required** — `Start-Service` and `Set-Service` for sshd will return "Access is denied" if run without Administrator privileges. - **Elevated PowerShell required** — `Start-Service` and `Set-Service` for sshd will return "Access is denied" if run without Administrator privileges.
- **Port 2222 is also affected** — both the standard port 22 and the bypass port 2222 on MajorRig are served by the same `sshd` service. - **Port 2222 was retired (2026-03-25)** — the bypass port 2222 on MajorRig is no longer in use. The entire fleet now uses port 22 uniformly after the Tailscale SSH auth fix. Only port 22 needs to be verified when troubleshooting sshd.
- **Default shell still works once fixed** — MajorRig's sshd is configured to use `C:\Windows\System32\wsl.exe` as the default shell, dropping SSH sessions directly into WSL2/Bash. This config is preserved across service restarts. - **Default shell still works once fixed** — MajorRig's sshd is configured to use `C:\Windows\System32\wsl.exe` as the default shell, dropping SSH sessions directly into WSL2/Bash. This config is preserved across service restarts.
--- ---

View File

@@ -0,0 +1,73 @@
# ClamAV Safe Scheduling on Live Servers
Running `clamscan` unthrottled on a live server will peg CPU until completion. On a small VPS (1 vCPU), a full recursive scan can sustain 70-100% CPU for an hour or more, degrading or taking down hosted services.
## The Problem
A common out-of-the-box ClamAV cron setup looks like this:
```cron
0 1 * * 0 clamscan --infected --recursive / --exclude-dir="^/sys"
```
This runs at Linux's default scheduling priority (`nice 0`) with normal I/O priority. On a live server it will:
- Monopolize the CPU for the scan duration
- Cause high I/O wait, degrading web serving, databases, and other services
- Trigger monitoring alerts (e.g., Netdata `10min_cpu_usage`)
## The Fix
Throttle the scan with `nice` and `ionice`:
```cron
0 1 * * 0 nice -n 19 ionice -c 3 clamscan --infected --recursive / --exclude-dir="^/sys"
```
| Flag | Meaning |
|------|---------|
| `nice -n 19` | Lowest CPU scheduling priority (range: -20 to 19) |
| `ionice -c 3` | Idle I/O class — only uses disk when no other process needs it |
The scan will take longer but will not impact server performance.
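Because idle-class I/O makes runtimes unpredictable, it's worth guarding against a scan still running when the next one fires. `flock -n` skips the new run instead of stacking a second scan (illustrative; the lock file path is an assumption):

```cron
0 1 * * 0 flock -n /run/clamscan.lock nice -n 19 ionice -c 3 clamscan --infected --recursive / --exclude-dir="^/sys"
```

`-n` makes `flock` fail immediately rather than queue, so overlapping weeks simply drop the extra run.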
## Applying the Fix
Edit root's crontab:
```bash
crontab -e
```
Or apply non-interactively (the address guard keeps a re-run from double-prefixing the entry):
```bash
crontab -l | sed '/ionice -c 3 clamscan/!s|clamscan|nice -n 19 ionice -c 3 clamscan|' | crontab -
```
Verify:
```bash
crontab -l | grep clam
```
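You can also confirm the kernel actually applied both priorities. A quick sketch using a stand-in process — the same two checks work against the real `clamscan` PID once a scan is running:

```shell
# Start a throttled stand-in the same way the cron entry launches clamscan.
nice -n 19 ionice -c 3 sleep 30 &
pid=$!

ps -o pid,ni,comm -p "$pid"   # NI column should read 19
ionice -p "$pid"              # should report the idle scheduling class

kill "$pid"
```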
## Diagnosing a Runaway Scan
If CPU is already pegged, identify and kill the process:
```bash
ps aux --sort=-%cpu | head -15
# Look for clamscan
kill <PID>
```
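Killing the scan throws away its progress. If the box is still responsive enough, the running process can instead be throttled in place — `renice` and `ionice` both accept a PID (a sketch; assumes `pgrep` from procps is available):

```shell
# Throttle an already-running clamscan instead of killing it.
pid=$(pgrep -n clamscan || true)
if [ -n "$pid" ]; then
  renice -n 19 -p "$pid"   # lower the live process's CPU priority
  ionice -c 3 -p "$pid"    # move its disk access to the idle class
else
  echo "no clamscan process found"
fi
```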
## Notes
- `ionice -c 3` (Idle) is only honored by the CFQ/BFQ I/O schedulers (kernel ≥ 2.6.13). On disks using `mq-deadline` or `none` (common defaults for NVMe), `ionice` is effectively a no-op and `nice` does the throttling.
- On multi-core servers, consider also using `cpulimit` for a hard cap: `cpulimit -l 30 -- clamscan ...`
- Always exclude virtual filesystems with `--exclude-dir` (e.g. `--exclude-dir="^/sys"`, and optionally `--exclude-dir="^/proc"`, `--exclude-dir="^/dev"`) — `--exclude` matches file paths, while `--exclude-dir` prunes whole directory trees.
## Related
- [ClamAV Documentation](https://docs.clamav.net/)
- [[02-selfhosting/security/linux-server-hardening-checklist|Linux Server Hardening Checklist]]

View File

@@ -128,7 +128,7 @@ Every time a new article is added, the following **MUST** be updated to maintain
**Updated:** `updated: 2026-03-17` **Updated:** `updated: 2026-03-17`
## Session Update — 2026-03-18 ## Session Update — 2026-03-18 (morning)
**Article count:** 48 (was 47) **Article count:** 48 (was 47)
@@ -136,3 +136,12 @@ Every time a new article is added, the following **MUST** be updated to maintain
- `02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md` — tuning docker_container_unhealthy alarm to prevent flapping during Nextcloud AIO updates - `02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md` — tuning docker_container_unhealthy alarm to prevent flapping during Nextcloud AIO updates
**Updated:** `updated: 2026-03-18` **Updated:** `updated: 2026-03-18`
## Session Update — 2026-03-18 (afternoon)
**Article count:** 49 (was 48)
**New articles added:**
- `02-selfhosting/monitoring/netdata-new-server-setup.md` — full Netdata deployment guide: install via kickstart.sh, email notification config, Netdata Cloud claim
**Updated:** `updated: 2026-03-18`

View File

@@ -3,14 +3,14 @@
> A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin. > A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin.
> >
**Last updated:** 2026-03-18 **Last updated:** 2026-03-18
**Article count:** 48 **Article count:** 49
## Domains ## Domains
| Domain | Folder | Articles | | Domain | Folder | Articles |
|---|---|---| |---|---|---|
| 🐧 Linux & Sysadmin | `01-linux/` | 11 | | 🐧 Linux & Sysadmin | `01-linux/` | 11 |
| 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 10 | | 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 11 |
| 🔓 Open Source Tools | `03-opensource/` | 9 | | 🔓 Open Source Tools | `03-opensource/` | 9 |
| 🎙️ Streaming & Podcasting | `04-streaming/` | 2 | | 🎙️ Streaming & Podcasting | `04-streaming/` | 2 |
| 🔧 General Troubleshooting | `05-troubleshooting/` | 16 | | 🔧 General Troubleshooting | `05-troubleshooting/` | 16 |
@@ -65,6 +65,7 @@
### Monitoring ### Monitoring
- [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md) — tuning web_log_1m_redirects threshold for HTTPS-forcing servers - [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md) — tuning web_log_1m_redirects threshold for HTTPS-forcing servers
- [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) — preventing false alerts during nightly Nextcloud AIO container update cycles - [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) — preventing false alerts during nightly Nextcloud AIO container update cycles
- [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) — install, email notifications, and Netdata Cloud claim for Ubuntu/Debian servers
### Security ### Security
- [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban - [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban
@@ -129,6 +130,7 @@
| Date | Article | Domain | | Date | Article | Domain |
|---|---|---| |---|---|---|
| 2026-03-18 | [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) | Self-Hosting |
| 2026-03-18 | [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) | Self-Hosting | | 2026-03-18 | [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) | Self-Hosting |
| 2026-03-17 | [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) | Troubleshooting | | 2026-03-17 | [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) | Troubleshooting |
| 2026-03-17 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting | | 2026-03-17 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting |

View File

@@ -15,11 +15,13 @@
* [Self-Hosting Starter Guide](02-selfhosting/docker/self-hosting-starter-guide.md) * [Self-Hosting Starter Guide](02-selfhosting/docker/self-hosting-starter-guide.md)
* [Docker vs VMs for the Homelab](02-selfhosting/docker/docker-vs-vms-homelab.md) * [Docker vs VMs for the Homelab](02-selfhosting/docker/docker-vs-vms-homelab.md)
* [Debugging Broken Docker Containers](02-selfhosting/docker/debugging-broken-docker-containers.md) * [Debugging Broken Docker Containers](02-selfhosting/docker/debugging-broken-docker-containers.md)
* [Docker Healthchecks](02-selfhosting/docker/docker-healthchecks.md)
* [Setting Up Caddy as a Reverse Proxy](02-selfhosting/reverse-proxy/setting-up-caddy-reverse-proxy.md) * [Setting Up Caddy as a Reverse Proxy](02-selfhosting/reverse-proxy/setting-up-caddy-reverse-proxy.md)
* [Tailscale for Homelab Remote Access](02-selfhosting/dns-networking/tailscale-homelab-remote-access.md) * [Tailscale for Homelab Remote Access](02-selfhosting/dns-networking/tailscale-homelab-remote-access.md)
* [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md) * [rsync Backup Patterns](02-selfhosting/storage-backup/rsync-backup-patterns.md)
* [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md) * [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md)
* [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) * [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md)
* [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md)
* [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) * [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md)
* [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md) * [Standardizing unattended-upgrades with Ansible](02-selfhosting/security/ansible-unattended-upgrades-fleet.md)
* [Open Source & Alternatives](03-opensource/index.md) * [Open Source & Alternatives](03-opensource/index.md)
@@ -39,6 +41,7 @@
* [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md) * [Apache Outage: Fail2ban Self-Ban + Missing iptables Rules](05-troubleshooting/networking/fail2ban-self-ban-apache-outage.md)
* [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](05-troubleshooting/networking/fail2ban-imap-self-ban-mail-client.md) * [Mail Client Stops Receiving: Fail2ban IMAP Self-Ban](05-troubleshooting/networking/fail2ban-imap-self-ban-mail-client.md)
* [firewalld: Mail Ports Wiped After Reload](05-troubleshooting/networking/firewalld-mail-ports-reset.md) * [firewalld: Mail Ports Wiped After Reload](05-troubleshooting/networking/firewalld-mail-ports-reset.md)
* [Tailscale SSH: Unexpected Re-Authentication Prompt](05-troubleshooting/networking/tailscale-ssh-reauth-prompt.md)
* [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md) * [Docker & Caddy Recovery After Reboot (Fedora + SELinux)](05-troubleshooting/docker-caddy-selinux-post-reboot-recovery.md)
* [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md) * [ISP SNI Filtering with Caddy](05-troubleshooting/isp-sni-filtering-caddy.md)
* [Obsidian Vault Recovery — Loading Cache Hang](05-troubleshooting/obsidian-cache-hang-recovery.md) * [Obsidian Vault Recovery — Loading Cache Hang](05-troubleshooting/obsidian-cache-hang-recovery.md)
@@ -51,3 +54,6 @@
* [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md) * [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md)
* [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) * [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md)
* [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) * [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md)
* [ClamAV CPU Spike: Safe Scheduling with nice/ionice](05-troubleshooting/security/clamscan-cpu-spike-nice-ionice.md)
* [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md)

View File

@@ -2,18 +2,20 @@
> A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin. > A growing reference of Linux, self-hosting, open source, streaming, and troubleshooting guides. Written by MajorLinux. Used by MajorTwin.
> >
> **Last updated:** 2026-03-18 > **Last updated:** 2026-03-23
> **Article count:** 48 > **Article count:** 50
## Domains ## Domains
| Domain | Folder | Articles | | Domain | Folder | Articles |
|---|---|---| |---|---|---|
| 🐧 Linux & Sysadmin | `01-linux/` | 11 | | 🐧 Linux & Sysadmin | `01-linux/` | 11 |
| 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 10 | | 🏠 Self-Hosting & Homelab | `02-selfhosting/` | 11 |
| 🔓 Open Source Tools | `03-opensource/` | 9 | | 🔓 Open Source Tools | `03-opensource/` | 9 |
| 🎙️ Streaming & Podcasting | `04-streaming/` | 2 | | 🎙️ Streaming & Podcasting | `04-streaming/` | 2 |
| 🔧 General Troubleshooting | `05-troubleshooting/` | 16 | | 🔧 General Troubleshooting | `05-troubleshooting/` | 17 |
--- ---
@@ -65,6 +67,7 @@
### Monitoring ### Monitoring
- [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md) — tuning web_log_1m_redirects threshold for HTTPS-forcing servers - [Tuning Netdata Web Log Alerts](02-selfhosting/monitoring/tuning-netdata-web-log-alerts.md) — tuning web_log_1m_redirects threshold for HTTPS-forcing servers
- [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) — preventing false alerts during nightly Nextcloud AIO container update cycles - [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) — preventing false alerts during nightly Nextcloud AIO container update cycles
- [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) — install, email notifications, and Netdata Cloud claim for Ubuntu/Debian servers
### Security ### Security
- [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban - [Linux Server Hardening Checklist](02-selfhosting/security/linux-server-hardening-checklist.md) — non-root user, SSH key auth, sshd_config, firewall, fail2ban
@@ -122,6 +125,8 @@
- [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md) — diagnosing and recovering a failed mdadm array caused by a USB hub dropout - [mdadm RAID Recovery After USB Hub Disconnect](05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md) — diagnosing and recovering a failed mdadm array caused by a USB hub dropout
- [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) — fixing sshd not running after reboot due to Manual startup type - [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) — fixing sshd not running after reboot due to Manual startup type
- [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) — keeping Ollama reachable over Tailscale by disabling macOS sleep on AC power - [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) — keeping Ollama reachable over Tailscale by disabling macOS sleep on AC power
- [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md) — fixing the missing vault_pass file error when running ansible-playbook
--- ---
@@ -129,6 +134,8 @@
| Date | Article | Domain | | Date | Article | Domain |
|---|---|---| |---|---|---|
| 2026-03-23 | [Ansible: Vault Password File Not Found](05-troubleshooting/ansible-vault-password-file-missing.md) | Troubleshooting |
| 2026-03-18 | [Deploying Netdata to a New Server](02-selfhosting/monitoring/netdata-new-server-setup.md) | Self-Hosting |
| 2026-03-18 | [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) | Self-Hosting | | 2026-03-18 | [Tuning Netdata Docker Health Alarms](02-selfhosting/monitoring/netdata-docker-health-alarm-tuning.md) | Self-Hosting |
| 2026-03-17 | [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) | Troubleshooting | | 2026-03-17 | [Ollama Drops Off Tailscale When Mac Sleeps](05-troubleshooting/ollama-macos-sleep-tailscale-disconnect.md) | Troubleshooting |
| 2026-03-17 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting | | 2026-03-17 | [Windows OpenSSH Server (sshd) Stops After Reboot](05-troubleshooting/networking/windows-sshd-stops-after-reboot.md) | Troubleshooting |