wiki: ssh.socket wait-ready gate + mastodon post-install hardening
Two related additions covering the 2026-05-31 cutover-night incidents on majorlinux and majortoot-hetzner.

ssh-socket-tailscale-race-condition.md (update Race 1 fix):
- After=tailscaled.service Requires=tailscaled.service orders against the service becoming active, not against tailscale0 having an IPv4 — hosts kept losing SSH intermittently after reboots (incident: majorlinux + majortoot-hetzner 2026-05-31, during cutover-night Ansible reboot).
- Canonical fix: a oneshot tailscale-wait-ready.service that polls `ip -4 -o addr show tailscale0` until an address is present, with ssh.socket After=/Requires= that service. Document the full evolution (2026-05-19 BindsTo → 2026-05-23 Requires → 2026-05-31 wait-ready) so future readers don't try the half-fixes thinking they're sufficient.
- Add majortoot-hetzner to affected hosts.

mastodon-post-install-hardening.md (new):
Four upstream-install gaps that bit during the majortoot-hetzner cutover:
1. /home/mastodon at 0750 (useradd default) → nginx www-data can't traverse → every static asset 403s → unstyled "purple screen" in the browser while API/HTML still work through the puma proxy.
2. .env.production at 0644 (mastodon-setup default) → DB_PASS, SECRET_KEY_BASE, OTP_SECRET world-readable once gap (1) is fixed.
3. mastodon user shell at /usr/sbin/nologin → `su - mastodon` blocked.
4. rbenv init in .bashrc only → login shells don't source .bashrc; even when chained, Ubuntu's .bashrc returns early for non-interactive shells. Fix: .bash_profile sets up rbenv BEFORE sourcing .profile + .bashrc, so it works for both interactive and non-interactive logins.
All four codified in MajorAnsible configure_mastodon_permissions.yml with self-asserting verification steps.

02-selfhosting/index.md + SUMMARY.md:
Add a "Services" section to the selfhosting index linking the mastodon-post-install-hardening article (and the other orphaned services/ entries while there). SUMMARY.md gains one new entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 73c10111e0
commit 155651c373
4 changed files with 222 additions and 11 deletions
@@ -1,6 +1,6 @@
 ---
 created: 2026-04-13T10:15
-updated: 2026-04-30T05:21
+updated: 2026-05-31
 ---
 # 🏠 Self-Hosting & Homelab
 
@@ -30,6 +30,17 @@ Guides for running your own services at home, including Docker, reverse proxies,
 - [Tuning Netdata Docker Health Alarms](monitoring/netdata-docker-health-alarm-tuning.md)
 - [Deploying Netdata to a New Server](monitoring/netdata-new-server-setup.md)
 
+## Services
+
+- [Mastodon Instance Tuning](services/mastodon-instance-tuning.md)
+- [Mastodon Post-Install Hardening (Permissions + Account)](services/mastodon-post-install-hardening.md)
+- [Mastodon DB Maintenance](services/mastodon-db-maintenance.md)
+- [Mastodon Federation](services/mastodon-federation.md)
+- [Mastodon `--prune-profiles` Trap](services/mastodon-prune-profiles-trap.md)
+- [Ghost SMTP via Mailgun](services/ghost-smtp-mailgun-setup.md)
+- [Updating n8n Docker](services/updating-n8n-docker.md)
+- [Claude Code Remote Control](services/claude-code-remote-control.md)
+
 ## Security
 
 - [Linux Server Hardening Checklist](security/linux-server-hardening-checklist.md)

174
02-selfhosting/services/mastodon-post-install-hardening.md
Normal file
@@ -0,0 +1,174 @@
---
title: Mastodon Post-Install Hardening (Permissions + Account)
domain: selfhosting
category: services
tags:
- mastodon
- fediverse
- self-hosting
- hardening
- ansible
- nginx
- rbenv
status: published
created: 2026-05-31
updated: 2026-05-31
---

# Mastodon Post-Install Hardening (Permissions + Account)

Four gaps that the upstream Mastodon install guide doesn't lock down — each silently breaks something or leaves a credential exposed. Found on majortoot-hetzner during its 2026-05-31 cutover; codified in MajorAnsible's `configure_mastodon_permissions.yml`.

---

## Gap 1: `/home/mastodon` is `0750` — nginx 403s every asset

### Symptom

Browser loads `https://<your-instance>/` and shows an unstyled **purple background with no content** (Mastodon's React entry HTML loaded, but every JS / CSS / manifest request 403'd). API endpoints like `/api/v1/instance` still return 200 because they fall through nginx's `try_files` to the puma proxy — but static assets need direct filesystem access.

### Cause

Recent Ubuntu releases create `/home/<user>` as `0750` (owner+group only); for `useradd` the mode comes from `HOME_MODE` in `/etc/login.defs`, not from the umask. nginx runs as `www-data`, which is neither the owner nor a member of the group, so it cannot **traverse** into `/home/mastodon/live/public/` to serve `packs/assets/*.js`, `manifest.json`, etc. The errors land in `/var/log/nginx/error.log`:

```
[crit] stat() "/home/mastodon/live/public/packs/assets/foo.js" failed (13: Permission denied)
```

### Fix

```bash
chmod 0751 /home/mastodon
```

`0751` gives `other` execute (traversal) only, **not read** — files inside that aren't world-readable stay private. Take the opportunity to lock `.env.production` in the next gap.
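
To make the traversal rule concrete, here is a minimal sketch of a check that walks up from an asset path and prints every ancestor directory whose "other" execute bit is missing. The function name `find_blocked_dirs` is hypothetical, and the check is a simplification: it assumes the web-server user is neither the owner nor in the group of any directory on the path, and it relies on GNU `stat -c`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list ancestor directories of a file that lack the
# "other" execute bit, i.e. directories that a user such as www-data cannot
# traverse when it is neither the owner nor in the group.
find_blocked_dirs() {
    local dir mode
    dir=$(dirname -- "$1")
    while :; do
        mode=$(stat -c '%a' -- "$dir") || return 2
        # Bit 1 of the octal mode is execute (traverse) for "other".
        if (( (8#$mode & 1) == 0 )); then
            printf '%s %s\n' "$mode" "$dir"
        fi
        # Stop at the filesystem root (or "." for relative paths).
        if [ "$dir" = "/" ] || [ "$dir" = "." ]; then
            break
        fi
        dir=$(dirname -- "$dir")
    done
}
```

Calling `find_blocked_dirs /home/mastodon/live/public/packs/assets/foo.js` on an affected host should print a line for `/home/mastodon` while it is still `0750`, and nothing for it after the `chmod 0751`. `namei -l <path>` from util-linux shows the same per-component permissions if you prefer an existing tool.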

---

## Gap 2: `.env.production` is `0644` — DB_PASS and SECRET_KEY_BASE are world-readable

### Symptom

Once Gap 1 is fixed and `/home/mastodon` is traversable, any local user (and any compromised process running as nginx, sidekiq under reduced privileges, a container escape, etc.) can `cat /home/mastodon/live/.env.production` and read every Mastodon secret.

### Cause

The `mastodon-setup` interactive wizard writes `.env.production` with default `0644` permissions. The file contains:

- `DB_PASS` — PostgreSQL password
- `SECRET_KEY_BASE` — session cookie signing key
- `OTP_SECRET` — 2FA encryption key
- SMTP credentials
- S3 / object-storage credentials if configured

### Fix

```bash
chmod 0600 /home/mastodon/live/.env.production
chown mastodon:mastodon /home/mastodon/live/.env.production
```

No service restart needed — Rails reads `.env.production` at process boot, not per-request. Existing `puma`, `sidekiq`, and `streaming` services keep running.
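
As a quick way to confirm the result, the sketch below tests whether a file grants anything to group or other. The function name `is_owner_only` is hypothetical and not part of the playbook; it assumes GNU `stat -c`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file grants no permission bits
# to group or other (for example 0600 or 0400).
is_owner_only() {
    local mode
    mode=$(stat -c '%a' -- "$1") || return 2
    # 8#077 masks the group and other permission digits.
    (( (8#$mode & 8#077) == 0 ))
}
```

For example, `is_owner_only /home/mastodon/live/.env.production || echo "secrets file is too open"` would flag a file that is still `0644`.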

---

## Gap 3: `mastodon` user shell is `/usr/sbin/nologin` — `su - mastodon` fails

### Symptom

```
root@majortoot:~# su - mastodon
This account is currently not available.
```

Blocks all `tootctl` and Rails console admin via SSH.

### Cause

If the user was created with `useradd --system mastodon`, the system-account default is shell `/usr/sbin/nologin`. Mastodon's own installer typically sets `/bin/bash` but a manual / Ansible / Packer build path may have used `--system`.

### Fix

```bash
usermod -s /bin/bash mastodon
```

Verify with `getent passwd mastodon | cut -d: -f7` → `/bin/bash`.

---

## Gap 4: Login shells don't load rbenv — `tootctl` reports "ruby: command not found"

### Symptom

After fixing Gap 3, `su - mastodon` succeeds, but:

```
mastodon@majortoot:~$ which ruby
(no output, exit 1)
mastodon@majortoot:~$ cd /home/mastodon/live && bin/tootctl version
/usr/bin/env: 'ruby': No such file or directory
```

### Cause

A typical Mastodon install puts rbenv init in `~/.bashrc`. But bash **login** shells (which `su -` and `ssh user@host` open) read only the first of `.bash_profile`, `.bash_login`, or `.profile` that exists, and never read `.bashrc` on their own. If `.bash_profile` doesn't exist and `.profile` doesn't init rbenv, the login shell never gets rbenv on PATH.

Even when `.bash_profile` chains `.bashrc`, Ubuntu's default `.bashrc` has a guard at the top:

```bash
case $- in
    *i*) ;;
      *) return;;
esac
```

This **returns early for non-interactive shells**, which is exactly what `su - mastodon -c "<command>"` opens — so the rbenv init lines later in `.bashrc` are never reached.
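
The effect of the guard can be reproduced without touching any real dotfiles. The sketch below writes a stand-in rc file (a temporary file, not the actual `.bashrc`) that begins with the same `case $-` guard followed by a marker line, then sources it from a non-interactive and an interactive bash. The marker text and variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Write a stand-in rc file that starts with the same guard as Ubuntu's .bashrc.
rc=$(mktemp)
cat >"$rc" <<'EOF'
case $- in
    *i*) ;;
      *) return;;
esac
echo "rbenv init would run here"
EOF

# Non-interactive shell: $- has no "i", so the guard returns before the echo.
noninteractive=$(bash --norc -c ". '$rc'")

# Interactive shell (-i): $- contains "i", so execution continues past the guard.
# stderr is discarded because bash may warn about job control without a terminal.
interactive=$(bash --norc -i -c ". '$rc'" 2>/dev/null)

printf 'non-interactive: [%s]\ninteractive:     [%s]\n' "$noninteractive" "$interactive"
rm -f "$rc"
```

The non-interactive case is the one that matters for `su - mastodon -c "<command>"`, which is why the rbenv setup has to happen before `.bashrc` is sourced.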

### Fix

Drop a `.bash_profile` that sets up rbenv **before** sourcing `.bashrc`, so it works for both interactive and non-interactive login shells:

```bash
# /home/mastodon/.bash_profile (mode 0644, owned by mastodon:mastodon)
export PATH="$HOME/.rbenv/bin:$HOME/.rbenv/shims:$PATH"
if command -v rbenv >/dev/null 2>&1; then
  eval "$(rbenv init -)"
fi

# Then load POSIX login env + bash interactive config
[ -f ~/.profile ] && . ~/.profile
[ -f ~/.bashrc ] && . ~/.bashrc
```

Verify:

```bash
su - mastodon -c "ruby -v" # → ruby 3.x.x …
su - mastodon -c "cd /home/mastodon/live && RAILS_ENV=production bin/tootctl version"
```

---

## Codified

All four gaps are handled by `configure_mastodon_permissions.yml` in MajorAnsible. The playbook is idempotent, requires no service restart, and includes self-asserting verification steps:

| Assertion | What it catches |
|---|---|
| `sudo -u www-data stat /home/mastodon/live/public/packs` must succeed | Gap 1 regression |
| `sudo -u www-data cat .env.production` must fail | Gap 2 regression |
| `su - mastodon -c "ruby -v"` must succeed and output "ruby" | Gap 3 or 4 regression |

Apply to all Mastodon hosts:

```bash
ansible-playbook configure_mastodon_permissions.yml
```

## References

- [[majortoot#2026-05-31 — ssh.socket race post-reboot on majortoot-hetzner (during cutover night)]]
- [[majortoot#tootctl CLI Note]]
- MajorAnsible: `configure_mastodon_permissions.yml`
- Related: [[mastodon-instance-tuning|Mastodon Instance Tuning]] · [[mastodon-db-maintenance|Mastodon DB Maintenance]]

@@ -27,38 +27,61 @@ journalctl -b -1 -u ssh # likely empty — sshd never spawned
 journalctl -b -1 -u ssh.socket # socket started before tailscaled
 ```
 
-### Fix
+### Fix (current — 2026-05-31)
 
-Add Tailscale dependency to the socket override:
+`After=tailscaled.service` orders against the service becoming `active` — **not** against the `tailscale0` interface actually having an IPv4 address. tailscaled flips to active within a second of starting, but the kernel doesn't have the address bound to the interface until DERP relays connect and the control plane confirms the node. ssh.socket attempting `ListenStream=<TS IP>:22` in that window fails with `Cannot assign requested address`, the socket goes into a failed state, and there is no automatic retry.
+
+The proper gate is a dedicated readiness service that **waits for the tailscale0 IPv4 address to exist** before letting ssh.socket bind:
+
+```ini
+# /etc/systemd/system/tailscale-wait-ready.service
+[Unit]
+Description=Wait until tailscale0 has an IPv4 address
+After=tailscaled.service
+Requires=tailscaled.service
+ConditionPathExists=/usr/sbin/ip
+
+[Service]
+Type=oneshot
+RemainAfterExit=yes
+TimeoutStartSec=120
+ExecStart=/usr/bin/bash -c 'for i in $(seq 1 120); do ip -4 -o addr show tailscale0 2>/dev/null | grep -q "inet " && exit 0; sleep 1; done; exit 1'
+
+[Install]
+WantedBy=multi-user.target
+```
 
 ```ini
 # /etc/systemd/system/ssh.socket.d/override.conf
 [Unit]
-After=tailscaled.service
-Requires=tailscaled.service
+After=tailscale-wait-ready.service
+Requires=tailscale-wait-ready.service
 
 [Socket]
 ListenStream=
 ListenStream=<TAILSCALE_IP>:22
 ```
 
-Then reload and restart:
+Reload + restart:
 
 ```bash
 systemctl daemon-reload
+systemctl enable tailscale-wait-ready.service
 systemctl restart ssh.socket
-systemctl status ssh.socket # verify Listen: shows correct IP
+ss -tlnp | grep :22 # verify bound to Tailscale IP
 ```
 
-- `After=` ensures the socket waits for Tailscale to start
-- `Requires=` ensures tailscaled must be running for the socket to activate
+!!! note "Evolution of this fix"
+    - **2026-05-19 v1** — `After=tailscaled.service` + `BindsTo=tailscaled.service`. Worked initially but caused a shutdown-time ordering cycle.
+    - **2026-05-23 v2** — `BindsTo` swapped for `Requires` to break the cycle. Fixed the cycle but did **not** wait for `tailscale0` to actually have an IP — just for `tailscaled` to be active. Hosts continued losing SSH after some reboots (intermittent, depending on whether the race won).
+    - **2026-05-31 v3** — Added `tailscale-wait-ready.service` to gate ssh.socket on the interface having an address. This is the current canonical fix.
 
 !!! warning "Do NOT use BindsTo"
-    `BindsTo=tailscaled.service` creates a **systemd ordering cycle** during shutdown: `basic.target → sockets.target → ssh.socket → tailscaled.service → basic.target`. Systemd breaks the cycle by deleting jobs unpredictably, which can prevent `ssh.socket` from starting on the next boot — leaving SSH dead until manual intervention. This was discovered on 2026-05-23 after the original fix (2026-05-19) used `BindsTo` and caused a second outage on dcaprod-hetzner. `Requires` provides the startup dependency without the dangerous bidirectional lifecycle coupling.
+    `BindsTo=tailscaled.service` creates a **systemd ordering cycle** during shutdown: `basic.target → sockets.target → ssh.socket → tailscaled.service → basic.target`. Systemd breaks the cycle by deleting jobs unpredictably, which can prevent `ssh.socket` from starting on the next boot. Use `Requires=` for startup ordering without the bidirectional lifecycle coupling.
 
 ### Affected Hosts
 
-Ubuntu hosts using `configure_tailscale_ssh_only.yml`: majorlinux, dcaprod-hetzner, tttpod-hetzner. Fedora hosts (majordiscord) use firewall rules for SSH restriction — not affected by this race.
+Ubuntu hosts using `configure_tailscale_ssh_only.yml`: majorlinux, dcaprod-hetzner, tttpod-hetzner, majortoot-hetzner. Fedora hosts (majordiscord) use firewall rules for SSH restriction — not affected by this race.
 
---
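
The readiness gate in the hunk above packs its polling loop into a single `ExecStart=` line. As a more readable sketch of the same idea (not the script the commit ships), the loop can be split into a probe and a generic poll helper. The function names `has_ipv4` and `wait_until` are hypothetical; the probe reuses the `ip -4 -o addr show` check from the unit file.

```bash
#!/usr/bin/env bash
# Hypothetical probe: succeed if the interface has at least one IPv4 address.
has_ipv4() {
    ip -4 -o addr show "$1" 2>/dev/null | grep -q 'inet '
}

# Hypothetical poll helper: run a command once per second until it succeeds
# or the timeout (in seconds) has elapsed. Returns 1 on timeout.
wait_until() {
    local timeout=$1
    shift
    local deadline=$(( SECONDS + timeout ))
    until "$@"; do
        (( SECONDS >= deadline )) && return 1
        sleep 1
    done
}

# Example (commented out so that sourcing this file does not block):
# wait_until 120 has_ipv4 tailscale0
```

Using a deadline rather than a fixed iteration count keeps the total wait close to the requested timeout even when each probe takes noticeable time, which makes it easier to keep the loop's own limit below the unit's `TimeoutStartSec=`.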
 
@@ -120,4 +143,6 @@ All hosts where Tailscale is the primary access path. Particularly impactful on
 - [[majordiscord#2026-05-19 — Tailscale boot race: unreachable after Ansible reboot]]
 - [[majorlinux#2026-05-19 — ssh.socket override patched: added Tailscale dependency]]
 - [[dcaprod#2026-05-23 — SSH unreachable again: BindsTo ordering cycle in ssh.socket override]]
+- [[majorlinux#2026-05-31 — ssh.socket race recurrence post-reboot (Requires= insufficient; added wait-ready gate)]]
+- [[majortoot#2026-05-31 — ssh.socket race post-reboot on majortoot-hetzner (during cutover night)]]
 - Ansible: `configure_tailscale_ssh_only.yml`, `configure_tailscale_network_wait.yml`

@@ -38,6 +38,7 @@ updated: 2026-05-15T09:00
 * [Logwatch Fleet Setup — Surviving Package Upgrades](02-selfhosting/monitoring/logwatch-fleet-setup.md)
 * [Updating n8n Running in Docker](02-selfhosting/services/updating-n8n-docker.md)
 * [Mastodon Instance Tuning](02-selfhosting/services/mastodon-instance-tuning.md)
+* [Mastodon Post-Install Hardening (Permissions + Account)](02-selfhosting/services/mastodon-post-install-hardening.md)
 * [Mastodon — The `--prune-profiles` Trap and How to Recover](02-selfhosting/services/mastodon-prune-profiles-trap.md)
 * [Ghost Email Configuration with Mailgun](02-selfhosting/services/ghost-smtp-mailgun-setup.md)
 * [Claude Code Remote Control — Mobile Access to a Persistent Host Session](02-selfhosting/services/claude-code-remote-control.md)