majorwiki/02-selfhosting/services/mastodon-db-maintenance.md
majorlinux 91455fac39 Add 7 articles; update nav and existing articles (2026-04-25)
New articles:
- pihole-doh-dot-bypass-defense
- pihole-v6-adlist-management
- mastodon-db-maintenance
- mastodon-federation
- fantastical-google-phantom-calendar-syncselect
- rsync-tailscale-teardown-stall
- ollama-chat-template-pipe-stdin-bypass

Updated: wsl2-backup, wsl2-rebuild, ssh-config-key-management,
selfhosting index, mastodon-instance-tuning, ansible-check-mode,
windows-openssh, windows-sshd, yt-dlp, README, SUMMARY, index
Removed: fedora-usrmerge-ebtables-blocker (superseded by prior push)
2026-04-25 17:52:48 +00:00

143 lines
4.8 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: Mastodon DB Maintenance — Statuses, Accounts, and VACUUM
domain: selfhosting
category: services
tags:
- mastodon
- database
- postgresql
- maintenance
- tootctl
- majortoot
status: published
created: 2026-04-22
updated: 2026-04-22
---
# Mastodon DB Maintenance
Mastodon aggressively caches remote content — avatars, statuses, follow graphs — from every instance it federates with. On an active instance, this causes substantial PostgreSQL bloat over time. Without periodic maintenance, the database grows unbounded even if S3 handles media.
## The Problem — majortoot at ~3.5 years
| Table | Size | Rows |
|-------|------|------|
| `statuses` | 3.5 GB | 3.6M rows (3.6M remote cached, 37K local) |
| `accounts` | 499 MB | 214,770 remote cached, 18 local |
| `preview_cards` | 837 MB | remote link previews |
| `statuses_tags` | 506 MB | cascades from statuses |
| `conversations` | 436 MB | cascades from statuses |
| `mentions` | 305 MB | cascades from statuses |
The `statuses remove` and `accounts cull` commands address most of this.
---
## Maintenance Tasks
### 1. Cache Clear
Clears in-memory Rails caches. Fast (<5 seconds), safe to run anytime.
```bash
tootctl cache clear
```
### 2. Statuses Remove
Removes cached remote statuses (and their cascaded rows in `statuses_tags`, `mentions`, `conversations`, `status_stats`) older than N days. Does **not** touch local statuses.
```bash
tootctl statuses remove --days=90
```
> [!warning] This is the slowest step
> On a 3.6M-row statuses table, the extraction phase alone can take 2040 minutes. PostgreSQL will be under heavy load. Run during off-peak hours.
**What gets removed:** Remote statuses not pinned, not boosted by local users, and not replied to by local users, older than the threshold.
### 3. Accounts Cull
Contacts each remote account's home instance via WebFinger to check if it still exists. Removes accounts that return 404 or `Gone`. Catches dead instances, deleted accounts, and renamed handles.
```bash
tootctl accounts cull
```
> [!note] Network-bound
> Cull makes HTTP requests to remote servers. It's slower on flaky network conditions and will skip accounts it can't reach (to avoid false deletions).
### 4. VACUUM ANALYZE
After large deletions, PostgreSQL does not immediately return space to the OS dead rows sit in pages marked for reuse. `VACUUM ANALYZE` reclaims that space and updates query planner statistics.
```bash
sudo -u postgres psql mastodon_production -c "VACUUM ANALYZE;"
```
For recovering actual disk space (not just marking pages free), `VACUUM FULL` is more aggressive but locks tables. Stick with plain `VACUUM ANALYZE` for routine maintenance.
---
## The Maintenance Script
**Location:** `/home/mastodon/maintenance.sh`
**Cron:** `0 2 * * 0` Sunday 2 AM (runs before media prune at 3 AM)
**Log:** `/var/log/mastodon/maintenance.log`
**Notifications:** Email to `marcus@majorshouse.com` at each step via Postfix MajorMail
The script runs all four tasks in sequence and sends a notification email:
- **On start** lists steps and current DB size
- **After cache clear** confirms done, warns statuses remove will take a while
- **After statuses remove** summary output + current DB size
- **After accounts cull** accounts removed + current DB size
- **On completion** full timing breakdown and final DB size
### Running Manually
```bash
ssh root@100.110.197.17
bash /home/mastodon/maintenance.sh
```
### Monitoring Progress
```bash
ssh root@100.110.197.17 "tail -f /var/log/mastodon/maintenance.log"
```
### tootctl Wrapper (one-off commands)
The `mastodon` user's rbenv is not on PATH in a login shell. Always use the wrapper:
```bash
su - mastodon -c 'export PATH="/home/mastodon/.rbenv/bin:/home/mastodon/.rbenv/shims:$PATH" && eval "$(rbenv init -)" && cd /home/mastodon/live && RAILS_ENV=production bin/tootctl <command>'
```
---
## Full Cron Schedule on majortoot
| Time | Job | Script |
|------|-----|--------|
| Sun 2 AM | DB maintenance | `/home/mastodon/maintenance.sh` |
| Sun 3 AM | Media prune (S3) | `/home/mastodon/media-prune.sh` |
| Daily 8 AM | Fail2Ban digest | `/usr/local/bin/fail2ban-digest.sh` |
| Monthly | Fail2Ban nginx-botsearch prune | `/usr/local/bin/f2b-prune.sh` |
| Daily | Certbot renewal | `service nginx stop; certbot renew; service nginx start` |
---
## First Run Results (2026-04-22)
First maintenance run ever on majortoot after ~3.5 years of operation. Results pending (job running in background at time of writing). Check `/var/log/mastodon/maintenance.log` for final numbers.
---
## See Also
- [[Mastodon]] service doc (deployment, access, S3 config)
- [[majortoot]] server doc (incident log, specs)
- [[mastodon-federation]] domain blocks, silencing, FediSeer
- [[mastodon-instance-tuning]] character limits, media cache