New article mastodon-s3-acl-upload-failures.md: a BucketOwnerEnforced S3 bucket plus a stale S3_PERMISSION/S3_ACL in .env.production makes every Mastodon upload fail with AccessControlListNotSupported, silently. Covers symptoms (incl. why a missing object returns 403 not 404), diagnosis, the fix (S3_PERMISSION= empty, public read via bucket policy), recovery, a synthetic-write health check, and Ansible enforcement. Extend mastodon-prune-profiles-trap.md: add a "Bulk restore at scale" procedure (list existing keys, null missing DB refs, enqueue RedownloadAvatar/HeaderWorker), a "storage-level deletion without DB de-ref" section, and a stronger recommendation to disable automated profile pruning (and scheduled accounts refresh --all) entirely. Link both from SUMMARY.md and the selfhosting index. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
138 lines
7.1 KiB
Markdown
---
title: Mastodon on S3 — Silent Upload Failures When the Bucket Disables ACLs
description: Why a BucketOwnerEnforced S3 bucket plus a stale S3_PERMISSION/S3_ACL in .env.production makes every Mastodon media upload fail with AccessControlListNotSupported, how to diagnose it, and how to fix and monitor it.
domain: selfhosting
category: services
tags:
  - mastodon
  - fediverse
  - self-hosting
  - aws
  - s3
  - paperclip
  - troubleshooting
status: published
created: 2026-06-01
updated: 2026-06-01
---

# Mastodon on S3 — Silent Upload Failures When the Bucket Disables ACLs

If your Mastodon instance stores media on S3 and you switch the bucket to **Object Ownership = `BucketOwnerEnforced`** (which AWS now recommends, and which the console nudges you toward), every media upload can start failing **silently** unless you also remove the object-ACL setting from `.env.production`. New avatars, headers, and attachments stop appearing; old ones keep working; nothing obvious is logged. This article is the diagnosis and fix.

## TL;DR

- `BucketOwnerEnforced` **disables ACLs entirely** on the bucket. Any request that carries an `x-amz-acl` header is rejected with `AccessControlListNotSupported: The bucket does not allow ACLs`.
- Mastodon (via Paperclip) attaches `x-amz-acl` to every upload **if** `S3_PERMISSION` (or `S3_ACL`) is set in `.env.production`. Both the common value `S3_PERMISSION=public-read` and a migration leftover like `S3_PERMISSION=private` trigger the rejection.
- Result: **every new upload fails**, but the database row is still updated, so Mastodon believes it has the file. The object never lands in S3, so the image renders broken. Objects written *before* the bucket changed keep serving fine, which masks the problem.
- **Fix:** set `S3_PERMISSION=` (empty) and remove any `S3_ACL=` line, then restart `mastodon-web` and `mastodon-sidekiq`. Public read is now served by the **bucket policy**, not per-object ACLs.

## Symptoms

- Newly changed avatars and headers render as broken images; attachments on new posts fail to display.
- Avatars that were cached **before** the bucket setting changed still work, so the failure looks intermittent: "some work, some don't."
- `tootctl` and the web UI report success; Sidekiq doesn't obviously error.
- Direct fetch of a broken object's URL returns **403 AccessDenied** (not 404; see the next section).

## Why a missing object returns 403, not 404

A typical Mastodon S3 bucket policy grants public `s3:GetObject` but **not** `s3:ListBucket`. Without `ListBucket`, S3 hides whether a key exists: a `GET` on a **missing** key returns **403 AccessDenied**, identical to a permissions denial. So "403" here usually means *the object isn't there*, not *the object is forbidden*. This is why the failure reads like a permissions problem when it's really a failed write.

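
The rule above can be written down as a small lookup, which is handy when a monitoring script has to decide what a status code means. This is only a sketch: `interpret_media_status` and its `list_bucket_public` flag are hypothetical names introduced here, and the flag stands for "does the bucket policy grant `s3:ListBucket` to anonymous callers" (typical Mastodon policies do not).

```ruby
# Interprets the HTTP status of an anonymous GET on a media object URL.
# `list_bucket_public` is a hypothetical flag: true only if the bucket policy
# grants s3:ListBucket publicly, in which case S3 reports missing keys as 404.
def interpret_media_status(status, list_bucket_public: false)
  case status
  when 200 then :present
  when 404 then :missing
  when 403 then list_bucket_public ? :forbidden : :missing_or_forbidden
  else :unknown
  end
end

puts interpret_media_status(403) # prints "missing_or_forbidden"
```

With the default policy a 403 stays ambiguous on its own, so the result is only a hint to go and check whether the object exists using the instance's credentials.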
## Diagnosis

Run these with the instance's own S3 credentials (e.g. via `bin/rails runner`, which loads `.env.production`):

```ruby
require "aws-sdk-s3"

c = Aws::S3::Client.new(region: ENV["S3_REGION"],
                        access_key_id: ENV["AWS_ACCESS_KEY_ID"],
                        secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"])
b = ENV["S3_BUCKET"]

# 1. Is the bucket ACL-disabled?
puts c.get_bucket_ownership_controls(bucket: b).ownership_controls.rules.map(&:object_ownership).inspect
# => ["BucketOwnerEnforced"]   <-- ACLs are OFF

# 2. Does an upload WITH an ACL fail, and WITHOUT one succeed?
begin
  c.put_object(bucket: b, key: "tmp/acltest", body: "x", acl: "public-read")
rescue => e
  puts "PUT+acl FAILS: #{e.class} / #{e.message}"   # AccessControlListNotSupported
else
  puts "PUT+acl: OK"
  c.delete_object(bucket: b, key: "tmp/acltest")    # clean up if ACLs are still allowed
end
c.put_object(bucket: b, key: "tmp/noacltest", body: "x")   # succeeds
c.delete_object(bucket: b, key: "tmp/noacltest")

# 3. Confirm a "broken" avatar's object is actually missing
key = Account.find_by(username: "someuser", domain: "remote.tld").avatar.path.sub(%r{^/}, "")
begin
  c.head_object(bucket: b, key: key)
  puts "EXISTS"
rescue Aws::S3::Errors::NotFound
  puts "MISSING"
rescue Aws::S3::Errors::Forbidden
  # HEAD returns 403 instead of 404 when the credentials lack s3:ListBucket
  puts "MISSING or denied (403)"
end
```

If #1 shows `BucketOwnerEnforced` and #2 shows the ACL'd PUT failing while the plain PUT succeeds, you have confirmed the cause.

Check `.env.production` for the offending settings:

```bash
grep -E '^S3_(ACL|PERMISSION|NO_INHERIT)' /home/mastodon/live/.env.production
# S3_ACL=private          <-- remove
# S3_PERMISSION=private   <-- set empty
```

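
The same check can be scripted, for example as part of a deploy sanity test. The sketch below is illustrative only: `offending_s3_settings` is a hypothetical helper, it inspects just `S3_PERMISSION` and `S3_ACL`, and it assumes plain `KEY=value` lines with optional quotes.

```ruby
ACL_KEYS = %w[S3_PERMISSION S3_ACL].freeze

# Returns [[line_number, key, value], ...] for ACL-related keys that carry a
# non-empty value. Commented-out lines and empty assignments are ignored.
def offending_s3_settings(env_text)
  env_text.each_line.with_index(1).filter_map do |line, number|
    match = line.match(/\A\s*(?:export\s+)?([A-Z0-9_]+)\s*=\s*(.*?)\s*\z/)
    next unless match && ACL_KEYS.include?(match[1])

    value = match[2].gsub(/\A["']|["']\z/, "") # strip one pair of quotes
    [number, match[1], value] unless value.empty?
  end
end

sample = <<~ENV
  S3_BUCKET=media
  S3_PERMISSION=private
  # S3_ACL=public-read
  S3_ACL=
ENV

offending_s3_settings(sample).each do |number, key, value|
  puts "line #{number}: #{key}=#{value} will add an x-amz-acl header"
end
```

To run it against a real host, pass `File.read("/home/mastodon/live/.env.production")` instead of the sample string; an empty result means no ACL value is configured.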
## The fix

1. Edit `.env.production`:
   - `S3_PERMISSION=` (empty, so Paperclip sends no `x-amz-acl` header). Keep the key present rather than deleting the line: Mastodon's Paperclip initializer falls back to `public-read` when the variable is unset.
   - remove or comment out any `S3_ACL=` line
2. Restart so the env is reloaded: `systemctl restart mastodon-sidekiq mastodon-web`
3. Verify the previously failing write path now works: reprocess an existing avatar and confirm it serves 200:

```ruby
a = Account.local.where.not(avatar_file_name: nil).first   # a local account that has an avatar
a.avatar.reprocess!   # used to raise AccessControlListNotSupported; now succeeds
```

Public readability is now provided by the **bucket policy** (grant `s3:GetObject` on `arn:aws:s3:::your-bucket/*` to `Principal: "*"`). **Block Public Access** must permit public bucket policies, meaning `BlockPublicPolicy` and `RestrictPublicBuckets` are off at both the account and the bucket level; the ACL-related settings are irrelevant once ACLs are disabled. You do **not** need per-object ACLs at all.

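
For reference, a policy of that shape can be generated with a few lines of Ruby. This is a minimal sketch rather than the policy from any particular deployment: `public_read_policy` is a hypothetical helper, the `Sid` is arbitrary, and `your-bucket` is the same placeholder used above.

```ruby
require "json"

# Builds a bucket policy that lets anyone read objects, and nothing else.
# It deliberately grants s3:GetObject only (no s3:ListBucket).
def public_read_policy(bucket)
  {
    "Version" => "2012-10-17",
    "Statement" => [
      {
        "Sid" => "PublicReadGetObject",
        "Effect" => "Allow",
        "Principal" => "*",
        "Action" => "s3:GetObject",
        "Resource" => "arn:aws:s3:::#{bucket}/*"
      }
    ]
  }
end

puts JSON.pretty_generate(public_read_policy("your-bucket"))
```

The printed JSON is what you would paste into the bucket's policy editor or pass to your infrastructure tooling.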
### Recovering the avatars that broke while it was failing

Any media that failed to upload during the broken window never reached S3, yet the DB still references it. Because Mastodon's redownload workers **skip accounts whose `*_file_name` is already set**, you must null the dead reference first, then enqueue the worker. See [Mastodon — The `--prune-profiles` Trap and How to Recover](mastodon-prune-profiles-trap.md#bulk-restore-at-scale) for the bulk procedure.

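
The first half of that recovery is a set difference: the keys the database references, minus the keys that actually exist in the bucket. The sketch below shows only that comparison and is not the linked procedure. `MediaRef`, `dead_references`, and the sample keys are made up for illustration; listing the bucket and nulling the columns are left to the bulk-restore article.

```ruby
require "set"

# Hypothetical record: one media reference held by the database.
MediaRef = Struct.new(:account_id, :column, :key)

# Returns the references whose object key is absent from the bucket listing.
def dead_references(refs, existing_keys)
  present = existing_keys.to_set
  refs.reject { |ref| present.include?(ref.key) }
end

refs = [
  MediaRef.new(1, :avatar, "accounts/avatars/1/original/a.png"),
  MediaRef.new(2, :avatar, "accounts/avatars/2/original/b.png"),
  MediaRef.new(2, :header, "accounts/headers/2/original/c.png")
]
in_bucket = ["accounts/avatars/1/original/a.png", "accounts/headers/2/original/c.png"]

dead_references(refs, in_bucket).each do |ref|
  puts "account #{ref.account_id}: #{ref.column} is missing (#{ref.key})"
end
```

Only the references returned here need their `*_file_name` nulled before the redownload workers are enqueued; everything else is still backed by a real object.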
## Don't let it happen silently again — monitor uploads

The worst part of this bug is the silence. Add a periodic **synthetic write check** that uploads a tiny object with the app's own credentials, confirms it, deletes it, and alerts on failure:

```ruby
require "aws-sdk-s3"
s3 = Aws::S3::Client.new(region: ENV["S3_REGION"])   # SDK reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the env
b  = ENV["S3_BUCKET"]
s3.put_object(bucket: b, key: "health/upload-check", body: "ok")   # no acl
s3.head_object(bucket: b, key: "health/upload-check")
s3.delete_object(bucket: b, key: "health/upload-check")
# any exception -> email an alert
```

Pair it with an HTTP check that your **local** account avatars all return 200 (they always should). Run both every few hours from cron. A regression then pages you within hours instead of being discovered by a user weeks later.

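
One way to make the write check easy to alert on is to wrap it in a function that returns a result instead of raising. The following is a sketch under assumptions: `upload_check` and `CHECK_KEY` are hypothetical names, and `client` is expected to respond to `put_object`, `head_object`, and `delete_object` the way `Aws::S3::Client` does.

```ruby
CHECK_KEY = "health/upload-check"

# Writes, confirms, and deletes a tiny object.
# Returns [true, nil] on success or [false, "ErrorClass: message"] on failure.
def upload_check(client, bucket, key: CHECK_KEY)
  client.put_object(bucket: bucket, key: key, body: "ok") # deliberately no acl:
  client.head_object(bucket: bucket, key: key)
  client.delete_object(bucket: bucket, key: key)
  [true, nil]
rescue StandardError => e
  [false, "#{e.class}: #{e.message}"]
end

# Usage with the real SDK (assumes the aws-sdk-s3 gem and the app's env vars):
#   require "aws-sdk-s3"
#   ok, error = upload_check(Aws::S3::Client.new(region: ENV["S3_REGION"]), ENV["S3_BUCKET"])
#   abort("upload check failed: #{error}") unless ok
```

Exiting non-zero with the error text lets whatever runs the cron job (a mail wrapper, a monitoring agent) raise the alert, and passing the client in keeps the check testable without touching S3.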
## Ansible enforcement

If you manage the host with Ansible, enforce the safe values so a future template render can't reintroduce the ACL header:

```yaml
- name: Ensure S3_PERMISSION is empty (no x-amz-acl on uploads)
  ansible.builtin.lineinfile:
    path: /home/mastodon/live/.env.production
    regexp: '^S3_PERMISSION='
    line: 'S3_PERMISSION='
  notify: Restart Mastodon services

- name: Remove any active S3_ACL line (ACLs unsupported on this bucket)
  ansible.builtin.lineinfile:
    path: /home/mastodon/live/.env.production
    regexp: '^S3_ACL=.+'
    state: absent
  notify: Restart Mastodon services
```

## Related

- [Mastodon — The `--prune-profiles` Trap and How to Recover](mastodon-prune-profiles-trap.md): the other way avatars go missing, plus the bulk-restore script
- [Mastodon Post-Install Hardening (Permissions + Account)](mastodon-post-install-hardening.md)
- [AWS S3 Cost Management](../cloud/aws-s3-cost-management.md): pruning attachments to control bucket size (safely)