New article mastodon-s3-acl-upload-failures.md: a BucketOwnerEnforced S3 bucket plus a stale S3_PERMISSION/S3_ACL in .env.production makes every Mastodon upload fail with AccessControlListNotSupported, silently. Covers symptoms (incl. why a missing object returns 403 not 404), diagnosis, the fix (S3_PERMISSION= empty, public read via bucket policy), recovery, a synthetic-write health check, and Ansible enforcement. Extend mastodon-prune-profiles-trap.md: add a "Bulk restore at scale" procedure (list existing keys, null missing DB refs, enqueue RedownloadAvatar/HeaderWorker), a "storage-level deletion without DB de-ref" section, and a stronger recommendation to disable automated profile pruning (and scheduled accounts refresh --all) entirely. Link both from SUMMARY.md and the selfhosting index. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| title | description | domain | category | tags | status | created | updated |
|---|---|---|---|---|---|---|---|
| Mastodon on S3 — Silent Upload Failures When the Bucket Disables ACLs | Why a BucketOwnerEnforced S3 bucket plus a stale S3_PERMISSION/S3_ACL in .env.production makes every Mastodon media upload fail with AccessControlListNotSupported, how to diagnose it, and how to fix and monitor it. | selfhosting | services | | published | 2026-06-01 | 2026-06-01 |
Mastodon on S3 — Silent Upload Failures When the Bucket Disables ACLs
If your Mastodon instance stores media on S3 and you switch the bucket to Object Ownership = BucketOwnerEnforced (which AWS now recommends, and which the console nudges you toward), every media upload can start failing silently unless you also remove the object-ACL setting from .env.production. New avatars, headers, and attachments stop appearing; old ones keep working; nothing obvious is logged. This article is the diagnosis and fix.
TL;DR
- BucketOwnerEnforced disables ACLs entirely on the bucket. Any request that carries an x-amz-acl header is rejected with AccessControlListNotSupported: The bucket does not allow ACLs.
- Mastodon (via Paperclip) attaches x-amz-acl to every upload if S3_PERMISSION (or S3_ACL) is set in .env.production. The common value S3_PERMISSION=public-read, or a migration leftover like S3_PERMISSION=private, triggers the rejection.
- Result: every new upload fails, but the database row is still updated, so Mastodon believes it has the file. The object never lands, which leaves a broken image. Objects written before the bucket changed keep serving fine, which masks the problem.
- Fix: set S3_PERMISSION= (empty) and remove any S3_ACL= line, then restart mastodon-web + mastodon-sidekiq. Public read is now served by the bucket policy, not per-object ACLs.
Symptoms
- Newly-changed avatars/headers show broken; attachments on new posts fail to display.
- Avatars that were cached before the bucket setting changed still work — so "some work, some don't."
- tootctl and the web UI report success; Sidekiq doesn't obviously error.
- Direct fetch of a broken object's URL returns 403 AccessDenied (not 404; see below).
Why a missing object returns 403, not 404
A typical Mastodon S3 bucket policy grants public s3:GetObject but not s3:ListBucket. Without ListBucket, S3 hides whether a key exists: a GET on a missing key returns 403 AccessDenied, identical to a permissions denial. So "403" here usually means the object isn't there, not the object is forbidden. This is why the failure reads like a permissions problem when it's really a failed write.
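The rule above can be written down as a small lookup, which may be handy when scripting checks against public media URLs. This is only an illustrative sketch: the method name interpret_fetch_status and the can_list flag are hypothetical and not part of Mastodon or the AWS SDK.

```ruby
# Hypothetical helper: translate the HTTP status of a GET on an S3 object URL
# into what it most likely means for that object.
# can_list: whether the caller holds s3:ListBucket on the bucket
# (false for anonymous readers under a typical Mastodon bucket policy).
def interpret_fetch_status(status, can_list: false)
  case status
  when 200 then :present
  when 404 then :missing
  when 403
    # Without ListBucket, S3 reports a missing key as 403 AccessDenied,
    # so a 403 cannot distinguish "absent" from "forbidden".
    can_list ? :forbidden : :missing_or_forbidden
  else :unknown
  end
end

puts interpret_fetch_status(403)                 # missing_or_forbidden
puts interpret_fetch_status(403, can_list: true) # forbidden
```

For the public URLs discussed here, can_list is effectively false, so a 403 should be read as "check whether the object exists" before anyone starts debugging permissions.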
Diagnosis
Run these with the instance's own S3 credentials (e.g. via bin/rails runner, which loads .env.production):
require "aws-sdk-s3"

c = Aws::S3::Client.new(region: ENV["S3_REGION"],
                        access_key_id: ENV["AWS_ACCESS_KEY_ID"],
                        secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"])
b = ENV["S3_BUCKET"]

# 1. Is the bucket ACL-disabled?
puts c.get_bucket_ownership_controls(bucket: b).ownership_controls.rules.map(&:object_ownership).inspect
# => ["BucketOwnerEnforced"] <-- ACLs are OFF

# 2. Does an upload WITH an ACL fail, and WITHOUT one succeed?
begin
  c.put_object(bucket: b, key: "tmp/acltest", body: "x", acl: "public-read")
  puts "PUT+acl: OK"
  c.delete_object(bucket: b, key: "tmp/acltest") # clean up if the ACL'd write went through
rescue => e
  puts "PUT+acl FAILS: #{e.class} / #{e.message}" # AccessControlListNotSupported
end
c.put_object(bucket: b, key: "tmp/noacltest", body: "x") # succeeds
c.delete_object(bucket: b, key: "tmp/noacltest")

# 3. Confirm a "broken" avatar's object is actually missing
key = Account.find_by(username: "someuser", domain: "remote.tld").avatar.path.sub(%r{^/}, "")
begin
  c.head_object(bucket: b, key: key)
  puts "EXISTS"
rescue Aws::S3::Errors::NotFound, Aws::S3::Errors::Forbidden
  # Forbidden appears instead of NotFound when the credentials lack s3:ListBucket
  puts "MISSING"
end
If #1 shows BucketOwnerEnforced and #2 shows the ACL'd PUT failing while the plain PUT succeeds, you've confirmed it.
Check .env.production for the offending settings:
grep -E '^S3_(ACL|PERMISSION|NO_INHERIT)' /home/mastodon/live/.env.production
# S3_ACL=private <-- remove
# S3_PERMISSION=private <-- set empty
The fix
- Edit .env.production:
  - S3_PERMISSION= (empty; Paperclip then sends no x-amz-acl header)
  - remove/comment any S3_ACL= line
- Restart so the env is reloaded:
  systemctl restart mastodon-sidekiq mastodon-web
- Verify the previously-failing write path now works: reprocess any existing avatar and confirm it serves 200:
a = Account.local.first
a.avatar.reprocess! # used to raise AccessControlListNotSupported; now succeeds
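Public readability is now provided by the bucket policy (grant s3:GetObject on arn:aws:s3:::your-bucket/* to Principal: "*"). For that policy to take effect, Block Public Access must allow public bucket policies at both the account and bucket level, meaning the BlockPublicPolicy and RestrictPublicBuckets settings are off. The two ACL-related toggles should no longer matter once ACLs are disabled. You do not need per-object ACLs at all.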
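As a concrete illustration of the policy described above, the following sketch builds the policy document in Ruby and prints it as JSON. The method name public_read_policy and the Sid label are made up for this example, and your-bucket is the placeholder from the text; adapt the resource ARN to your own bucket before applying anything.

```ruby
require "json"

# Build a public-read bucket policy document for the given bucket name.
# The Sid is an arbitrary label chosen for this example.
def public_read_policy(bucket)
  {
    "Version" => "2012-10-17",
    "Statement" => [
      {
        "Sid" => "PublicReadGetObject",
        "Effect" => "Allow",
        "Principal" => "*",
        "Action" => "s3:GetObject",
        "Resource" => "arn:aws:s3:::#{bucket}/*"
      }
    ]
  }
end

puts JSON.pretty_generate(public_read_policy("your-bucket"))

# To apply it with the SDK client from the Diagnosis section (c and b):
#   c.put_bucket_policy(bucket: b, policy: JSON.generate(public_read_policy(b)))
```

Note that the statement grants only s3:GetObject and deliberately omits s3:ListBucket, which is what produces the 403-for-missing-key behaviour described earlier.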
Recovering the avatars that broke while it was failing
Any media that failed to upload during the broken window is gone from S3 while the DB still references it. Because Mastodon's redownload workers skip accounts whose *_file_name is already set, you must null the dead reference first, then enqueue the worker. See Mastodon — The --prune-profiles Trap and How to Recover for the bulk procedure.
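The order of operations (find dead references, null them, then enqueue) can be sketched as below. The selection step is plain Ruby; the Rails part is shown only as comments and is an assumption based on the description above, so verify it against your Mastodon version and the linked article before running it. The helper name accounts_with_dead_refs and the sample keys are hypothetical.

```ruby
require "set"

# refs: array of [account_id, object_key] pairs taken from the database.
# existing_keys: keys actually present in the bucket (e.g. from a listing).
# Returns the ids whose referenced object is missing from storage.
def accounts_with_dead_refs(refs, existing_keys)
  present = existing_keys.to_set
  refs.reject { |_id, key| present.include?(key) }.map(&:first)
end

dead = accounts_with_dead_refs(
  [[1, "accounts/avatars/a.png"], [2, "accounts/avatars/b.png"]],
  ["accounts/avatars/a.png"]
)
p dead # [2]

# Inside bin/rails runner, each dead id would then be handled roughly like this
# (assumed, not taken from the Mastodon source):
#   account = Account.find(id)
#   account.update_columns(avatar_file_name: nil)    # null the dead reference first
#   RedownloadAvatarWorker.perform_async(account.id) # then enqueue the worker
```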
Don't let it happen silently again — monitor uploads
The worst part of this bug is the silence. Add a periodic synthetic write check that uploads a tiny object with the app's own credentials, confirms it, deletes it, and alerts on failure:
s3.put_object(bucket: b, key: "health/upload-check", body: "ok") # no acl
s3.head_object(bucket: b, key: "health/upload-check")
s3.delete_object(bucket: b, key: "health/upload-check")
# any exception -> email an alert
Pair it with an HTTP check that your local account avatars all return 200 (they always should). Run both every few hours from cron. A regression then pages you in hours instead of being discovered by a user weeks later.
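A slightly fuller version of the write check might look like the sketch below. It wraps the same three calls in a method that reports failure instead of raising, so a cron wrapper can decide how to alert. The method name upload_check and the exit-code convention are assumptions made for this example; the entry point assumes aws-sdk-s3 is installed and that the S3_* variables from .env.production are exported in the environment.

```ruby
# Runs the put/head/delete round trip against any client that responds to
# the three Aws::S3::Client methods used below. Returns [ok, message].
def upload_check(s3, bucket, key: "health/upload-check")
  s3.put_object(bucket: bucket, key: key, body: "ok") # no acl
  s3.head_object(bucket: bucket, key: key)
  s3.delete_object(bucket: bucket, key: key)
  [true, "upload OK"]
rescue StandardError => e
  [false, "upload FAILED: #{e.class}: #{e.message}"]
end

# Cron entry point: only runs when executed directly with S3_BUCKET set.
if $PROGRAM_NAME == __FILE__ && ENV["S3_BUCKET"]
  require "aws-sdk-s3"
  client = Aws::S3::Client.new(region: ENV["S3_REGION"],
                               access_key_id: ENV["AWS_ACCESS_KEY_ID"],
                               secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"])
  ok, message = upload_check(client, ENV["S3_BUCKET"])
  warn message unless ok # stderr output; cron mails it when MAILTO is set
  exit(ok ? 0 : 1)
end
```

Staying silent on success and writing to stderr on failure means an ordinary cron MAILTO is enough to deliver the alert, with no extra mailer code.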
Ansible enforcement
If you manage the host with Ansible, enforce the safe values so a future template render can't reintroduce the ACL header:
- name: Ensure S3_PERMISSION is empty (no x-amz-acl on uploads)
  ansible.builtin.lineinfile:
    path: /home/mastodon/live/.env.production
    regexp: '^S3_PERMISSION='
    line: 'S3_PERMISSION='
  notify: Restart Mastodon services

- name: Remove any active S3_ACL line (ACLs unsupported on this bucket)
  ansible.builtin.lineinfile:
    path: /home/mastodon/live/.env.production
    regexp: '^S3_ACL=.+'
    state: absent
  notify: Restart Mastodon services
Related
- Mastodon — The --prune-profiles Trap and How to Recover — the other way avatars go missing, plus the bulk-restore script
- Mastodon Post-Install Hardening (Permissions + Account)
- AWS S3 Cost Management — pruning attachments to control bucket size (safely)