| title | domain | category | tags | status | created | updated |
|---|---|---|---|---|---|---|
| SSH `Permission denied (publickey)` After Rotating a Key — Backfill Every `authorized_keys` | selfhosting | troubleshooting | | published | 2026-06-17 | 2026-06-17 |
SSH Permission denied (publickey) After Rotating a Key — Backfill Every authorized_keys
The Problem
A host you've SSH'd into for months suddenly rejects you — but only some hosts, not all:
$ ssh root@host-a
root@host-a: Permission denied (publickey).
$ ssh root@host-b # same key, same workstation — works fine
host-b $
Nothing changed on the servers. The thing that changed is on your side: at some
point the workstation's SSH key was regenerated (lost laptop, rebuild, a key file
clobbered by a botched copy, a routine rotation). The new public key was pushed to a
few hosts but never fanned out to the rest. Every host still holding only the old
public key now rejects the new private key with Permission denied (publickey).
The tell: it's `Permission denied (publickey)`, not `Host key verification failed`. The former is an authentication failure (the server doesn't trust your key); the latter means the server's host key doesn't match your `known_hosts`. That is a different problem; see SSH Alias Falls Through to MagicDNS — Host-Key Verification Failure.
Why It Happens
Public-key auth is per-host: the server only lets you in if your public key is a
line in that host's ~/.ssh/authorized_keys. There is no central directory — each
host is its own island. So when you rotate a key, every host needs the new public
key appended independently.
It's easy to do this partially without noticing. You regenerate the key, then over the next hour you happen to SSH into three boxes and (re-)deploy the key there as part of other work. Those three now trust the new key. The other six don't — and you won't find out until weeks later when you reach for one of them.
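Because trust is one line per host, a presence check against any host's `authorized_keys` tells you whether that host was part of the partial rollout. The following is a minimal sketch, not a definitive tool: `key_is_authorized` is a hypothetical helper name, and it assumes a standard one-line `.pub` file whose second field is the base64 key blob. It matches on the blob rather than the whole line, so a different trailing comment does not cause a false negative.

```shell
# key_is_authorized PUBKEY_FILE AUTHORIZED_KEYS_FILE   (hypothetical helper)
# Returns 0 if the key blob from PUBKEY_FILE appears on a non-comment line of
# AUTHORIZED_KEYS_FILE, 1 if it does not, 2 if the public key file is unusable.
key_is_authorized() {
    # Field 2 of a .pub line is the base64 key blob; field 3 is only a comment.
    blob=$(awk 'NR == 1 { print $2 }' "$1" 2>/dev/null)
    [ -n "$blob" ] || return 2
    # Skip commented-out lines so a disabled key does not count as authorized.
    grep -v '^[[:space:]]*#' "$2" 2>/dev/null | grep -qF -- "$blob"
}
```

Run against a copy of a host's `authorized_keys`, it succeeds only when the current key is actually listed there.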
Confirm it's a key authentication failure and see which key is being offered:
$ ssh -v root@host-a 2>&1 | grep -E 'Offering|Authentications|Permission denied'
debug1: Offering public key: /home/you/.ssh/id_ed25519 ED25519 SHA256:XeY1/N9qwB…
debug1: Authentications that can continue: publickey
root@host-a: Permission denied (publickey).
The server offered you nothing but publickey, you offered your current key, and it
was refused → your key isn't in that host's authorized_keys.
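When checking more than a couple of hosts, it can help to turn the final line of `ssh` output into a label rather than reading each message by eye. This is a small sketch under stated assumptions: `classify_ssh_failure` and its labels are hypothetical names, and the patterns are the common OpenSSH client messages, which may differ between versions or locales.

```shell
# classify_ssh_failure MESSAGE   (hypothetical helper)
# Maps the last line of ssh output to a short label.
classify_ssh_failure() {
    case "$1" in
        # Also matches lists such as "(publickey,password)".
        *"Permission denied (publickey"*)  echo "key-not-authorized" ;;
        *"Host key verification failed"*)  echo "host-key-mismatch" ;;
        *"timed out"*|*"Connection refused"*|*"Could not resolve hostname"*)
                                           echo "unreachable" ;;
        *)                                 echo "other" ;;
    esac
}
```

Only the `key-not-authorized` case is the problem this article fixes; the other labels point to a different cause.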
Scope It First — Don't Fix One Host at a Time
The host you noticed is rarely the only one. Sweep the whole fleet in one pass before touching anything, so you fix the real set, not just the squeaky wheel:
for h in host-a host-b host-c host-d host-e host-f; do
r=$(ssh -o BatchMode=yes -o ConnectTimeout=8 root@"$h" 'echo OK' 2>&1 | tail -1)
echo "$h: $r"
done
BatchMode=yes suppresses password/passphrase prompts so a failure fails fast instead
of hanging. Anything that doesn't print OK needs the backfill.
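To carry the sweep result straight into the fix, the loop can collect the failing hosts instead of only printing them. A sketch, assuming the hypothetical function names `probe_host` and `sweep_failed`; `PROBE` is an assumed override hook, not an SSH feature, so the loop can be exercised without real hosts.

```shell
# probe_host HOST   (hypothetical helper)
# Prints the last line of a non-interactive login attempt.
probe_host() {
    # -n keeps ssh from reading stdin, so it cannot swallow input meant for a caller.
    ssh -n -o BatchMode=yes -o ConnectTimeout=8 root@"$1" 'echo OK' 2>&1 | tail -1
}

# sweep_failed HOST...   (hypothetical helper)
# Prints one failing host per line. PROBE (assumed hook) names the probe
# command and defaults to probe_host.
sweep_failed() {
    for h in "$@"; do
        r=$("${PROBE:-probe_host}" "$h")
        [ "$r" = "OK" ] || echo "$h"
    done
}
```

For example, `FAILED=$(sweep_failed host-a host-b host-c)` leaves the hosts needing the backfill in `$FAILED`, ready to feed a later `for h in $FAILED` loop.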
The Fix
You need a second, still-trusted way onto each failing host to append the new key. Common transit options, best first:
- Another of your keys that still works (e.g. a config-management / automation user whose key is authorized fleet-wide, ideally with `sudo`).
- Another workstation whose key those hosts still trust.
- The provider's web console / serial console as a last resort.
[!warning] A jump host only helps if it can reach the target
"Bounce through a box that still trusts me" only works if that box's own key is in the target's `authorized_keys`. A host can trust your key yet have no standing trust to a third host (and hit its own `Host key verification failed` on the way). Test the full two-hop path before relying on it.
Using a fleet-wide automation user (deploy) with passwordless sudo as the transit,
append the new key idempotently, with a backup, to every failing host:
PUBKEY=$(cat ~/.ssh/id_ed25519.pub)
STAMP=$(date +%Y%m%d-%H%M%S)
for h in host-a host-c host-e; do # only the hosts that failed the sweep
  ssh deploy@"$h" "sudo bash -s" <<EOF
set -e
F=/root/.ssh/authorized_keys
mkdir -p /root/.ssh && chmod 700 /root/.ssh && touch "\$F"
cp -p "\$F" "\$F.bak-$STAMP" # backup before any change
# make sure the file ends with a newline so the new key cannot fuse with the last line
[ -z "\$(tail -c 1 "\$F")" ] || echo >> "\$F"
grep -qF "$PUBKEY" "\$F" || printf '%s\n' "$PUBKEY" >> "\$F" # append only if absent
chmod 600 "\$F"
EOF
done
Three things that keep this safe:
- Append, never overwrite. `>> "$F"` and the `grep -qF … ||` guard mean you add one line, and only if it's missing. Re-running is a no-op. Never clobber an `authorized_keys` with `>`, or you'll lock out every other key on the box.
- Back up first. The `.bak-<stamp>` copy is your undo.
- `chmod 600`. Under the default `StrictModes yes`, SSH silently ignores an `authorized_keys` that's group/world writable, which looks exactly like "the key didn't take."
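The same three safeguards can be tried locally on a scratch file before running anything against a real host. This is a sketch only: `append_key_once` is a hypothetical helper, not part of the backfill script, and it deviates in one labeled way by taking the backup only when a change is about to be made, so repeated runs do not pile up backup files. It also adds a newline first if the file does not already end with one.

```shell
# append_key_once KEY_LINE FILE   (hypothetical helper)
# Appends KEY_LINE to FILE only if absent, backing up FILE before changing it.
append_key_once() {
    key=$1
    file=$2
    touch "$file"
    if grep -qF -- "$key" "$file"; then
        return 0                                      # already present: no-op
    fi
    cp -p "$file" "$file.bak-$(date +%Y%m%d-%H%M%S)"  # undo copy, taken before the change
    # If the last byte is not a newline, add one so the keys stay on separate lines.
    [ -z "$(tail -c 1 "$file")" ] || echo >> "$file"
    printf '%s\n' "$key" >> "$file"
    chmod 600 "$file"
}
```

Calling it twice with the same key leaves exactly one copy of the line and one backup file.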
Then verify directly — not through the transit user:
for h in host-a host-c host-e; do
echo "$h: $(ssh -o BatchMode=yes root@"$h" 'echo OK' 2>&1 | tail -1)"
done
All OK means the new key authenticates on its own.
Prevention
- Treat rotation as fleet-wide. When a workstation key changes, the very next step is to fan the new public key out to every host's `authorized_keys` in one pass, not opportunistically as you happen to log in. A short `for` loop over the full host list (or a config-management task; see below) closes the gap immediately.
- Manage `authorized_keys` declaratively. An Ansible `ansible.posix.authorized_key` task (or equivalent) that lists the current set of keys makes "who can log in" a reviewed, version-controlled fact instead of an append-only pile that drifts per host.
- Keep the old key authorized until the new one is verified everywhere, then remove the stale line in a deliberate cleanup pass.
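The cleanup pass in the last item can follow the same backed-up, re-runnable pattern as the backfill. A minimal sketch, assuming a hypothetical helper `remove_stale_key` and a saved copy of the old `.pub` file; run it only after the new key has been verified on that host, since removing the last working key locks you out.

```shell
# remove_stale_key OLD_PUBKEY_FILE AUTHORIZED_KEYS_FILE   (hypothetical helper)
# Drops every line containing the old key blob, keeping a timestamped backup.
remove_stale_key() {
    blob=$(awk 'NR == 1 { print $2 }' "$1" 2>/dev/null)
    [ -n "$blob" ] || return 2                   # an empty pattern would match every line
    file=$2
    grep -qF -- "$blob" "$file" || return 0      # stale key not present: no-op
    cp -p "$file" "$file.bak-$(date +%Y%m%d-%H%M%S)"
    new_file=$(mktemp "$file.XXXXXX")            # same directory, so mv is a rename
    # grep -v exits 1 when no lines remain; that is not an error here.
    grep -vF -- "$blob" "$file" > "$new_file" || true
    chmod 600 "$new_file"
    mv "$new_file" "$file"
}
```

Writing to a temporary file and renaming it means the host never sees a half-written `authorized_keys`.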
How to Diagnose This (Checklist)
1. `ssh -o BatchMode=yes <host> true` → `Permission denied (publickey)` (auth), not `Host key verification failed` (host key). Confirms which problem you have.
2. `ssh -v <host> 2>&1 | grep Offering` → which private key is being offered, and its fingerprint.
3. Sweep the whole fleet with the `BatchMode` loop → get the full list of affected hosts before fixing.
4. Append the new public key (idempotent, backed up, `chmod 600`) via a still-trusted transit path.
5. Re-verify each host with a direct `BatchMode` login.
Related: SSH Config & Key Management and SSH Hardening Across a Fleet with Ansible.