wiki: add firewalld mail ports reset article + session updates
- New article: firewalld mail ports wiped after reload (IMAP + webmail outage)
- New article: Plex 4K codec compatibility (Apple TV)
- New article: mdadm RAID recovery after USB hub disconnect
- Updated yt-dlp article
- Updated all index files: SUMMARY.md, index.md, README.md, category indexes
- Article count: 41 → 42

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
105
05-troubleshooting/storage/mdadm-usb-hub-disconnect-recovery.md
Normal file
@@ -0,0 +1,105 @@
# mdadm RAID Recovery After USB Hub Disconnect

A software RAID array managed by mdadm can appear to fail catastrophically when its member drives are connected over USB rather than SATA. Often the array itself is fine and the USB hub simply dropped out. Here's how to diagnose and recover.

## Symptoms

- rsync or other I/O to the RAID mount returns `Input/output error`
- `cat /proc/mdstat` shows `broken raid0` or `FAILED`
- `mdadm --detail /dev/md0` shows `State: broken, FAILED`
- `lsblk` no longer lists the RAID member drives (e.g. `sdd`, `sde` gone)
- XFS (or other filesystem) logs in dmesg:
  ```
  XFS (md0): log I/O error -5
  XFS (md0): Filesystem has been shut down due to log error (0x2).
  ```
- `smartctl -H /dev/sdd` returns `No such device`

## Why It Happens

If your RAID drives are in a USB enclosure (e.g. TerraMaster via ASMedia hub), a USB disconnect (triggered by a power fluctuation, plugging in another device, or a hub reset) makes the member drives disappear from the system. The kernel's md driver, which mdadm manages, cannot distinguish a USB dropout from a physical drive failure, so it declares the array failed.

The failure message in dmesg will show `hostbyte=DID_ERROR` rather than a drive-level error:

```
md/raid0md0: Disk failure on sdd1 detected, failing array.
sd X:0:0:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
```

`DID_ERROR` is a host-byte status: the SCSI host adapter side (here, the USB controller) reported the error rather than the drive, so the drives themselves are likely fine.

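
As a rough way to apply the host-byte distinction above, the following sketch filters kernel log lines and lists the devices that reported a non-OK host byte. The function name `host_level_errors` is hypothetical, and the pattern assumes log lines shaped like the excerpt above.

```bash
# Hypothetical helper: read kernel log lines on stdin and print the device
# tag (e.g. [sdd]) of every line whose SCSI host byte is not DID_OK.
host_level_errors() {
    grep -E 'hostbyte=DID_' \
        | grep -v 'hostbyte=DID_OK' \
        | grep -oE '\[sd[a-z]+\]' \
        | sort -u
}

# Usage sketch (not run here):
#   dmesg | host_level_errors
```

If every RAID member shows up in this list at the same timestamp, a transport-level dropout is a more plausible explanation than several drives failing at once.
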
## Diagnosis

### 1. Check if the USB hub recovered

```bash
lsblk -o NAME,SIZE,TYPE,FSTYPE,MODEL
```

After a hub reconnects, drives will reappear, often with **new device names** (e.g. `sdd`/`sde` become `sdg`/`sdh`). Look for partitions whose filesystem type is `linux_raid_member`.

```bash
dmesg | grep -iE 'usb|disconnect|DID_ERROR' | tail -30
```

A hub dropout shows up as multiple devices disconnecting at the same moment, all behind the same USB port or hub.

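
To pick out the reappeared members without scanning the `lsblk` table by eye, a small filter along these lines may help. The function name `raid_members` is hypothetical; it expects the raw, header-less output of `lsblk -rno NAME,FSTYPE` on stdin.

```bash
# Hypothetical helper: read `lsblk -rno NAME,FSTYPE` output on stdin and print
# the device path of every entry whose filesystem type is linux_raid_member.
raid_members() {
    awk '$2 == "linux_raid_member" { print "/dev/" $1 }'
}

# Usage sketch (not run here):
#   lsblk -rno NAME,FSTYPE | raid_members
```

The paths it prints are the ones to pass to `mdadm --examine` in the next step.
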
### 2. Confirm drives have intact superblocks

```bash
mdadm --examine /dev/sdg1
mdadm --examine /dev/sdh1
```

If both superblocks are present and report the same array UUID and array details, the RAID metadata is intact and the array can be reassembled.

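
The UUID comparison can also be scripted. This is a minimal sketch that assumes v1.x metadata, where `mdadm --examine` prints an `Array UUID : ...` line; the helper names `array_uuid` and `same_array` and the `/tmp` file paths are hypothetical.

```bash
# Hypothetical helper: extract the "Array UUID" value from `mdadm --examine`
# output read on stdin (v1.x metadata layout assumed).
array_uuid() {
    awk -F' : ' '/Array UUID/ { print $2; exit }'
}

# Hypothetical helper: succeed only if two saved examine dumps contain the
# same, non-empty Array UUID.
same_array() {
    uuid_a=$(array_uuid < "$1")
    uuid_b=$(array_uuid < "$2")
    [ -n "$uuid_a" ] && [ "$uuid_a" = "$uuid_b" ]
}

# Usage sketch (not run here):
#   mdadm --examine /dev/sdg1 > /tmp/sdg1.txt
#   mdadm --examine /dev/sdh1 > /tmp/sdh1.txt
#   same_array /tmp/sdg1.txt /tmp/sdh1.txt && echo "members of the same array"
```
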
## Recovery

### 1. Unmount and stop the failed array

```bash
umount /majorRAID   # or wherever md0 is mounted
mdadm --stop /dev/md0
```

If `umount` reports that the path is not mounted, the kernel has already dropped the mount; proceed with `--stop`. If it reports that the target is busy, first close whatever still holds files open on the mount (`fuser -vm /majorRAID` lists the processes), because `mdadm --stop` cannot stop an array that is still in use.

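
To avoid a spurious `umount` error when the kernel has already dropped the mount, the unmount can be made conditional. This sketch checks a mount table in `/proc/mounts` format; the function name `is_mounted` is hypothetical.

```bash
# Hypothetical helper: read a mount table in /proc/mounts format on stdin and
# succeed if the given mount point (second column) appears in it.
is_mounted() {
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }'
}

# Usage sketch (not run here):
#   if is_mounted /majorRAID < /proc/mounts; then umount /majorRAID; fi
#   mdadm --stop /dev/md0
```
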
### 2. Reassemble with the new device names

```bash
mdadm --assemble /dev/md0 /dev/sdg1 /dev/sdh1
```

mdadm identifies members by the array UUID stored in their superblocks, not by device name. As long as all member drives are present, assembly should succeed regardless of what the kernel now calls them.

### 3. Mount and verify

```bash
mount /dev/md0 /majorRAID
df -h /majorRAID
ls /majorRAID
```

If the filesystem mounts and data is visible, recovery is complete.

### 4. Create or update /etc/mdadm.conf

If `/etc/mdadm.conf` doesn't exist (or references old device names), regenerate it. The `>` redirect overwrites the file, so back up any existing configuration first:

```bash
mdadm --detail --scan > /etc/mdadm.conf
cat /etc/mdadm.conf
```

The generated `ARRAY` line identifies the array by UUID rather than by device names, so the array will reassemble correctly on reboot even if drive letters change again. On Debian and Ubuntu the file usually lives at `/etc/mdadm/mdadm.conf` instead.

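
As a quick sanity check on the resulting file, the sketch below flags any `ARRAY` line that does not carry a `UUID=` field and would therefore still depend on device names. The function name `arrays_without_uuid` is hypothetical.

```bash
# Hypothetical helper: read mdadm.conf text on stdin and print every ARRAY
# line that has no UUID= field. Prints nothing when all arrays are UUID-based.
arrays_without_uuid() {
    grep -E '^ARRAY[[:space:]]' | grep -v 'UUID=' || true
}

# Usage sketch (not run here):
#   arrays_without_uuid < /etc/mdadm.conf
```

Empty output suggests every array in the file is identified by UUID.
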
## Prevention

The root cause is that the drives sit on USB rather than SATA. Short of moving the drives to a SATA controller, options are limited. When planning a migration off the RAID array (e.g. to SnapRAID + MergerFS), prioritize getting drives onto SATA connections.

> [!warning] RAID 0 has no redundancy. A USB dropout that causes the array to fail mid-write could corrupt data even if the drives themselves are healthy. Keep current backups before any maintenance involving the enclosure.

## Related

- [SnapRAID & MergerFS Storage Setup](../../01-linux/storage/snapraid-mergerfs-setup.md)
- [rsync Backup Patterns](../../02-selfhosting/storage-backup/rsync-backup-patterns.md)