The Article You’ll Wish You Read Before the Drive Failed
Every mdadm guide on the internet covers mdadm --create. You pick your drives, choose a RAID level, fire the command, and feel like you’ve done something. You have! You’ve also set a timer. Six months from now, probably at 2 AM, one of those drives is going to start acting up and you’re going to discover that creating the array was the easy part.
This is the operations guide. The part that covers what happens after the honeymoon — drive replacements, growing the array when you inevitably run out of space, converting RAID levels when you realize you bought drives large enough that RAID 5’s rebuild math makes you nervous, and the monitoring you forgot to enable on day one.
If you’re new to RAID concepts altogether, start with RAID 0/1/5 explained and RAID 6 vs RAID 10 first. If you’re deep in nested RAID territory, RAID 50/60 has you covered. This article assumes you already have a running array and something has gone sideways.
Full example: Clone the working files at github.com/KingPin/sumguy-examples/linux/mdadm-day-2-operations
Replacing a Failing Drive
The ideal scenario is that SMART tells you a drive is dying before it actually dies. If you’re running smartmontools, you’ll see reallocated sector counts climbing, pending sectors accumulating, or uncorrectable errors ticking up. The drive is still functional — it’s just whispering “replace me soon” in the universal language of slowly degrading storage.
This is the moment to act, not when the array goes degraded.
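One way to keep an eye on those counters is to filter the attribute table that smartctl -A prints. This is a minimal sketch, not a replacement for smartd: the helper name smart_warnings is made up for illustration, and it assumes the classic ATA attribute table (attribute name in column 2, raw value in column 10). NVMe and SAS drives report health in a different format and are not handled.

```shell
# smart_warnings: read `smartctl -A` output on stdin and print every
# watched attribute whose raw value is above zero.
# Assumption: ATA attribute table, name in column 2, raw value in column 10.
smart_warnings() {
    awk '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" {
            if ($10 + 0 > 0) print $2 "=" $10
        }
    '
}

# Usage (needs root and smartmontools):
#   smartctl -A /dev/sdb | smart_warnings
```

Empty output means none of the three counters has moved off zero. Anything printed is a reason to order a replacement drive.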
Step 1: Mark the drive as failed (even if it’s still technically spinning):
    # Identify which drive is the problem
    cat /proc/mdstat
    mdadm --detail /dev/md0
    # Mark it failed so mdadm stops using it
    mdadm /dev/md0 --fail /dev/sdb
Step 2: Remove it from the array:
    mdadm /dev/md0 --remove /dev/sdb
At this point your array is degraded but still running. Check it:
    cat /proc/mdstat
    md0 : active raid5 sdc[1] sdd[2]
          7813955584 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
          bitmap: 0/2 pages [0KB], 65536KB chunk
That [_UU] tells you one device is missing: each position is a member slot, and the underscore marks the empty one (slot 0, where sdb used to be). Don’t reboot the box unnecessarily while it’s in this state.
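If you would rather have a script spot the underscore than read the status map by eye, the check is easy to automate. The following is a rough sketch with a hypothetical function name, degraded_arrays. It assumes the usual /proc/mdstat layout, where each array stanza starts with "mdX :" and the status map appears in square brackets on the line that follows.

```shell
# degraded_arrays: print the name of every md array whose status map
# (for example [_UU]) contains an underscore, meaning a member slot is empty.
# Reads /proc/mdstat by default; pass another file to inspect saved output.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }                    # start of an array stanza
        name != "" && match($0, /\[[U_]+\]/) {         # status map such as [_UU]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

# Usage:
#   degraded_arrays                  # live system
#   degraded_arrays saved-mdstat.txt # saved copy
```

No output means every array found has all of its member slots filled.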
Step 3: Physical swap. If your backplane supports hot-swap, pull the drive and slot the new one. If not, power down, swap, power back up. The array will reassemble automatically — the new drive just won’t be in it yet.
Step 4: Partition table. If your array is built on partitions (e.g., /dev/sdb1) rather than whole disks, copy the partition table from a healthy drive to the new one:
    # Copy partition layout from sdc to the new sde
    sfdisk -d /dev/sdc | sfdisk /dev/sde
Step 5: Add the new drive:
    mdadm /dev/md0 --add /dev/sde     # whole disk
    # or
    mdadm /dev/md0 --add /dev/sde1    # partition
Step 6: Watch it rebuild:
    watch -n 2 cat /proc/mdstat
    md0 : active raid5 sde[3] sdc[1] sdd[2]
          7813955584 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
          [======>..............]  recovery = 31.5% (1231872/3906977) finish=42.3min speed=62400K/sec
The status map stays at [3/2] [_UU] until the recovery completes, then flips to [3/3] [UUU]. Leave it alone. Don’t run intensive I/O workloads during a rebuild if you can help it: you’re one URE away from a very bad day (see rebuild math) and adding load doesn’t help your odds.
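For scripts that need to wait on a rebuild instead of watching it, the progress line can be pulled apart with awk. Treat this as an illustrative sketch: md_progress is a made-up name, and the parsing assumes the progress line has the shape shown above ("operation = NN.N% ... finish=...").

```shell
# md_progress: print "<array> <operation> <percent> <eta>" for every array
# that is currently recovering, resyncing, reshaping, or being checked.
# Reads /proc/mdstat by default; pass another file to inspect saved output.
md_progress() {
    awk '
        /^md[^ ]* :/ { name = $1 }
        / (recovery|resync|reshape|check) *= *[0-9.]+%/ {
            eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "=") { op = $(i - 1); pct = $(i + 1) }
                if ($i ~ /^finish=/) { eta = substr($i, 8) }
            }
            print name, op, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}

# Usage: block until nothing is rebuilding any more.
#   while [ -n "$(md_progress)" ]; do sleep 60; done
```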
Re-add vs add: If a drive dropped out temporarily (power glitch, cable problem) and its superblock is still intact, use --re-add instead of --add. mdadm will recognize the superblock and, if the array has a write-intent bitmap, resync only the regions that changed while the drive was away instead of doing a full rebuild. Much faster.
    mdadm /dev/md0 --re-add /dev/sdb
Growing Arrays
You built a 3-drive RAID 5 eighteen months ago. Your media collection has grown, you’ve got a 4th identical drive, and now you want to expand. Good news: mdadm can reshape a live array.
    # Add the new drive first
    mdadm /dev/md0 --add /dev/sde
    # Then grow the array to use it
    mdadm /dev/md0 --grow --raid-devices=4
What happens next is a reshape. mdadm will redistribute the stripe across four drives instead of three. This can take a very long time: think hours to days on large arrays. Check progress the usual way:
    cat /proc/mdstat
    md0 : active raid5 sde[4] sdd[2] sdc[1] sdb[0]
          7813955584 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
          [===>.................]  reshape = 17.2% (671788032/3906977792) finish=1243.7min speed=43327K/sec
1243 minutes. That’s over 20 hours. Have a coffee. (The block count still shows the old size; it only jumps once the reshape finishes.) Avoid rebooting the machine mid-reshape: an interrupted reshape is the easiest way to end up with a corrupt array. mdadm checkpoints the reshape progress, but you really don’t want to test that recovery path.
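To sanity-check the size you should end up with, the arithmetic is simple: RAID 5 gives up one member's worth of space to parity, RAID 6 gives up two. The helper below is a small sketch with an invented name, raid_usable_kib. The per-device figure of 3906977792 comes from the mdstat output in this article, where block counts are in KiB.

```shell
# raid_usable_kib LEVEL DEVICES PER_DEVICE_KIB
# Prints the usable size in KiB: RAID 5 loses one member to parity, RAID 6 two.
raid_usable_kib() {
    case $1 in
        5) parity=1 ;;
        6) parity=2 ;;
        *) echo "unsupported level: $1" >&2; return 1 ;;
    esac
    echo $(( ($2 - parity) * $3 ))
}

# Usage:
#   raid_usable_kib 5 3 3906977792   # prints 7813955584  (the 3-drive array)
#   raid_usable_kib 5 4 3906977792   # prints 11720933376 (after growing to 4)
#   raid_usable_kib 6 5 3906977792   # prints 11720933376 (5-drive RAID 6)
```

The last two lines show why a RAID 5 to RAID 6 conversion that adds one drive leaves usable space unchanged: the extra drive goes entirely to the second parity.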
Once the reshape completes, expand the filesystem — the array is bigger but your filesystem doesn’t know that yet:
    # ext4
    resize2fs /dev/md0
    # xfs (must be mounted)
    xfs_growfs /mnt/data
    # btrfs
    btrfs filesystem resize max /mnt/data
Converting RAID 5 → RAID 6
If you’re running RAID 5 with 4TB+ drives and you’ve read the rebuild math article, you might be having second thoughts. RAID 6 gives you two-drive fault tolerance, which is the right call when your drives are large enough that a rebuild could take 12+ hours and hit UREs.
You can convert in-place. You’ll need an extra drive and a backup file on a safe, separate filesystem (not on the RAID array itself):
    # Add the new drive (now have N+1 drives for RAID 6)
    mdadm /dev/md0 --add /dev/sdf
    # Convert from RAID 5 (4 drives) to RAID 6 (5 drives)
    mdadm /dev/md0 --grow --level=6 --raid-devices=5 \
        --backup-file=/tmp/md0-backup.bin
The --backup-file is not optional. mdadm uses it to store the critical stripe data during the level conversion, and if the process is interrupted without it, you may lose data. Store it somewhere reliable: a USB drive, a different machine over NFS, anywhere that isn’t the array being converted. The /tmp path above is only a placeholder; on many systems /tmp is a tmpfs that is wiped on reboot, so substitute a path on persistent storage.
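Before starting the conversion, it can be worth confirming that the backup path really does live off the array. Here is a deliberately simple sketch with a hypothetical function name, backup_path_is_off_array. It only compares the device that df reports for the backup directory against the array device, so it will not notice an array hidden underneath LVM or LUKS.

```shell
# backup_path_is_off_array BACKUP_PATH ARRAY_DEVICE
# Succeeds when the filesystem holding BACKUP_PATH's directory is not mounted
# directly from ARRAY_DEVICE. Layers such as LVM or LUKS on top of the array
# are NOT detected by this simple comparison.
backup_path_is_off_array() {
    dir=$(dirname "$1")
    src=$(df -P "$dir" 2>/dev/null | awk 'NR == 2 { print $1 }')
    [ -n "$src" ] && [ "$src" != "$2" ]
}

# Usage:
#   backup_path_is_off_array /tmp/md0-backup.bin /dev/md0 \
#       || echo "backup file would land on the array, pick another path"
```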
The conversion is a full reshape. Same deal as growing — it takes time, don’t reboot, don’t pull the backup file until it’s complete. You can check progress with /proc/mdstat as usual.
After completion, verify:
    mdadm --detail /dev/md0 | grep -E "Raid Level|State :|Active Devices"
         Raid Level : raid6
              State : clean
     Active Devices : 5
Periodic Scrubbing
RAID parity can drift. Bit rot happens. A write that was interrupted mid-stripe can leave your data and parity inconsistent without any visible error. This is called a “mismatch,” and it’s exactly the kind of silent problem that makes you discover your data was already corrupt when you needed it most.
The fix is scheduled scrubs:
    # Trigger a check on md0
    echo check > /sys/block/md0/md/sync_action
    # Watch it run
    watch -n 5 cat /proc/mdstat
    md0 : active raid6 ...
          [==================>..]  check = 93.2% ...
When it finishes, check for mismatches:
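    cat /sys/block/md0/md/mismatch_cnt
Zero is what you want. Non-zero means the check found places where data and parity (or mirror copies) disagree. A check only counts them; it does not fix anything. Writing repair to the same sync_action file tells md to rewrite the inconsistent parity. You should log the count and investigate if it is large or growing.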
Monthly scrubs are the standard cadence. The scrub-cron.sh script in the example repo iterates every md device on the system, triggers a check, waits for completion, logs the mismatch count, and sends an alert email if anything non-zero shows up. Drop it in /etc/cron.monthly/ and forget about it — until it emails you.
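For a sense of what such a script involves, here is a stripped-down sketch. It is not the scrub-cron.sh from the repo: the function names and the SYSFS_BLOCK override are invented for illustration, and the wait-for-completion step is left out (a real script would poll each sync_action file until it reads idle before reading the counts).

```shell
# SYSFS_BLOCK can be overridden for dry runs against a copied directory tree.
SYSFS_BLOCK=${SYSFS_BLOCK:-/sys/block}

# start_checks: ask every md array to begin a consistency check.
start_checks() {
    for action in "$SYSFS_BLOCK"/md*/md/sync_action; do
        [ -w "$action" ] && echo check > "$action"
    done
    return 0
}

# report_mismatches: print "<array> <mismatch_cnt>" for each array and
# return non-zero if any count is above zero, so the caller can alert.
report_mismatches() {
    status=0
    for cnt in "$SYSFS_BLOCK"/md*/md/mismatch_cnt; do
        [ -r "$cnt" ] || continue
        name=${cnt#"$SYSFS_BLOCK"/}
        name=${name%%/*}
        value=$(cat "$cnt")
        echo "$name $value"
        [ "$value" -gt 0 ] && status=1
    done
    return $status
}

# Usage (as root, after the checks have finished):
#   report_mismatches || echo "md scrub found mismatches" >&2
```

Splitting the trigger from the report keeps each half short, and pointing SYSFS_BLOCK at a scratch directory lets you try the logic without touching a real array.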
mdadm Monitor Mode
Here’s the thing that most guides skip: mdadm has a daemon mode that watches your arrays and sends email alerts when something goes wrong. It’s already installed. You just never turned it on.
Events that trigger alerts:
- Fail: a drive has been marked failed
- FailSpare: a spare drive failed during rebuild (particularly bad)
- DegradedArray: array is running without full redundancy
- RebuildStarted / RebuildFinished: rebuild milestones
- SparesMissing: array has fewer spares than configured
First, configure your email address in /etc/mdadm/mdadm.conf:
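    MAILADDR your-email@example.com
    PROGRAM /usr/share/mdadm/mdadm-cmdline-runner
MAILADDR is the line that matters for email. PROGRAM is optional: it names an executable that mdadm runs on every event, so only keep that line if the path actually exists on your system. Then enable monitoring. On modern Debian/Ubuntu systems, mdmonitor.service should already be present: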
    systemctl enable --now mdmonitor.service
    systemctl status mdmonitor.service
If it’s not present, the mdadm-monitor.service file in the example repo gives you a minimal unit that calls mdadm --monitor --scan --syslog. Drop it in /etc/systemd/system/, run systemctl daemon-reload, and enable it.
Test that it’s working:
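    mdadm --monitor --scan --test -1
The -1 flag (short for --oneshot) checks each array once and exits, and --test generates a TestMessage alert for every array it finds. If the test email lands in your inbox, monitoring and mail delivery are both working.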
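If you do point PROGRAM at your own script, mdadm runs it with the event name, the array device, and sometimes a component device as arguments. The sketch below shows one way such a handler could sort events. Both function names are hypothetical, and the severity grouping is an editorial assumption, not something mdadm defines.

```shell
# md_event_severity EVENT
# Maps an mdadm monitor event name to a severity label. The grouping is an
# assumption made for this example; mdadm itself assigns no severities.
md_event_severity() {
    case $1 in
        Fail|FailSpare|DegradedArray)               echo critical ;;
        SparesMissing)                              echo warning ;;
        RebuildStarted|RebuildFinished|TestMessage) echo info ;;
        *)                                          echo unknown ;;
    esac
}

# handle_md_event EVENT ARRAY [COMPONENT]
# Formats one log line from the arguments mdadm passes to a PROGRAM script.
handle_md_event() {
    echo "$(md_event_severity "$1"): $1 on $2${3:+ (device $3)}"
}

# Usage, as the last line of a script named in the PROGRAM setting:
#   handle_md_event "$@" | logger -t mdadm-event
```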
Recovering Arrays After a Host Reinstall
You rebuilt your server. Fresh OS. Your drives are still there with all their data, but now Linux doesn’t know about the array. This happens more often than people admit.
mdadm can usually figure it out on its own:
    # Scan for arrays and assemble them
    mdadm --assemble --scan
If that works, great. If it doesn’t (maybe because drives have slightly different superblock states), you can try:
    mdadm --assemble /dev/md0 /dev/sdb /dev/sdc /dev/sdd
Specify the device name and all member drives explicitly. mdadm will read the superblocks and reconstruct the array metadata.
Once it’s running, regenerate your mdadm.conf so this survives reboots:
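    mdadm --detail --scan >> /etc/mdadm/mdadm.conf
That path is the Debian/Ubuntu location; on Fedora/RHEL the file is /etc/mdadm.conf. Note that >> appends, so check the file for stale or duplicate ARRAY lines afterwards. Don’t skip the initramfs step. On Debian/Ubuntu, the initramfs needs to know about your arrays to assemble them during boot:
    update-initramfs -u
On Fedora/RHEL:
    dracut --force
Skip this and you’ll boot to a degraded or missing array after the next restart.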
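Because the scan output is appended, running it twice leaves duplicate ARRAY lines in the config. One way around that is to filter the scan output by UUID first. This is only a sketch: new_array_lines is an invented helper, and it assumes each ARRAY line carries a UUID= field, as mdadm --detail --scan output normally does.

```shell
# new_array_lines CONF_FILE
# Reads `mdadm --detail --scan` output on stdin and prints only the ARRAY
# lines whose UUID does not already appear in CONF_FILE.
new_array_lines() {
    while IFS= read -r line; do
        uuid=$(printf '%s\n' "$line" | sed -n 's/.*UUID=\([^ ]*\).*/\1/p')
        [ -n "$uuid" ] || continue
        grep -qF "UUID=$uuid" "$1" 2>/dev/null || printf '%s\n' "$line"
    done
}

# Usage (Debian/Ubuntu path shown):
#   mdadm --detail --scan | new_array_lines /etc/mdadm/mdadm.conf > /tmp/new-arrays
#   cat /tmp/new-arrays >> /etc/mdadm/mdadm.conf
```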
Common Emergencies
Dirty arrays after unclean shutdown: If power died mid-write, mdadm marks the array dirty on next assembly. It’ll run a resync automatically to verify consistency. Let it finish. Don’t interrupt it.
--assemble --force: This is a last resort for when mdadm refuses to assemble because drives have different superblock event counts (often after an unclean shutdown left one drive behind). It tells mdadm to assemble using whatever drives are available, ignoring event count mismatches.
Use it carefully. If you force-assemble with a genuinely outdated drive, you may overwrite good data with old data. Only use it if you understand which drives are current.
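One way to see which drives are current is to compare the event counts that mdadm --examine reports for each member. The helper below is a sketch under a couple of assumptions: examine_events is a made-up name, and the parsing expects the v1.x metadata output style, where each device section starts with a "/dev/...:" line and contains an "Events : N" line.

```shell
# examine_events: read concatenated `mdadm --examine` output on stdin and
# print "<device> <event count>" per member, highest count first.
# Assumption: v1.x metadata output with "/dev/xxx:" headers and "Events : N".
examine_events() {
    awk '
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        $1 == "Events" && $2 == ":" { print dev, $3 }
    ' | sort -k2,2nr
}

# Usage (as root):
#   mdadm --examine /dev/sdb /dev/sdc /dev/sdd | examine_events
```

Members at the top of the list hold the most recent state. A drive that trails the others by a large margin is the one to be suspicious of before forcing anything.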
    mdadm --assemble --force /dev/md0 /dev/sdb /dev/sdc /dev/sdd
Superblock corruption: If a drive’s superblock is corrupt, you’re into --zero-superblock territory and potentially manual recovery. That’s out of scope here, and honestly, that’s where “I have a backup” becomes the most important sentence you’ll say all week. RAID is not backup. You know this.
Real Talk
mdadm has been around for 25 years because it’s genuinely good. It’s boring in the best way — predictable, scriptable, transparent. Every operation has a corresponding /proc/mdstat or --detail output you can check. You can write cron jobs for it. You can build monitoring around it with standard Unix tools.
The thing that bites people isn’t mdadm itself — it’s the gap between “array created” and “operations configured.” The monitoring not turned on, the scrubs not scheduled, the rebuild math not considered when buying consumer drives. Fill those gaps now, while nothing is on fire.
Go set up the monitor service. Add the scrub cron. Check /proc/mdstat right now just to see what state your arrays are actually in.
Your 2 AM self will thank you.
Related reading:
- RAID 0/1/5 Explained — the foundation
- RAID 6 vs RAID 10 — picking the right level
- RAID Rebuild Math — why RAID 5 gets risky on large drives
- RAID 50/60 Nested — when one RAID level isn’t enough
- SMART Monitoring with smartmontools — know before a drive dies