mdadm Day-2: Grow, Replace, Scrub

By SumGuy 10 min read

The Article You’ll Wish You Read Before the Drive Failed

Every mdadm guide on the internet covers mdadm --create. You pick your drives, choose a RAID level, fire the command, and feel like you’ve done something. You have! You’ve also set a timer. Six months from now, probably at 2 AM, one of those drives is going to start acting up and you’re going to discover that creating the array was the easy part.

This is the operations guide. The part that covers what happens after the honeymoon — drive replacements, growing the array when you inevitably run out of space, converting RAID levels when you realize you bought drives large enough that RAID 5’s rebuild math makes you nervous, and the monitoring you forgot to enable on day one.

If you’re new to RAID concepts altogether, start with RAID 0/1/5 explained and RAID 6 vs RAID 10 first. If you’re deep in nested RAID territory, RAID 50/60 has you covered. This article assumes you already have a running array and something has gone sideways.

Full example: Clone the working files at github.com/KingPin/sumguy-examples/linux/mdadm-day-2-operations


Replacing a Failing Drive

The ideal scenario is that SMART tells you a drive is dying before it actually dies. If you’re running smartmontools, you’ll see reallocated sector counts climbing, pending sectors accumulating, or uncorrectable errors ticking up. The drive is still functional — it’s just whispering “replace me soon” in the universal language of slowly degrading storage.

This is the moment to act, not when the array goes degraded.
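If you'd rather script that SMART check than eyeball it, here is a minimal sketch. It assumes the classic ATA attribute table that smartctl -A prints (attribute ID in column 1, raw value in column 10); NVMe and SAS drives report differently. The function name smart_warnings is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: flag worrying SMART attributes from `smartctl -A` output on stdin.
# Assumption: ATA-style table, column 1 = attribute ID, column 2 = name,
# column 10 = raw value.
#   5   = Reallocated_Sector_Ct
#   197 = Current_Pending_Sector
#   198 = Offline_Uncorrectable
smart_warnings() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 {
             if ($10 + 0 > 0) print $2 "=" $10
         }'
}

# Typical use (needs root and smartmontools):
#   smartctl -A /dev/sdb | smart_warnings
```

No output means all three raw counters are zero. Any line printed is a reason to start planning the swap.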

Step 1: Mark the drive as failed (even if it’s still technically spinning):

Terminal window
# Identify which drive is the problem
cat /proc/mdstat
mdadm --detail /dev/md0
# Mark it failed so mdadm stops using it
mdadm /dev/md0 --fail /dev/sdb

Step 2: Remove it from the array:

Terminal window
mdadm /dev/md0 --remove /dev/sdb

At this point your array is degraded but still running. Check it:

Terminal window
cat /proc/mdstat
md0 : active raid5 sdc[1] sdd[2]
7813955584 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
bitmap: 0/2 pages [0KB], 65536KB chunk

That [_UU] tells you the first slot, the one sdb held, is now empty. Don’t reboot the box unnecessarily while it’s in this state.
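Spotting an underscore in the status flags is easy to automate. The sketch below reads /proc/mdstat-formatted text on stdin and prints the name of every array with a missing member. The function name degraded_arrays is hypothetical, and the parsing assumes the usual layout where the status line follows the "mdN :" line.

```shell
#!/usr/bin/env bash
# Print the name of every md array whose status flags contain "_".
# Reads /proc/mdstat-formatted text on stdin.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /\[[U_]+\]/ && name != "" {
             # Pull out the [UU_] style flags and look for a missing slot.
             match($0, /\[[U_]+\]/)
             if (index(substr($0, RSTART, RLENGTH), "_"))
                 print name
             name = ""
         }'
}

# Typical use:
#   degraded_arrays < /proc/mdstat
```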

Step 3: Physical swap. If your backplane supports hot-swap, pull the drive and slot the new one. If not, power down, swap, power back up. The array will reassemble automatically — the new drive just won’t be in it yet.

Step 4: Partition table. If your array is built on partitions (e.g., /dev/sdb1) rather than whole disks, copy the partition table from a healthy drive to the new one:

Terminal window
# Copy partition layout from sdc to the new sde
sfdisk -d /dev/sdc | sfdisk /dev/sde

Step 5: Add the new drive:

Terminal window
mdadm /dev/md0 --add /dev/sde # whole disk
# or
mdadm /dev/md0 --add /dev/sde1 # partition

Step 6: Watch it rebuild:

Terminal window
watch -n 2 cat /proc/mdstat
md0 : active raid5 sde[3] sdc[1] sdd[2]
7813955584 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
[======>..............] recovery = 31.5% (1231872000/3906977792) finish=714.5min speed=62400K/sec

Leave it alone. Don’t run intensive I/O workloads during a rebuild if you can help it — you’re one URE away from a very bad day (see rebuild math) and adding load doesn’t help your odds.

Re-add vs add: If a drive was temporarily removed (power glitch, cable problem) and its superblock is still intact, use --re-add instead of --add. When the array has a write-intent bitmap (the bitmap: line in /proc/mdstat), mdadm recognizes the superblock and resyncs only the regions that changed while the drive was away instead of doing a full rebuild. Much faster.

Terminal window
mdadm /dev/md0 --re-add /dev/sdb

Growing Arrays

You built a 3-drive RAID 5 eighteen months ago. Your media collection has grown, you’ve got a 4th identical drive, and now you want to expand. Good news: mdadm can reshape a live array.

Terminal window
# Add the new drive first
mdadm /dev/md0 --add /dev/sde
# Then grow the array to use it
mdadm /dev/md0 --grow --raid-devices=4

What happens next is a reshape. mdadm will redistribute the stripe across four drives instead of three. This can take a very long time — think hours to days on large arrays. Check progress the usual way:

Terminal window
cat /proc/mdstat
md0 : active raid5 sde[4] sdd[2] sdc[1] sdb[0]
7813955584 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
[===>.................] reshape = 17.2% (671788032/3906977792) finish=1243.7min speed=43327K/sec

1243 minutes. That’s nearly 21 hours. Have a coffee. Avoid rebooting the machine mid-reshape: mdadm checkpoints its progress and can normally resume after an interruption, but you really don’t want to test that recovery path.
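The finish estimate is plain arithmetic you can redo yourself: remaining blocks divided by the current speed. A small sketch, using the figures from the reshape output above (the function name eta_minutes is made up here):

```shell
#!/usr/bin/env bash
# Remaining minutes for a resync or reshape: (total - done) / speed / 60.
# Arguments use the units /proc/mdstat reports: 1K blocks and K/sec.
eta_minutes() {
    local done_blocks=$1 total_blocks=$2 speed=$3
    echo $(( (total_blocks - done_blocks) / speed / 60 ))
}

# Figures from the reshape progress line: 671788032/3906977792 at 43327K/sec
eta_minutes 671788032 3906977792 43327   # prints 1244
```

Integer math lands on 1244, within a minute of what mdstat reported. The kernel's number moves around because the speed does.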

Once the reshape completes, expand the filesystem — the array is bigger but your filesystem doesn’t know that yet:

Terminal window
# ext4
resize2fs /dev/md0
# xfs (must be mounted)
xfs_growfs /mnt/data
# btrfs
btrfs filesystem resize max /mnt/data
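If you manage more than one box, picking the right grow command by filesystem type is worth wrapping up. This is only a sketch: grow_command is a hypothetical helper that prints the command rather than running it, so you can read it before committing.

```shell
#!/usr/bin/env bash
# Print (do not run) the grow command for a given filesystem type.
#   $1 = filesystem type, $2 = md device, $3 = mount point
grow_command() {
    local fstype=$1 device=$2 mountpoint=$3
    case "$fstype" in
        ext2|ext3|ext4) echo "resize2fs $device" ;;
        xfs)            echo "xfs_growfs $mountpoint" ;;
        btrfs)          echo "btrfs filesystem resize max $mountpoint" ;;
        *)              echo "unsupported filesystem: $fstype" >&2; return 1 ;;
    esac
}

# Typical use, letting findmnt report the type of the mounted filesystem:
#   grow_command "$(findmnt -no FSTYPE /mnt/data)" /dev/md0 /mnt/data
```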

Converting RAID 5 → RAID 6

If you’re running RAID 5 with 4TB+ drives and you’ve read the rebuild math article, you might be having second thoughts. RAID 6 gives you two-drive fault tolerance, which is the right call when your drives are large enough that a rebuild could take 12+ hours and hit UREs.

You can convert in-place. You’ll need an extra drive and a backup file on a safe, separate filesystem (not on the RAID array itself):

Terminal window
# Add the new drive (now have N+1 drives for RAID 6)
mdadm /dev/md0 --add /dev/sdf
# Convert from RAID 5 (4 drives) to RAID 6 (5 drives)
mdadm /dev/md0 --grow --level=6 --raid-devices=5 \
--backup-file=/root/md0-backup.bin # must survive a reboot and live off md0

The --backup-file is not optional. mdadm uses it to store the critical stripe data during the level conversion — if the process is interrupted without it, you may lose data. Store it somewhere reliable: a USB drive, a different machine over NFS, anywhere that isn’t the array being converted.

The conversion is a full reshape. Same deal as growing — it takes time, don’t reboot, don’t pull the backup file until it’s complete. You can check progress with /proc/mdstat as usual.

After completion, verify:

Terminal window
mdadm --detail /dev/md0 | grep -E "RAID Level|State|Active Devices"
RAID Level : raid6
State : clean
Active Devices : 5
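One thing the conversion does not do is add space. A rough capacity check makes that obvious: RAID 5 spends one drive on parity, RAID 6 spends two. The sketch below uses the per-drive size from the mdstat samples (3906977792 blocks); usable_blocks is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Usable capacity = (drives - parity drives) * per-drive size.
#   $1 = RAID level (5 or 6), $2 = drive count, $3 = blocks per drive
usable_blocks() {
    local level=$1 drives=$2 per_drive=$3 parity
    case "$level" in
        5) parity=1 ;;
        6) parity=2 ;;
        *) echo "only RAID 5 and 6 are handled here" >&2; return 1 ;;
    esac
    echo $(( (drives - parity) * per_drive ))
}

usable_blocks 5 4 3906977792   # 4-drive RAID 5: prints 11720933376
usable_blocks 6 5 3906977792   # 5-drive RAID 6: prints 11720933376
```

Same usable space before and after. The fifth drive buys you a second parity stripe, not more room.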

Periodic Scrubbing

RAID parity can drift. Bit rot happens. A write that was interrupted mid-stripe can leave your data and parity inconsistent without any visible error. This is called a “mismatch,” and it’s exactly the kind of silent problem that makes you discover your data was already corrupt when you needed it most.

The fix is scheduled scrubs:

Terminal window
# Trigger a check on md0
echo check > /sys/block/md0/md/sync_action
# Watch it run
watch -n 5 cat /proc/mdstat
md0 : active raid6 ...
[==================>..] check = 93.2% ...

When it finishes, check for mismatches:

Terminal window
cat /sys/block/md0/md/mismatch_cnt

Zero is what you want. Non-zero means the check found stripes where data and parity (or mirror copies) disagree. A check only counts them; to rewrite the parity, echo repair into the same sync_action file. You should log this and investigate if the count is large or growing.

Monthly scrubs are the standard cadence. The scrub-cron.sh script in the example repo iterates every md device on the system, triggers a check, waits for completion, logs the mismatch count, and sends an alert email if anything non-zero shows up. Drop it in /etc/cron.monthly/ and forget about it — until it emails you.
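For a feel of what such a script involves, here is a minimal sketch. It is not the repo's scrub-cron.sh, just an illustration of the same steps. SYSFS_ROOT is a hypothetical override added so the loop can be tried against a fake directory tree, and both function names are made up.

```shell
#!/usr/bin/env bash
# Sketch of a scrub helper. SYSFS_ROOT defaults to the real sysfs location.
SYSFS_ROOT=${SYSFS_ROOT:-/sys/block}

# Start a check on one array (e.g. "md0") and block until it is idle again.
scrub_one() {
    local md="$SYSFS_ROOT/$1/md"
    echo check > "$md/sync_action"
    while [ "$(cat "$md/sync_action")" != "idle" ]; do
        sleep 60
    done
}

# Print "<array> <mismatch_cnt>" for every md array found.
report_mismatches() {
    local f name
    for f in "$SYSFS_ROOT"/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue          # glob matched nothing
        name=${f%/md/mismatch_cnt}       # strip the trailing path
        name=${name##*/}                 # keep only the array name
        echo "$name $(cat "$f")"
    done
}

# Typical use (as root):
#   scrub_one md0 && report_mismatches
```

Anything other than a 0 in the second column is your cue to send that alert email.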


mdadm Monitor Mode

Here’s the thing that most guides skip: mdadm has a daemon mode that watches your arrays and sends email alerts when something goes wrong. It’s already installed. You just never turned it on.

Events that trigger email alerts: Fail, FailSpare, DegradedArray, SparesMissing, and TestMessage.

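First, configure your email address in /etc/mdadm/mdadm.conf. The PROGRAM line is optional: it names an executable that mdadm runs for every event, so include it only if that path exists on your system: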

MAILADDR your-email@example.com
PROGRAM /usr/share/mdadm/mdadm-cmdline-runner

Then enable monitoring. On modern Debian/Ubuntu systems, mdmonitor.service should already be present:

Terminal window
systemctl enable --now mdmonitor.service
systemctl status mdmonitor.service

If it’s not present, the mdadm-monitor.service file in the example repo gives you a minimal unit that calls mdadm --monitor --scan --syslog. Drop it in /etc/systemd/system/, run systemctl daemon-reload, and enable it.

Test that it’s working:

Terminal window
mdadm --monitor --scan --test -1

This does a one-shot run that generates a TestMessage alert for every array it finds. If the test email lands in the MAILADDR inbox, monitoring is working.
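If you want more than email, the PROGRAM setting can point at your own script. mdadm calls it with the event name, the md device, and, when relevant, the component device. Below is a hypothetical handler that turns those arguments into one syslog line. The function name and the md-event tag are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical PROGRAM handler for mdadm --monitor.
# Arguments from mdadm: $1 = event name, $2 = md device,
# $3 = component device (only present for some events).
format_event() {
    local event=$1 array=$2 component=${3:-}
    if [ -n "$component" ]; then
        echo "mdadm event=$event array=$array device=$component"
    else
        echo "mdadm event=$event array=$array"
    fi
}

# When mdadm invokes the script with arguments, forward the line to syslog.
if [ "$#" -ge 2 ]; then
    format_event "$@" | logger -t md-event
fi
```

From there it is a short hop to piping the same line at a chat webhook or a pager instead of syslog.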


Recovering Arrays After a Host Reinstall

You rebuilt your server. Fresh OS. Your drives are still there with all their data, but now Linux doesn’t know about the array. This happens more often than people admit.

mdadm can usually figure it out on its own:

Terminal window
# Scan for arrays and assemble them
mdadm --assemble --scan

If that works, great. If it doesn’t — maybe because drives have slightly different superblock states — you can try:

Terminal window
mdadm --assemble /dev/md0 /dev/sdb /dev/sdc /dev/sdd

Specify the device name and all member drives explicitly. mdadm will read the superblocks and assemble the array from the metadata stored on the members.

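Once it’s running, regenerate your mdadm.conf so this survives reboots. The path below is the Debian/Ubuntu one (Fedora/RHEL use /etc/mdadm.conf), and because >> appends, delete any stale ARRAY lines from the file first: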

Terminal window
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
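Appending is fine on a fresh install, but run it twice and you get duplicate ARRAY lines. A slightly safer sketch swaps out the old ARRAY lines instead of stacking new ones on top. merge_mdadm_conf is a made-up helper name, and it writes to stdout so nothing is changed in place.

```shell
#!/usr/bin/env bash
# Replace the ARRAY lines of a config with freshly scanned ones.
#   $1 = existing mdadm.conf
#   $2 = file holding the output of `mdadm --detail --scan`
# The merged config goes to stdout; neither input file is modified.
merge_mdadm_conf() {
    grep -v '^ARRAY ' "$1"   # keep MAILADDR, PROGRAM, comments, etc.
    cat "$2"                 # then add the current ARRAY definitions
}

# Typical use (as root), reviewing the result before moving it into place:
#   mdadm --detail --scan > /root/md-scan.txt
#   merge_mdadm_conf /etc/mdadm/mdadm.conf /root/md-scan.txt > /root/mdadm.conf.new
```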

Don’t skip the initramfs step. On Debian/Ubuntu, the initramfs needs to know about your arrays to assemble them during boot:

Terminal window
update-initramfs -u

On Fedora/RHEL:

Terminal window
dracut --force

Skip this and you’ll boot to a degraded or missing array after the next restart.


Common Emergencies

Dirty arrays after unclean shutdown: If power died mid-write, mdadm marks the array dirty on next assembly. It’ll run a resync automatically to verify consistency. Let it finish. Don’t interrupt it.

--assemble --force: This is a last resort for when mdadm refuses to assemble because drives have different superblock event counts (often after an unclean shutdown left one drive behind). It tells mdadm to assemble using whatever drives are available, ignoring event count mismatches.

Use it carefully. If you force-assemble with a genuinely outdated drive, you may overwrite good data with old data. Only use it if you understand which drives are current.

Terminal window
mdadm --assemble --force /dev/md0 /dev/sdb /dev/sdc /dev/sdd
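Before reaching for --force, it helps to see which drive actually fell behind. mdadm --examine prints an Events counter for each member, and the member with the lowest number is the stale one. The sketch below pulls that counter out of the examine output. The name event_count is hypothetical, and the parsing assumes the "Events : N" line that 1.x metadata prints.

```shell
#!/usr/bin/env bash
# Extract the event counter from `mdadm --examine` output on stdin.
# Assumes a line of the form "         Events : 4242".
event_count() {
    awk -F' : ' '/^ *Events :/ { print $2; exit }'
}

# Typical use (as root): compare the counters across all members.
#   for d in /dev/sdb /dev/sdc /dev/sdd; do
#       echo "$d $(mdadm --examine "$d" | event_count)"
#   done
```

If one drive is only a handful of events behind, a forced assembly is usually what you want. If it is thousands behind, leave it out and add it back as a fresh member afterwards.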

Superblock corruption: If a drive’s superblock is corrupt, you’re into --zero-superblock territory and potentially manual recovery. That’s out of scope here — and honestly, that’s where “I have a backup” becomes the most important sentence you’ll say all week. RAID is not backup. You know this.


Real Talk

mdadm has been around for 25 years because it’s genuinely good. It’s boring in the best way — predictable, scriptable, transparent. Every operation has a corresponding /proc/mdstat or --detail output you can check. You can write cron jobs for it. You can build monitoring around it with standard Unix tools.

The thing that bites people isn’t mdadm itself — it’s the gap between “array created” and “operations configured.” The monitoring not turned on, the scrubs not scheduled, the rebuild math not considered when buying consumer drives. Fill those gaps now, while nothing is on fire.

Go set up the monitor service. Add the scrub cron. Check /proc/mdstat right now just to see what state your arrays are actually in.

Your 2 AM self will thank you.

