How to replace a failed hard drive in a software RAID 1 array

Is it possible to replace a faulty drive in a RAID 1 array? What are the steps?

Here I explain the detailed steps for replacing a bad drive in a software RAID 1 array. RAID 1 means mirroring: every partition on one drive has an identical copy on the other. In this example there are two hard drives, /dev/sda and /dev/sdd, with partitions /dev/sda1, /dev/sda2, /dev/sda3, /dev/sda5, /dev/sda6, /dev/sda7 and /dev/sda8 as well as /dev/sdd1, /dev/sdd2, /dev/sdd3, /dev/sdd5, /dev/sdd6, /dev/sdd7 and /dev/sdd8.

This is how the RAID arrays are built:

/dev/sda1 and /dev/sdd1 make up the /dev/md0 RAID 1 array
/dev/sda2 and /dev/sdd2 make up the /dev/md3 RAID 1 array
/dev/sda3 and /dev/sdd3 make up the /dev/md5 RAID 1 array
/dev/sda5 and /dev/sdd5 make up the /dev/md4 RAID 1 array
/dev/sda6 and /dev/sdd6 make up the /dev/md2 RAID 1 array
/dev/sda7 and /dev/sdd7 make up the /dev/md1 RAID 1 array
/dev/sda8 and /dev/sdd8 make up the /dev/md6 RAID 1 array

This can be identified from the following command:

# cat /proc/mdstat

Here the failing disk is /dev/sdd, and we need to replace it. The output of cat /proc/mdstat also shows which arrays are degraded. Here’s an example:

# cat /proc/mdstat
Personalities : [raid1] 
md0 : active raid1 sdd1[1] sda1[0]
      305088 blocks [2/2] [UU]
md3 : active raid1 sdd2[2](F) sda2[0]
      57673280 blocks [2/1] [U_]
md4 : active raid1 sdd5[1] sda5[0]
      26217984 blocks [2/2] [UU]
md2 : active raid1 sdd6[1] sda6[0]
      8385792 blocks [2/2] [UU]
md1 : active raid1 sdd7[1] sda7[0]
      4192832 blocks [2/2] [UU]
md6 : active raid1 sdd8[2](F) sda8[0]
      1830518272 blocks [2/1] [U_]
md5 : active raid1 sdd3[1] sda3[0]
      26217984 blocks [2/2] [UU]
unused devices: <none>


If you see ‘_’ (underscore) in place of a ‘U’, that array is degraded: one of its members is missing or has failed. Whatever the position of the ‘_’, the ‘(F)’ beside sdd2 and sdd8 is what identifies the failed members, so we can confirm that /dev/sdd is the failing drive. You can also run smartctl against /dev/sdd to confirm it; check for ATA errors in the smartctl output.
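If you’d rather not read the output by eye, here’s a minimal sketch that picks out the degraded arrays. The function name degraded_arrays is my own and not part of mdadm; it only assumes the /proc/mdstat layout shown in the example above.

```shell
# List arrays that are degraded or have a member flagged (F).
# Reads /proc/mdstat-formatted text on stdin.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md3 : active raid1 sdd2[2](F) sda2[0]"
        /^md[0-9]+ :/ { name = $1; failed = ($0 ~ /\(F\)/) }
        # Status line, e.g. "57673280 blocks [2/1] [U_]"
        /blocks/ && name != "" {
            # The last field looks like [UU] or [U_]; "_" marks a missing member.
            if (failed || $NF ~ /_/) print name
            name = ""
        }
    '
}

# On a live system: degraded_arrays < /proc/mdstat
```

Fed the example output above, this should print md3 and md6.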

Here /dev/sdd2 and /dev/sdd8 have failed. Since the whole drive is being replaced, we need to mark its partitions as failed in the other arrays as well and then remove them all from the RAID arrays.

Marking the hard drive as failed and removing it

Here’s the command to mark the drive as failed:

# mdadm --manage /dev/md0 --fail /dev/sdd1

Similarly, do it for the other /dev/sdd partitions and their arrays as well.


After running it for the other RAID arrays, cat /proc/mdstat shows every /dev/sdd partition marked with (F).


Removing the drive

To remove the failed partitions from the RAID arrays, use the following command:

# mdadm --manage /dev/md0 --remove /dev/sdd1

Repeat it for the other partitions.
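Typing fourteen commands by hand is error-prone, so here’s a rough sketch that prints the fail and remove commands for every partition. The LAYOUT variable and the print_fail_remove function are hypothetical names of mine; the pairs in LAYOUT are copied from the array list earlier in this post. It only prints the commands and runs nothing.

```shell
# Partition-to-array layout taken from the list earlier in this post.
# Format: "<array> <partition number>", one pair per line.
LAYOUT='md0 1
md3 2
md5 3
md4 5
md2 6
md1 7
md6 8'

# Print (not run) the mdadm commands that fail and then remove every
# partition of the given disk, e.g. "sdd".
print_fail_remove() {
    disk=$1
    echo "$LAYOUT" | while read -r array part; do
        echo "mdadm --manage /dev/$array --fail /dev/${disk}${part}"
        echo "mdadm --manage /dev/$array --remove /dev/${disk}${part}"
    done
}

print_fail_remove sdd
```

Read through the printed commands first, and run them as root only once you’re sure the disk name and layout match your server.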


Once the bad drive is removed from the RAID arrays, each array lists only one hard drive. You can see this with cat /proc/mdstat.
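Before powering off, it’s worth double-checking that nothing from the bad drive is left in any array. Here’s a small sketch for that; disk_absent is a helper name I made up, and it assumes members appear in /proc/mdstat as names like sdd1[1].

```shell
# Succeeds (exit status 0) when no partition of the given disk appears as
# an array member in the /proc/mdstat text on stdin.
disk_absent() {
    # Members look like " sdd1[1]": a space, the disk name, digits, then "[".
    ! grep -Eq "(^| )${1}[0-9]+\["
}

# On a live system:
#   disk_absent sdd < /proc/mdstat && echo "sdd is out of all arrays"
```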


Now it’s time to power off the server and contact your data center (DC) for a drive replacement. To power off:

# shutdown -h now

Replace the defective /dev/sdd with a new one 🙂 The new drive must be at least as large as the old one; an identical size is the simplest choice. (That is, if the old drive is 1TB, the new one should be 1TB or larger.)

Once the defective drive is replaced, boot up the server. Now we need to create partitions on the new drive that exactly replicate those on the other drive, /dev/sda, since this is RAID 1. For that we can use the sfdisk command.

# sfdisk -d /dev/sda | sfdisk /dev/sdd

Here, the entire partition table of /dev/sda is copied over to the new drive, /dev/sdd.


Now you can execute the following command to check whether both hard drives have the same partitions:

# fdisk -l
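If you want the comparison done for you, here’s a rough sketch that compares two sfdisk -d dumps after masking the disk name. The helper names normalize_dump and same_layout are my own, and I’m assuming an MBR disk where each partition line of the dump starts with the device name (as in /dev/sda1 : start=...).

```shell
# Reduce an "sfdisk -d" dump (on stdin) to its partition lines, with the
# disk name ($1, e.g. "sda") replaced by a placeholder so that dumps from
# two different disks can be compared directly.
normalize_dump() {
    grep "^/dev/$1" | sed "s|^/dev/$1|DISK|"
}

# Succeeds when the two dump files describe the same partition layout.
# Usage: same_layout sda sda.dump sdd sdd.dump
same_layout() {
    [ "$(normalize_dump "$1" < "$2")" = "$(normalize_dump "$3" < "$4")" ]
}

# On a live system (as root):
#   sfdisk -d /dev/sda > sda.dump
#   sfdisk -d /dev/sdd > sdd.dump
#   same_layout sda sda.dump sdd sdd.dump && echo "partition layouts match"
```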

Add drives to the RAID array

Next, we need to add the new partitions to the RAID arrays. For that we use the following command:

# mdadm --manage /dev/md0 --add /dev/sdd1

Repeat it for other RAID arrays as well.
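Since the new drive now has the same partition numbers as /dev/sda, the add commands can be worked out from /proc/mdstat itself. Here’s a minimal sketch under that assumption; print_add_cmds is a hypothetical helper of mine, and it only prints the commands instead of running them.

```shell
# For every array in the /proc/mdstat text on stdin that has a member on
# the surviving disk ($1), print the mdadm command that adds the partition
# with the same number on the new disk ($2).
print_add_cmds() {
    awk -v good="$1" -v new="$2" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                # Members look like "sda1[0]"; keep only those on the good disk.
                if ($i ~ ("^" good "[0-9]+\\[")) {
                    part = $i
                    sub(/\[.*/, "", part)      # "sda1[0]" -> "sda1"
                    sub("^" good, "", part)    # "sda1"    -> "1"
                    print "mdadm --manage /dev/" $1 " --add /dev/" new part
                }
            }
        }
    '
}

# On a live system: print_add_cmds sda sdd < /proc/mdstat
```

Check the printed list against the array layout before running anything as root.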


Once you have finished adding the partitions to the RAID arrays, they’ll start synchronising automatically.
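While the arrays rebuild, /proc/mdstat shows a progress line under each array that is syncing. Here’s a small sketch that pulls out just the percentages; resync_progress is my own helper name, and the sample line in the comment is made up for illustration, not taken from this server.

```shell
# Print "<array> <percent>" for every array that is currently rebuilding.
# Reads /proc/mdstat-formatted text on stdin. A progress line looks like:
#   [=>...................]  recovery =  8.5% (4899712/57673280) finish=12.3min speed=71234K/sec
resync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /(recovery|resync) *=/ {
            # The percentage is the field that ends in "%".
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i
        }
    '
}

# On a live system: resync_progress < /proc/mdstat
```

If you only want to keep an eye on it, running watch cat /proc/mdstat does the job too.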


That’s it, now you’ve replaced /dev/sdd!

Any questions? Post a comment!

Heba Habeeb

Working as a Linux Server Admin, Infopark, Cochin, Kerala.
