Normal RAID situation

Check RAID status:

# mdadm --detail /dev/md0

Here /dev/md0 is the path to your RAID device. Example output:

/dev/md0:
           Version : 1.2
     Creation Time : Sun Dec 27 00:33:54 2020
        Raid Level : raid5
        Array Size : 9766912000 (9314.45 GiB 10001.32 GB)
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
      Raid Devices : 6
     Total Devices : 6
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sun Dec 11 17:19:25 2022
             State : active
    Active Devices : 6
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : kn0:0  (local to host kn0)
              UUID : 00000000:00000000:00000000:00000000
            Events : 22001356

    Number   Major   Minor   RaidDevice State
       0       8       48        0      active sync   /dev/sdd
       7       8       80        1      active sync   /dev/sdf
       8       8       16        2      active sync   /dev/sdb
       3       8        0        3      active sync   /dev/sda
       4       8       32        4      active sync   /dev/sdc
       6       8       64        5      active sync   /dev/sde

You can also check the RAID status via /proc/mdstat:

# cat /proc/mdstat

Example output:

Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdb[8] sdf[7] sde[6] sda[3] sdd[0] sdc[4]
      9766912000 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 11/15 pages [44KB], 65536KB chunk

unused devices: <none>
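
In the /proc/mdstat output, [UUUUUU] means all six members are up; a failed member would show as an underscore. For a quick health check instead of the full report, something like this works (a minimal sketch against the same /dev/md0):

# mdadm --detail /dev/md0 | grep -E 'State :|Failed Devices'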

Dealing with a faulty disk

Check RAID status:

# mdadm --detail /dev/md0

Example output:

/dev/md0:                                                     
           Version : 1.2                                      
     Creation Time : Sun Dec 27 00:33:54 2020
        Raid Level : raid5                       
        Array Size : 9766912000 (9314.45 GiB 10001.32 GB)
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
      Raid Devices : 6       
     Total Devices : 6                                   
       Persistence : Superblock is persistent           
                                                              
     Intent Bitmap : Internal                  
                                                              
       Update Time : Fri Dec  9 22:03:24 2022                 
             State : active, degraded                         
    Active Devices : 5                                        
   Working Devices : 5                       
    Failed Devices : 1                                   
     Spare Devices : 0                                  
                             
            Layout : left-symmetric            
        Chunk Size : 512K                               
                                                              
Consistency Policy : bitmap                                        
                                                              
              Name : kn0:0  (local to host kn0)               
              UUID : 00000000:00000000:00000000:00000000
            Events : 21976938                                 
                                                              
    Number   Major   Minor   RaidDevice State                                        
       0       8       48        0      active sync   /dev/sdd
       -       0        0        1      removed                                  
       2       8       16        2      active sync   /dev/sdb                                       
       3       8        0        3      active sync   /dev/sda
       4       8       32        4      active sync   /dev/sdc
       6       8       64        5      active sync   /dev/sde
                                                              
       7       8       80        -      faulty   /dev/sdf

The faulty disk can now be physically replaced. (You could also remove it from the array first, as sketched below, but simply swapping the disk and adding the new one worked here too.) Once the new disk shows up as a /dev/sd? device, add it to the array with mdadm, e.g.:

# mdadm --manage /dev/md0 --add /dev/sdf
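
If you prefer to remove the failed member from the array before swapping the hardware (the optional step mentioned above), that would look like this, assuming the faulty device is still /dev/sdf:

# mdadm --manage /dev/md0 --remove /dev/sdf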

After adding the new disk, the volume status shows as:

/dev/md0:
           Version : 1.2
     Creation Time : Sun Dec 27 00:33:54 2020
        Raid Level : raid5
        Array Size : 9766912000 (9314.45 GiB 10001.32 GB)
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
      Raid Devices : 6
     Total Devices : 6
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Dec  9 22:07:58 2022
             State : active, degraded, recovering
    Active Devices : 5
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 1

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

    Rebuild Status : 0% complete

              Name : kn0:0  (local to host kn0)
              UUID : 00000000:00000000:00000000:00000000
            Events : 21977168

    Number   Major   Minor   RaidDevice State
       0       8       48        0      active sync   /dev/sdd
       7       8       80        1      spare rebuilding   /dev/sdf
       2       8       16        2      active sync   /dev/sdb
       3       8        0        3      active sync   /dev/sda
       4       8       32        4      active sync   /dev/sdc
       6       8       64        5      active sync   /dev/sde

And the progress can be monitored with:

# cat /proc/mdstat

Showing something like:

Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdf[7] sde[6] sda[3] sdb[2] sdd[0] sdc[4]
      9766912000 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/5] [U_UUUU]
      [>....................]  recovery =  0.0% (1841840/1953382400) finish=547.4min speed=59414K/sec
      bitmap: 15/15 pages [60KB], 65536KB chunk

unused devices: <none>
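
Rather than re-running cat by hand, the rebuild can be followed continuously with watch, e.g. refreshing every 10 seconds:

# watch -n 10 cat /proc/mdstat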

Replacing a disk with a new disk

When you know which disk you want to replace, first mark the disk as faulty:

# mdadm --manage /dev/md0 --fail /dev/sdb

Returning:

mdadm: set /dev/sdb faulty in /dev/md0

This can be seen in the volume details by running:

# mdadm --detail /dev/md0                              

Returning:

/dev/md0:                                                     
           Version : 1.2                                                         
     Creation Time : Sun Dec 27 00:33:54 2020                                                           
        Raid Level : raid5                                    
        Array Size : 9766912000 (9314.45 GiB 10001.32 GB)     
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)      
      Raid Devices : 6                                        
     Total Devices : 6
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sun Dec 11 01:53:51 2022
             State : active, degraded
    Active Devices : 5
   Working Devices : 5
    Failed Devices : 1
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : kn0:0  (local to host kn0)
              UUID : 00000000:00000000:00000000:00000000
            Events : 21990345

    Number   Major   Minor   RaidDevice State
       0       8       48        0      active sync   /dev/sdd
       7       8       80        1      active sync   /dev/sdf
       -       0        0        2      removed
       3       8        0        3      active sync   /dev/sda
       4       8       32        4      active sync   /dev/sdc
       6       8       64        5      active sync   /dev/sde

       2       8       16        -      faulty   /dev/sdb

Then remove the disk from the volume:

# mdadm --manage /dev/md0 --remove /dev/sdb

Returning:

mdadm: hot removed /dev/sdb from /dev/md0

The removal can be seen in the volume details by running:

# mdadm --detail /dev/md0

Showing:

/dev/md0:
           Version : 1.2
     Creation Time : Sun Dec 27 00:33:54 2020
        Raid Level : raid5
        Array Size : 9766912000 (9314.45 GiB 10001.32 GB)
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
      Raid Devices : 6
     Total Devices : 5
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sun Dec 11 01:54:15 2022
             State : active, degraded
    Active Devices : 5
   Working Devices : 5
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : kn0:0  (local to host kn0)
              UUID : 00000000:00000000:00000000:00000000
            Events : 21990348

    Number   Major   Minor   RaidDevice State
       0       8       48        0      active sync   /dev/sdd
       7       8       80        1      active sync   /dev/sdf
       -       0        0        2      removed
       3       8        0        3      active sync   /dev/sda
       4       8       32        4      active sync   /dev/sdc
       6       8       64        5      active sync   /dev/sde
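
Optional precaution: if the replacement disk was previously part of another md array, it may still contain old RAID metadata. In that case you can wipe its superblock before adding it (only run this against the new, empty disk):

# mdadm --zero-superblock /dev/sdb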

After installing the new disk in the system and identifying its device name, add it to the volume with:

# mdadm --manage /dev/md0 --add /dev/sdb

Showing:

mdadm: added /dev/sdb

After adding the disk, you can see that the RAID is rebuilding by running:

# mdadm --detail /dev/md0

Showing:

/dev/md0:
           Version : 1.2
     Creation Time : Sun Dec 27 00:33:54 2020
        Raid Level : raid5
        Array Size : 9766912000 (9314.45 GiB 10001.32 GB)
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
      Raid Devices : 6
     Total Devices : 6
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sun Dec 11 01:55:44 2022
             State : active, degraded, recovering
    Active Devices : 5
   Working Devices : 6
    Failed Devices : 0
     Spare Devices : 1

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

    Rebuild Status : 0% complete

              Name : kn0:0  (local to host kn0)
              UUID : 00000000:00000000:00000000:00000000
            Events : 21990366

    Number   Major   Minor   RaidDevice State
       0       8       48        0      active sync   /dev/sdd
       7       8       80        1      active sync   /dev/sdf
       8       8       16        2      spare rebuilding   /dev/sdb
       3       8        0        3      active sync   /dev/sda
       4       8       32        4      active sync   /dev/sdc
       6       8       64        5      active sync   /dev/sde

Progress can be monitored by running:

# cat /proc/mdstat

Showing the progress as a percentage, the estimated time remaining, and the current speed:

Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdb[8] sdf[7] sde[6] sda[3] sdd[0] sdc[4]
      9766912000 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/5] [UU_UUU]
      [>....................]  recovery =  0.0% (1033984/1953382400) finish=597.9min speed=54420K/sec
      bitmap: 14/15 pages [56KB], 65536KB chunk

unused devices: <none>
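
If a rebuild is slower than expected, keep in mind that the kernel throttles resync/rebuild speed between a minimum and a maximum limit (in KB/s per device). A sketch of inspecting and raising the minimum, assuming you are willing to give the rebuild more I/O bandwidth:

# sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
# sysctl -w dev.raid.speed_limit_min=50000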

Bonus: figuring out which disk is which

A simple trick for when you don't know which physical disk corresponds to which device name, provided your machine has an activity light per disk: use dd to read a lot of data from the disk, which shows up as constant activity on that disk's light, e.g.:

# dd if=/dev/sde of=/dev/null

After stopping the command (Ctrl+C), you should see something like:

^C4512737+0 records in
4512736+0 records out
2310520832 bytes (2.3 GB, 2.2 GiB) copied, 15.246 s, 152 MB/s
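
If your machine has no per-disk activity lights, matching device names to physical drives by serial number is an alternative. A sketch using the persistent names udev creates and, assuming smartmontools is installed, smartctl:

# ls -l /dev/disk/by-id/ | grep sde
# smartctl -i /dev/sde

The serial number printed there can be compared with the label on the physical disk.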