Adding a Hard Drive Back into the RAID on a Web Gateway 5000 or 5500 Intel-Based Appliance


     

    Prerequisites

     

    The content below only applies to the 5000 and 5500 Web Gateway Intel platform appliances.

     

    Background Info

     

    The RAID Controller on Intel appliances is designed to prevent reinsertion of a hard drive which has previously failed.  If a hard drive is removed from the RAID array and then the same drive is reinserted (reseated), the drive will be marked 'Unconfigured (bad)' and the RAID rebuilding process will not start.  This behavior is by design to prevent reusing a bad hard disk drive. 

     

    A new replacement drive may also fail to be detected and integrated into the RAID array automatically. In that case its status shows "Unconfigured (good)".

     

    In both cases the overall RAID state shows "Degraded", because not all drives are active members of the RAID array.

     

    Purpose

     

    This article provides instructions for re-integrating a physical hard drive into an existing RAID array. This can be done either from the command line of your Web Gateway or from the RAID BIOS console. The BIOS console method requires taking the appliance offline.

     

    Checking RAID status from the command line

     

    1. Log into the Web Gateway appliance with an SSH client using the ‘root’ user account.

     

    2. Enter the following commands to see the physical drive status and the state of the RAID array: 

     

    Run the following command to see the state of the RAID array:

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -ldpdinfo -a0 -nolog | egrep "(State)"

     

    degraded.jpg

     

    Note that the RAID status shows "Degraded" because at least one drive is missing.
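
    For scripted health checks, the "State" lines from the command above can be evaluated automatically. The following is only a minimal sketch and not part of the appliance tooling: the function name raid_is_optimal is hypothetical, and it assumes that each logical drive is reported on a line of the form "State : <value>".

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the appliance.
# Reads CmdTool2 -ldpdinfo output on stdin and returns success only if at
# least one "State : <value>" line is present and every value is "Optimal".
# Assumption: the logical drive state is printed as "State : <value>".
#
# Example use on the appliance:
#   /opt/MegaRAID/CmdTool2/CmdTool2 -ldpdinfo -a0 -nolog | raid_is_optimal \
#       || echo "RAID is not Optimal"
raid_is_optimal() {
    awk -F':' '
        /^[ \t]*State[ \t]*:/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding blanks
            seen++
            if (value != "Optimal") bad++
        }
        END {
            if (seen > 0 && bad == 0) exit 0
            exit 1
        }
    '
}
```

    The function fails when no "State" line is found at all, so an empty or unexpected output is treated as a problem rather than as a healthy array.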

     

    Run the following command to see the status of the missing physical drive:

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 -nolog | egrep "(Slot|Firmware state|Encl)"

     

    The output should resemble the example below. Note that Slot Number 5 reports an 'Unconfigured(bad)' status, which indicates that this drive was likely removed and reinserted:

     

    unconfigured-bad.jpg

     

     

     

     

     

     

     

    Here are other potential status outputs:

     

    Unconfigured(bad): As described above, this status indicates a drive that was removed and then reinserted (reseated) and is therefore no longer part of the RAID array.

     

    Unconfigured(good): This status indicates that a new (replacement) drive was inserted but is not yet part of the RAID array.

     

    Failed: This status indicates a drive that has failed and is no longer usable.
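
    To spot these states quickly, the filtered -pdlist output can be condensed to one line per drive. This is a hedged sketch: the function name pd_states is hypothetical, and the field labels "Enclosure Device ID", "Slot Number" and "Firmware state" are assumed to be what the command prints on your appliance.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the appliance.
# Reads CmdTool2 -pdlist output on stdin and prints one line per drive in the
# form "<enclosure>:<slot><TAB><firmware state>".
# Assumption: the labels are "Enclosure Device ID", "Slot Number" and
# "Firmware state", each followed by ": <value>".
#
# Example use on the appliance (show only drives that are not online):
#   /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 -nolog | pd_states \
#       | grep -v 'Online'
pd_states() {
    awk -F':[ \t]*' '
        /^Enclosure Device ID/ { encl = $2 }
        /^Slot Number/         { slot = $2 }
        /^Firmware state/      { print encl ":" slot "\t" $2 }
    '
}
```

    The "<enclosure>:<slot>" prefix has the same X:Y shape that the -PhysDrv[X:Y] argument expects in the commands later in this article.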

     

     

     

     

     

     

     

    The remainder of this article focuses on adding drives with the status 'Unconfigured(bad)' or 'Unconfigured(good)' back into the RAID array. For drives in the Failed state, contact technical support for a replacement.

     

     

    Adding a drive back into the RAID array

     

     

    As mentioned above, a drive must be added back into the RAID array manually when its status is 'Unconfigured(bad)' or 'Unconfigured(good)'.  The steps below walk you through adding the drive back into the RAID array from the command line.

     

    This procedure can be performed without taking the appliance offline or rebooting. Keep in mind, however, that rebuilding a RAID can take some time, and performance may be reduced during the rebuild because of the high number of I/O operations.

     

    1. Log into the Web Gateway appliance with an SSH client using the ‘root’ user account.

     

    2. Show the state of the physical drives:

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 -nolog | egrep "(Slot|Firmware state|Encl)"

     

    Note the "Enclosure ID" and "Slot Number". These numbers are used in the commands going forward.

     

    Example Unconfigured (good):

    unconfigured-good-marked2.jpg

     

    Example Unconfigured (bad):

    unconfigured-bad.jpg

     

     

    3. Optional: this step is only needed if the status from step 2 shows "Unconfigured(bad)".

     

    Use the command below to mark the "Unconfigured(bad)" drive as "Unconfigured(good)" so that it can be processed further.

     

    Substitute the Enclosure ID for X and the Slot Number for Y (both from step 2):

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -PDMakeGood -PhysDrv[X:Y] -a0 -nolog


    Example:

    make-good2.jpg

     

     

     

    4. Find the drive that is missing from the logical RAID configuration with the following command:

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -PDGetMissing -a0 -nolog

     

    Note the "Array" and "Row" numbers. These numbers are used in the commands that follow.


    Example:

    get-missing2.jpg
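
    If the Array and Row values need to be picked up by a script, they can be extracted from the -PDGetMissing output. The sketch below is hedged: the function name missing_rows is hypothetical, and it assumes the output lists each missing drive as a table row whose first three columns are a running number, the Array number and the Row number. Verify this against the output on your appliance before relying on it.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the appliance.
# Reads CmdTool2 -PDGetMissing output on stdin and prints "array=<A> row=<B>"
# for every table row.
# Assumption: data rows start with three numeric columns (No., Array, Row);
# header and title lines do not.
#
# Example use on the appliance:
#   /opt/MegaRAID/CmdTool2/CmdTool2 -PDGetMissing -a0 -nolog | missing_rows
missing_rows() {
    awk '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            print "array=" $2 " row=" $3
        }
    '
}
```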

     

     

     

     

    5. Replace the missing RAID drive with the "Unconfigured(good)" drive.

     

    Substitute the Enclosure ID for X and the Slot Number for Y (both from step 2).

    Substitute the Array number for A and the Row number for B (both from step 4).

     

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -PdReplaceMissing -PhysDrv[X:Y] -arrayA -rowB -a0 -nolog


     

    Example:

    replace-missing2.jpg

     

     

    6. Start the rebuild process to sync the new drive with the existing RAID.

     

    Substitute the Enclosure ID for X and the Slot Number for Y (both from step 2):

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -PDRbld -Start -PhysDrv[X:Y] -a0 -nolog


    Example:

    rebuild2.jpg

     

     

    7. The rebuild process can take several hours, depending on the size of your RAID. To check the progress, run the command below from time to time.

     

    Substitute the Enclosure ID for X and the Slot Number for Y (both from step 2):

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -PDRbld -Start -ShowProg -PhysDrv[X:Y] -a0 -nolog

     

    Example:

    progress2.jpg
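
    Instead of re-running the command by hand, the progress can be polled in a loop. This is a minimal sketch under stated assumptions: the function names rebuild_percent and watch_rebuild are hypothetical, the progress line is assumed to contain "Completed <N>%", and the absence of such a line is taken to mean that no rebuild is running any more.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with the appliance.
CMDTOOL=/opt/MegaRAID/CmdTool2/CmdTool2

# Extract the percentage from a progress line on stdin.
# Assumption: the line contains the text "Completed <N>%".
rebuild_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)%.*/\1/p'
}

# Poll the rebuild progress every 5 minutes for the drive at enclosure $1,
# slot $2. Stops as soon as no "Completed <N>%" line is printed, which is
# assumed to mean the rebuild has finished or is not running.
watch_rebuild() {
    while :; do
        pct="$("$CMDTOOL" -PDRbld -ShowProg "-PhysDrv[$1:$2]" -a0 -nolog | rebuild_percent)"
        [ -n "$pct" ] || break
        echo "rebuild at ${pct}%"
        sleep 300
    done
}
```

    On the appliance this could be started with the values from step 2, for example: watch_rebuild 252 5. The 5-minute interval is an arbitrary choice; after the loop ends, confirm the result with the checks in steps 8 and 9.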

     

     

    8. Once the rebuild has finished, all drives should show up as "Online, Spun Up":

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 -nolog | egrep "(Slot|Firmware state|Encl)"


    Example:

    spun-up.jpg

     

     

    9. The RAID state should show "Optimal":

     

    /opt/MegaRAID/CmdTool2/CmdTool2 -ldpdinfo -a0 -nolog | egrep "(State)"

     

    Example:

    optimal.jpg
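
    As a recap, the state-changing commands from steps 3, 5 and 6 can be generated with the placeholders already filled in, which helps to avoid typing mistakes. The sketch below only prints the commands and does not run them; the function name print_readd_commands is hypothetical, and the values 252, 5, 0 and 5 in the usage comment are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the appliance.
CMDTOOL=/opt/MegaRAID/CmdTool2/CmdTool2

# Print (do not run) the commands from steps 3, 5 and 6 for one drive.
# Arguments: Enclosure ID (X), Slot Number (Y), Array (A), Row (B).
# The first printed command (step 3) is only needed when the drive reports
# "Unconfigured(bad)".
#
# Example: print_readd_commands 252 5 0 5
print_readd_commands() {
    if [ "$#" -ne 4 ]; then
        echo "usage: print_readd_commands X Y A B" >&2
        return 1
    fi
    # Reject anything that is not a plain non-negative integer.
    for n in "$@"; do
        case "$n" in
            ''|*[!0-9]*) echo "not a number: '$n'" >&2; return 1 ;;
        esac
    done
    echo "$CMDTOOL -PDMakeGood -PhysDrv[$1:$2] -a0 -nolog"
    echo "$CMDTOOL -PdReplaceMissing -PhysDrv[$1:$2] -array$3 -row$4 -a0 -nolog"
    echo "$CMDTOOL -PDRbld -Start -PhysDrv[$1:$2] -a0 -nolog"
}
```

    Printing rather than executing keeps the decision with the operator: each command can be reviewed against the output of steps 2 and 4 before it is pasted into the shell.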

     

     

     

     

    Adding a drive back into the RAID array via the RAID configuration utility

     

     

    1.   Reboot the system to reach the RAID menu.  On the screen below, press "C" to enter the configuration utility for the RAID controller.

    enter-bios.jpg

     

     

    2.  Once the configuration utility has loaded, select the RAID controller in question and press "START" to configure that RAID device.

    select-controller.jpg

     

     

    3.  Once the initial screen for the controller has loaded, you will see the following, which shows the RAID in a "Degraded" state with "PD# Missing".

    show-state.jpg

     

     

    4.  Go into the "Drives" part of the RAID menu.

    select-drives.jpg

     

     

    5.  Here you will see that the drive is in an "Unconfigured BAD" state.  Highlight the drive, select "Properties" and then "Go".

    unconfigured-drives.jpg

     

     

    6.  Since the drive is in an "Unconfigured BAD" state, it needs to be changed to "Unconfigured Good". Skip this step if your drive already shows "Unconfigured Good".

    mark-good.jpg

     

     

    7.  Next select "Replace Missing PD" and hit "Go" again. This will add the drive back into the RAID array.

    replace-missing.jpg

     

     

     

    8.  Select "Rebuild" and press "Go" to start the rebuild process of the RAID.

    rebuild.jpg

     

     

     

    9.  The rebuild can take quite a bit of time. Progress can be monitored on the screen.

    progress.jpg

     

     

     

    10. Once the rebuild is done, press the "Home" button and verify that the RAID status shows "Optimal".

     

    rebuild-done.jpg

     

     

     

    optimal.jpg