1 Reply Latest reply on Sep 27, 2013 5:11 AM by feickholt

    Attention! After Replacing a faulty HD this device might not be included in the raid! (WS5500)

    feickholt

      Hi!

      We would like to inform you about a hard drive problem we have run into several times during the last month.

       

      Since last October we have been running 14 WS5500 units for about 60,000 users.

       

      During the last month several HDs (more than 10) failed.

       

      We replaced the HDs after receiving new devices from MC.

      Since the HDs are hot-pluggable, we expected no problems when replacing a faulty HD.

       

      The new HDs come up, but they are not included in the RAID.

      There is no failure indicator to show this, so you might think everything works fine.

       

      Using the following command:

      /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 | egrep "(Slot|Firmware state)"

       

      you can verify the RAID Status:

       

      This is what we get:

      Slot Number: 0

      Firmware state: Online, Spun Up

      Slot Number: 1

      Firmware state: Online, Spun Up

      Slot Number: 2

      Firmware state: Online, Spun Up

      Slot Number: 3

      Firmware state: Online, Spun Up

      Slot Number: 4

      Firmware state: Online, Spun Up

      Slot Number: 5

      Firmware state: Unconfigured(good), Spun Up

       

      As you can see, the device in slot 5 was found but is not configured into the RAID!

       

      Since there is no other failure indicator (log messages, GUI), you might think everything works fine.

      If you have ever replaced an HD, please check your installation using this command.
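
      To avoid reading the list by eye on every box, the check can be filtered down to the drives that are not Online. This is only a minimal sketch: the function name list_not_online is made up for illustration, and it assumes the output uses the "Slot Number" and "Firmware state" labels shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CmdTool2): reads pdlist output on stdin
# and prints one line for every slot whose firmware state is not "Online...".
list_not_online() {
    awk -F': ' '
        /^[[:space:]]*Slot Number/    { slot = $2 }
        /^[[:space:]]*Firmware state/ {
            if ($2 !~ /^Online/) print "slot " slot ": " $2
        }
    '
}

# Possible use on the appliance (same path and options as in the post):
# /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 | list_not_online
```

      No output means every drive reports Online. Any printed line is a slot that deserves a closer look.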


      We opened an SR for this issue a long time ago.

      So far there is no solution to this problem.

       

      We would like to know whether any other customers out there have the same problem (HD failures and RAID problems).

       

      Thanks

       

      Frank

        • 1. Re: Attention! After Replacing a faulty HD this device might not be included in the raid! (WS5500)
          feickholt

          Here is a short solution to bring the device back into the RAID:

           

          # check the slot and the Enclosure Device ID

          /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 | egrep "(Slot|Firmware state|Encl)"

          Enclosure Device ID: 0

          Slot Number: 0

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 1

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 2

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 3

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 4

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 5

          Firmware state: Unconfigured(good), Spun Up


           

          # find out the missing device

          /opt/MegaRAID/CmdTool2/CmdTool2 -PDGetMissing -a0

              Adapter 0 - Missing Physical drives

              No.   Array   Row   Size Expected

              0     0       1     285148 MB

           

          # Bring device back in array

          # /opt/MegaRAID/CmdTool2/CmdTool2 -PdReplaceMissing -PhysDrv[<Enclosure Device ID>:<Slot Number>] -array<Array> -row<Row> -a<adapter no>

           

          /opt/MegaRAID/CmdTool2/CmdTool2 -PdReplaceMissing -PhysDrv[0:5] -array0 -row1 -a0

          Adapter: 0:Missing PD at Array 0, Row 1 is replaced.
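
          The Array and Row values for -PdReplaceMissing come straight from the -PDGetMissing table, so the command line can be assembled from it. The following is just a sketch under assumptions: print_replace_cmd is a made-up name, it expects the table layout shown above, it handles only the first missing drive, and it only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: reads `CmdTool2 -PDGetMissing -a0` output on stdin and
# prints (does NOT execute) the matching -PdReplaceMissing command.
# $1 = Enclosure Device ID, $2 = Slot Number of the new drive, $3 = adapter no.
print_replace_cmd() {
    awk -v enc="$1" -v slot="$2" -v adp="$3" '
        # A data row looks like "0  0  1  285148 MB": No., Array, Row, size.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            printf "/opt/MegaRAID/CmdTool2/CmdTool2 -PdReplaceMissing -PhysDrv[%s:%s] -array%s -row%s -a%s\n", enc, slot, $2, $3, adp
            exit   # only the first missing drive is handled in this sketch
        }
    '
}

# Possible use (review the printed line before running it yourself):
# /opt/MegaRAID/CmdTool2/CmdTool2 -PDGetMissing -a0 | print_replace_cmd 0 5 0
```

          Printing instead of executing is deliberate: you should compare the slot and the expected size with the drive you actually replaced before changing the array.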

           

          # now rebuild the array

          /opt/MegaRAID/CmdTool2/CmdTool2 -PDRbld -Start -PhysDrv[0:5] -a0

          Started rebuild progress on device(Encl-0 Slot-5)

           

          # check the rebuild status

          /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 | egrep "(Slot|Firmware state|Encl)"

          Enclosure Device ID: 0

          Slot Number: 0

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 1

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 2

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 3

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 4

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 5

          Firmware state: Rebuild

           

          # wait a few hours, then check again

          /opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 | egrep "(Slot|Firmware state|Encl)"

          Enclosure Device ID: 0

          Slot Number: 0

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 1

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 2

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 3

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 4

          Firmware state: Online, Spun Up

          Enclosure Device ID: 0

          Slot Number: 5

          Firmware state: Online, Spun Up
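
          Instead of re-running the list by hand, the state of the rebuilt slot can be polled. This is a rough sketch only: slot_state and wait_for_rebuild are made-up names, the 600 second default interval is arbitrary, and it relies on nothing but the -pdlist output shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads pdlist output on stdin and prints the
# firmware state of the slot given as $1.
slot_state() {
    awk -F': ' -v want="$1" '
        /^[[:space:]]*Slot Number/    { slot = $2 }
        /^[[:space:]]*Firmware state/ { if (slot == want) { print $2; exit } }
    '
}

# Hypothetical helper: polls until the slot is no longer in "Rebuild".
# $1 = slot number, $2 = seconds between polls (default 600, arbitrary).
wait_for_rebuild() {
    while :; do
        state=$(/opt/MegaRAID/CmdTool2/CmdTool2 -pdlist -a0 | slot_state "$1")
        echo "$(date '+%Y-%m-%d %H:%M:%S') slot $1: $state"
        case $state in
            Rebuild*) sleep "${2:-600}" ;;   # still rebuilding, wait and retry
            *)        break ;;               # any other state ends the loop
        esac
    done
}

# Possible use: wait_for_rebuild 5
```

          The loop stops on any state other than Rebuild, so check the last printed line: it should say "Online, Spun Up".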

           

           

          ------------------------------------

          REMARK:

          If the Enclosure Device ID is N/A, you can simply omit it:

          Example: -PhysDrv[:5]
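
          If you script the steps, the N/A case can be handled in one place. A small sketch, where physdrv_arg is a made-up helper name and not a CmdTool2 feature:

```shell
#!/bin/sh
# Hypothetical helper: builds the -PhysDrv argument.
# $1 = Enclosure Device ID as reported by pdlist (may be "N/A" or empty),
# $2 = Slot Number.
physdrv_arg() {
    case $1 in
        ''|N/A) printf '%s[:%s]\n'    '-PhysDrv' "$2" ;;       # enclosure omitted
        *)      printf '%s[%s:%s]\n'  '-PhysDrv' "$1" "$2" ;;  # enclosure:slot
    esac
}

# physdrv_arg N/A 5   prints -PhysDrv[:5]
# physdrv_arg 0 5     prints -PhysDrv[0:5]
```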


           

          Message edited by feickholt on 27.09.13 05:11:34 CDT