Raid 5 on Fedora 4 working with SATA ?

Terry Barnaby terry1 at beam.ltd.uk
Wed Feb 1 12:01:06 UTC 2006


Gilboa Davara wrote:
> On Wed, 2006-02-01 at 10:36 +0000, Terry Barnaby wrote:
> 
>>Gilboa Davara wrote:
>>
>>>On Wed, 2006-02-01 at 09:45 +0000, Terry Barnaby wrote:
>>>
>>>
>>>>Hi,
>>>>
>>>>I have just set up a Raid 5 disk array using 4 SATA disks on Fedora 4.
>>>>To test the setup I unplugged the SATA cable from one of the disk drives.
>>>>I was expecting the system to carry on with messages from the Raid system
>>>>indicating that there was a disk drive down.
>>>>
>>>>However the Raid 5 partition became completely inaccessible after unplugging
>>>>the drive. The kernel reported disk errors but there were no error messages
>>>>from the Raid system and "mdadm -Q --detail /dev/md2" reported that there
>>>>were no problems with the Raid array.
>>>>
>>>>When I rebooted the system (needed a reset) the Raid system reported that
>>>>one disk was down and the partition became readable again.
>>>>
>>>>It appears that the default configuration of the Raid 5 system does not
>>>>handle a complete drive failure during up-time. I presume it may respond to
>>>>disk errors from a disk drive that is connected but once disconnected the
>>>>Raid system appears to ignore errors.
>>>>
>>>>Is there a configuration option to allow the Raid system to respond to
>>>>a completely broken drive or cable ?
>>>>
>>>>Terry
>>>>
>>>
>>>
>>>A. Are you sure your machine/controller supports hot plug? SATA doesn't
>>>support it by default. (You'll need special drive enclosures and a
>>>hot-plug-supporting controller.)
>>>B. Can you post your complete machine configuration?
>>>
>>>Gilboa
>>>
>>
>>Hi,
>>
>>Thank you for the response.
>>
>>A. No, I don't think the SATA controller is a hot-plug-supporting controller.
>>It is a: "Intel Corporation 82801FB/FW (ICH6/ICH6W) SATA Controller (rev 04)".
>>B.
>>Motherboard:	AOPEN i915Ga-PLF
>>CPU:		Pentium 4 3GHz
>>Disks:		4 * SATA WD Caviar 320G
>>
>>Partitions:	Each disk has: 1 - 20G, 2 - 1G (swap), 3 - ~300G
>>Raid:		"/"      /dev/md0 Raid1 using /dev/sda1,/dev/sdb1
>>		"/spare" /dev/md1 Raid1 using /dev/sdc1,/dev/sdd1
>>		"/data"  /dev/md2 Raid5 using /dev/sda3,/dev/sdb3,/dev/sdc3,/dev/sdd3
>>
>>Although the SATA controller is not a "hot-plug" controller I assumed that
>>disconnecting a SATA disk to simulate a cable failure or complete drive failure
>>would cause the RAID system to react correctly. Certainly I see kernel
>>error messages from the disk/controller in question and I would have assumed that
>>the RAID system would react to this ...
>>
>>Terry
>>	
> 
> 
> By default software RAID1/5/6 supports on-line drive
> kill/remove/rebuild/etc.
> However, it seems that the MD driver is unaware of the dead drive.
> 
> What does /proc/mdstat say?
> 
> Gilboa
> 
> 

After removing the SATA cable on /dev/sdd, if I access a file there is a long delay
and then the program returns with no error but no data. For example:
"cat /data/test-file" will delay and then exit with status of "0" but no file
contents are displayed.

The kernel is 2.6.14-1.1656_FC4smp. I get the following kernel messages:

Feb  1 11:51:37 library kernel: ata2: command 0x35 timeout, stat 0x0 host_stat 0x61
Feb  1 11:51:38 library sshd(pam_unix)[13027]: session opened for user root by root(uid=0)
Feb  1 11:52:07 library kernel: ata2: command 0x25 timeout, stat 0x0 host_stat 0x61
Feb  1 11:53:07 library last message repeated 2 times
Feb  1 11:54:37 library last message repeated 3 times
Feb  1 11:55:01 library crond(pam_unix)[13091]: session opened for user root by (uid=0)
Feb  1 11:55:01 library crond(pam_unix)[13091]: session closed for user root
Feb  1 11:55:07 library kernel: ata2: command 0x25 timeout, stat 0x0 host_stat 0x61
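
To put a number on how often the port is timing out, the messages above can be
tallied with a short script. This is only a rough sketch: the function name
count_ata_timeouts is made up for illustration, and it assumes that a syslog
"last message repeated N times" line refers to the timeout logged just before
it.

```python
import re
from collections import Counter

# Log excerpt copied from the messages above.
LOG = """\
Feb  1 11:51:37 library kernel: ata2: command 0x35 timeout, stat 0x0 host_stat 0x61
Feb  1 11:51:38 library sshd(pam_unix)[13027]: session opened for user root by root(uid=0)
Feb  1 11:52:07 library kernel: ata2: command 0x25 timeout, stat 0x0 host_stat 0x61
Feb  1 11:53:07 library last message repeated 2 times
Feb  1 11:54:37 library last message repeated 3 times
Feb  1 11:55:01 library crond(pam_unix)[13091]: session opened for user root by (uid=0)
Feb  1 11:55:01 library crond(pam_unix)[13091]: session closed for user root
Feb  1 11:55:07 library kernel: ata2: command 0x25 timeout, stat 0x0 host_stat 0x61
"""

TIMEOUT_RE = re.compile(r"kernel: (ata\d+): command (0x[0-9a-fA-F]+) timeout")
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def count_ata_timeouts(log_text):
    """Count timeouts per (port, command), folding in syslog repeat lines."""
    counts = Counter()
    last_key = None  # (port, command) of the most recent timeout, if still current
    for line in log_text.splitlines():
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            last_key = (timeout.group(1), timeout.group(2))
            counts[last_key] += 1
            continue
        repeat = REPEAT_RE.search(line)
        if repeat:
            # Assumption: the repeat line refers to the timeout seen just before it.
            if last_key is not None:
                counts[last_key] += int(repeat.group(1))
            continue
        # Any other message breaks the chain of repeats.
        last_key = None
    return counts


if __name__ == "__main__":
    for (port, command), n in sorted(count_ata_timeouts(LOG).items()):
        print(port, command, n)
```

In the ATA command set 0x25 is READ DMA EXT and 0x35 is WRITE DMA EXT, so on
this reading both reads and writes to the unplugged disk are timing out rather
than failing outright.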

/proc/mdstat has:
Personalities : [raid1] [raid5]
md1 : active raid1 sdc1[0]
       20482752 blocks [2/1] [U_]

md2 : active raid5 sdd3[3] sdc3[2] sdb3[1] sda3[0]
       873196800 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

md0 : active raid1 sdb1[1] sda1[0]
       20482752 blocks [2/2] [UU]

unused devices: <none>
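
For what it is worth, a monitoring script would normally look for an "_" in the
status field to spot a degraded array. The following is a minimal sketch of
that check, run against the text above rather than the live /proc/mdstat. The
function name degraded_arrays is hypothetical and the regular expressions only
cover the layout shown here.

```python
import re

# /proc/mdstat contents copied from above.
MDSTAT = """\
Personalities : [raid1] [raid5]
md1 : active raid1 sdc1[0]
       20482752 blocks [2/1] [U_]

md2 : active raid5 sdd3[3] sdc3[2] sdb3[1] sda3[0]
       873196800 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

md0 : active raid1 sdb1[1] sda1[0]
       20482752 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array name: status string} for arrays with a missing member."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+) : ", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends with e.g. "[2/1] [U_]"; "_" marks a missing member.
        status = re.search(r"\[(\d+)/(\d+)\] \[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(3):
                result[current] = status.group(3)
            current = None
    return result


if __name__ == "__main__":
    print(degraded_arrays(MDSTAT))
```

On the text above this flags only md1. md2 still shows [4/4] [UUUU], so a check
of this kind would not notice the unplugged disk either, which is exactly the
problem.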

The output of "mdadm -Q --detail /dev/md2" is:
/dev/md2:
         Version : 00.90.02
   Creation Time : Tue Jan 31 14:14:07 2006
      Raid Level : raid5
      Array Size : 873196800 (832.75 GiB 894.15 GB)
     Device Size : 291065600 (277.58 GiB 298.05 GB)
    Raid Devices : 4
   Total Devices : 4
Preferred Minor : 2
     Persistence : Superblock is persistent

     Update Time : Wed Feb  1 11:51:07 2006
           State : active
  Active Devices : 4
Working Devices : 4
  Failed Devices : 0
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 64K

            UUID : 56bd5037:9d9b9018:eb8f01d6:94155776
          Events : 0.230

     Number   Major   Minor   RaidDevice State
        0       8        3        0      active sync   /dev/sda3
        1       8       19        1      active sync   /dev/sdb3
        2       8       35        2      active sync   /dev/sdc3
        3       8       51        3      active sync   /dev/sdd3
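
As a sanity check that the array itself was assembled as intended, the figures
above can be reproduced by hand. A small sketch, assuming the sizes mdadm
prints are in KiB, that a Raid 5 array holds (devices - 1) times the device
size, and that sd partitions use major 8 with 16 minors per disk. The helper
names are made up for illustration.

```python
def raid5_array_kib(device_kib, raid_devices):
    """Usable Raid 5 size: one device's worth of space goes to parity."""
    return device_kib * (raid_devices - 1)


def sd_minor(disk_letter, partition):
    """Minor number of /dev/sd<letter><partition> (valid for sda..sdp only)."""
    return 16 * (ord(disk_letter) - ord("a")) + partition


if __name__ == "__main__":
    # Device Size and Raid Devices as reported by mdadm above.
    array_kib = raid5_array_kib(291065600, 4)
    print(array_kib)

    # Convert KiB to binary and decimal gigabytes.
    array_bytes = array_kib * 1024
    print(f"{array_bytes / 2**30:.2f} GiB, {array_bytes / 10**9:.2f} GB")

    # Minor numbers of the four member partitions.
    print([sd_minor(letter, 3) for letter in "abcd"])
```

These come out as 873196800 blocks, 832.75 GiB / 894.15 GB and minors 3, 19,
35 and 51, matching the mdadm output, so the array layout looks right and only
the failure detection is wrong.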

Terry



