
[dm-devel] A step behind?



Ok guys, please answer this doubt of mine, or I'll waste all my remaining time.
While multipath-tools gives me random problems, I *suspect* it could be a dm multipath target kernel problem.
I'm doing some testing now using only the dmsetup tool, in order to check the multipath target at a low level.
To recap, my configuration:


HP DL380 server
QLogic 2312, kernel built-in driver, single path (qlport_down_retry=1)
HSG80 dual controller in multibus failover mode
fabric connected to both controllers

Slackware 10.0, kernel 2.6.10-rc2 + udm2
device-mapper 1.00.19

I configured a unit on the HSG80, and it appears to the system as an active path /dev/sdb and a ghost path /dev/sda.

So this is my table:
disk1: 0 71114623 multipath 1 queue_if_no_path 0 2 2 round-robin 0 1 1 8:0 1000 round-robin 0 1 1 8:16 1000
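
For anyone reading along, this is my understanding of how that table parses (field meanings taken from the dm-mpath target's table format, so correct me if I've misread it):

```
0 71114623 multipath         # start sector, length, target name
1 queue_if_no_path           # 1 feature argument: queue I/O while no path is usable
0                            # 0 hardware handler arguments
2 2                          # 2 priority groups, initial group = 2 (the 8:16 group, i.e. /dev/sdb)
round-robin 0 1 1 8:0 1000   # group 1: round-robin selector, 0 selector args, 1 path, 1 path arg, /dev/sda, repeat_count 1000
round-robin 0 1 1 8:16 1000  # group 2: same, but /dev/sdb
```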


I start some write operations on the disk (I have a task syncing every second in order to stress the disk).
When I manually fail the active path (for example, by restarting the controller that has the unit online), dmsetup status reports the "F" flag for every path.
I think this is normal, because the HSG80 is not fast enough at passing the unit over to the remaining controller.
So when the kernel tries the alternate path, it finds it down (and fails it too).
Because of queue_if_no_path, the output is queued and my process is not disrupted.
I can see the queue growing with dmsetup status disk1, but after some seconds the sync/writing process goes into D state. Is that normal, or is it simply a limit of the queuing?
I then issue dmsetup message disk1 0 reinstate_path 8:0 and 8:16, alternately and at random (yes, I think I can also reinstate a path that is still failed, and I expect the target to retry it and fail it again; if that is not correct, then I think no multipath tools will ever do better than my manual commands).
After some seconds I can see the queue shrinking; when it reaches 0, the sync/writing process wakes up and everything continues normally.
However, this is NOT always the behaviour: after some testing (random, but never more than 10/20 iterations) I get the process disruption.
I want to know whether I am right (or mad) in assuming that queue_if_no_path should never disrupt a process!
If it is so (and I really think it is), it is useless for me to spend nights stracing multipath-tools and reading a lot of sources!
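
For reference, the manual cycle I'm running is roughly this (a sketch, assuming the disk1 device from my table above; these commands obviously need root and the real hardware, and in my tests the failure itself is caused by restarting the controller, not by a dmsetup command):

```shell
dmsetup status disk1                         # shows the "F" flags and the count of queued I/Os
dmsetup message disk1 0 reinstate_path 8:0   # try to bring the ghost path back
dmsetup message disk1 0 reinstate_path 8:16  # ...and the active one
dmsetup status disk1                         # watch the queued I/O count drain back to 0
```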


In this scenario I'd also like a good interpretation of the kernel messages:

*) scsi error
*) lost page error
*) incorrect number of segments

and so on

Please help me!

Regards

Nicola Ranaldo

