
[linux-lvm] LVM hang with some uses of raw i/o

  I have been using and experimenting with LVM on linux with a product from my
employer.  I have had excellent results on a few different systems, but am
having difficulty with one now that is prompting me to post looking for help and
  At the moment, I am using the LVM 1.1rc2 code, having upgraded from the 1.0.4
code when it was failing. The reason I went to the 1.1rc2 code was this is an
SMP system and I noticed SMP related fixes mentioned in the changelog.
  The hardware setup is an IBM x232 server with two 1.3GHz CPUs.  No RAID
adapter. Four SCSI drives on the SCSI bus.  I took the last two drives, one 75G
and one 36G, put one large partition on each over all the space and set the
partition type to '8e'x. I then created two different volume groups, each
containing one of those two drives. (Yes, I know about the many other ways I
could define this, such as multiple drives in one group. I did it this way for
a reason.)  I then created a set of logical volumes 2.8G in size on each of the
two volume groups. 
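
As a rough sketch of the layout just described, the following prints the LVM
commands rather than running them. The partition names (/dev/sdc1, /dev/sdd1),
the volume group names (vg75, vg36), the logical volume names, the count of two
volumes per group and the 2800M rounding of "2.8G" are all assumptions for
illustration; none of them are taken from the actual system.

```shell
#!/bin/sh
# Dry-run sketch: prints the LVM commands instead of executing them.
# All device, volume group and logical volume names are hypothetical.

plan_vg() {
    vg=$1      # volume group name (assumed)
    part=$2    # type-8e partition backing the group (assumed)
    count=$3   # number of logical volumes to create (assumed)

    echo "pvcreate $part"
    echo "vgcreate $vg $part"

    i=1
    while [ "$i" -le "$count" ]; do
        # 2800M approximates the 2.8G volume size mentioned in the post.
        echo "lvcreate -L 2800M -n ${vg}lv$i $vg"
        i=$((i + 1))
    done
}

# One volume group per drive, as described above.
plan_vg vg75 /dev/sdc1 2
plan_vg vg36 /dev/sdd1 2
```

Printing the commands first makes it easy to review the plan before running
anything against real disks.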
  The Linux setup is a Redhat 2.4.17 kernel.  I have used this same kernel tree
on a few other systems with LVM and our product with no difficulty.  We need to
use raw i/o with our product as it can concurrently do tons of I/O to up to 255
slices (or logical volumes using LVM) and we use raw i/o on the other unix
  Using the Suse whitepaper's suggestion, I automated one of our boot time
startup scripts to create a set of /dev/raw devices with my preferred names and
issue the raw command to associate the raw devices to the logical volumes. Since
Redhat already has a /dev/raw/raw1 thru /dev/raw/raw128, I decided to create
mine starting with minor 129.       
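
A minimal sketch of what such a boot-time step might look like, again as a dry
run that only prints commands. The second device name (33903c1) and the logical
volume paths are hypothetical, and binding with "raw" by the custom node name is
an assumption based on the fact that querying by that name works on this system.
Major 162 is the raw character device major.

```shell
#!/bin/sh
# Dry-run sketch: prints mknod/raw commands rather than executing them.
RAW_MAJOR=162      # raw character device major
FIRST_MINOR=129    # first minor after the existing raw1..raw128 nodes

plan_raw() {
    name=$1    # preferred device name under /dev/raw
    minor=$2   # raw minor number to use for this node
    lv=$3      # logical volume to bind (hypothetical path)

    echo "mknod /dev/raw/$name c $RAW_MAJOR $minor"
    echo "chown myowner:mygroup /dev/raw/$name"
    echo "chmod 660 /dev/raw/$name"
    # Assumption: raw accepts the custom node name when binding.
    echo "raw /dev/raw/$name $lv"
}

minor=$FIRST_MINOR
# name:logical-volume pairs; 33903c1 and both LV paths are made up.
for pair in 33903c0:/dev/vg75/vg75lv1 33903c1:/dev/vg75/vg75lv2; do
    name=${pair%%:*}
    lv=${pair#*:}
    plan_raw "$name" "$minor" "$lv"
    minor=$((minor + 1))
done
```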

This all appears to work well as I end up with (one example)
crw-rw----    1 myowner   mygroup   162, 129 Jun 26 10:36 33903c0
raw -q /dev/raw/33903c0
/dev/raw/raw129:        bound to major 58, minor 16

and in /var/log/messages I see the timestamped messages like this:
/dev/raw/raw129:^Ibound to major 58, minor 16

I can use one of our utilities to prepare (format) the logical volume for use by
our product by specifying "/dev/raw/33903c0" with no difficulty, and similarly for
all of the other logical volume names via their respective /dev/raw
specification.  Our product also seems to run ok with all of the /dev/raw/xxx
devices, too.

Here comes the problem description.

Two of our other utilities, which back up (read from) or restore (write to)
data on the /dev/raw devices, cause Linux to hang (must reset or power off to
recover).  I have tried to do an strace of them, but it hangs the system before
any trace data has been written.  In an attempt to see if dd caused the same
problem, I used another system (laptop) with LVM and our product and strace'd
our utility to see what blocksize it was using for reads from the raw device.  I
then straced a dd bs=875520 if=/dev/raw/33903c0 of=/dev/null.  I ctrl-c killed
it after about 200 records.  I was just starting to look at that trace file
using vi when the system hung again!  This helps shift suspicion away from our
utilities (I think), though I also see that dd is not supposed to be used
against raw devices in the man pages.  The author of our utility is well aware
of the need to align the buffers, and the same code does work on other raw
devices on other LVM linux systems I have put together.
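
As a quick sanity check on the transfer size used in the dd test, the sketch
below verifies that 875520 bytes is a whole number of sectors. The 512-byte
sector size is an assumption; it says nothing about buffer address alignment,
which only the program doing the I/O can control.

```shell
#!/bin/sh
BS=875520      # block size from the dd command above
SECTOR=512     # assumed sector size in bytes

# Succeeds when the first argument is an exact multiple of the second.
is_multiple() {
    [ $(( $1 % $2 )) -eq 0 ]
}

if is_multiple "$BS" "$SECTOR"; then
    echo "$BS is $((BS / SECTOR)) sectors of $SECTOR bytes"
else
    echo "$BS is not a multiple of $SECTOR"
fi
# prints: 875520 is 1710 sectors of 512 bytes
```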

I need suggestions for debugging and/or other help to figure out what is going
on with this system. 
Gary Eheman
Fundamental Software, Inc.
