[linux-lvm] strange behavior with 1.0.5 on Linux 2.4.19?

Tue Oct 8 15:07:01 UTC 2002

On Fri, 2002-10-04 at 01:50, Heinz J . Mauelshagen wrote:
> 
> Gregory,
> 
> running "lvcreate --size 8G --snapshot --name db1_snap vg01" should give
> a syntax error rather than "... doesn't exist".
> 
> Did you eventually run
> "lvcreate --size 8G --snapshot --name db1_snap /dev/vg01/db1"
> instead?

According to the manpage and help for lvcreate, the syntax was correct.
The problem was that vg01 really didn't exist. :)

That, however, is not the problem.

> I guess the problem has disappeared after your reboot, right?
> If so, are you able to repeat the problem?

Kinda...

I'm hesitant to do any serious poking at it just yet, as the machine
having the problems is already in production and being heavily used.

Rebooting mostly clears up the problem.  After a reboot, I can use
lvdisplay and vgdisplay to my heart's content.  I can lvcreate and
lvremove lv's from the vg without problem.  vgscan and lvscan work as
expected.

It seems that the `vgchange -a n` in /etc/init.d/halt hangs, though.  On
other systems, this is not the case, even with the same version kernel &
LVM utilities (2.4.19 & 1.0.5, respectively).

Another point is that this system is configured with root (/) on LVM
(/dev/vg00/root).  On other systems we have configured this way, we
occasionally get an error message that vgchange was unable to deactivate
the volume group because there were open volumes.  On this machine, it
just hangs, requiring a power-cycle to do a reboot (why did they stop
putting reset switches in Intel servers???)
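
To see which volumes are keeping the group busy before `vgchange -a n`
runs, the "# open" count that lvdisplay prints can be filtered.  This is a
minimal sketch, assuming lvdisplay prints an "LV Name" line followed by a
"# open" line for each volume; open_lvs is a hypothetical helper name, not
part of the LVM tools:

```shell
# open_lvs: read lvdisplay output on stdin and print "<lv> <count>"
# for every logical volume whose "# open" count is greater than zero.
# Assumes the "LV Name ..." line precedes the "# open ..." line.
open_lvs() {
    awk '
        $1 == "LV" && $2 == "Name" { name = $3 }
        $1 == "#"  && $2 == "open" && $3 + 0 > 0 { print name, $3 }
    '
}

# Example use (not run here, needs the LVM tools and root):
#   lvdisplay /dev/vg00/root | open_lvs
```

With root on LVM, /dev/vg00/root will most likely show up as open for as
long as / is mounted, which would fit the "open volumes" message seen on
the other systems.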

I haven't tried creating a snapshot yet, mainly because I'm gun shy
right now, and I don't want to risk imposing brain damage on the system
in the middle of the week.

I'm also experiencing bizarre behavior with this system when doing any
operations involving rapid traversal of the filesystems, which again
does not happen on any of our other systems with identical software
loaded.  The best example is to run:

	find / > /dev/null

On any other system, this just forces the system to read every directory
on the filesystem.  Not very useful, but it doesn't do much more than
take 10% CPU on a P-III 800 for a little bit.  On this system (Quad
1.6GHz Xeon, HyperThreading disabled, 8GB RAM, 8GB Swap, 100GB RAID-10),
running that command will eventually choke up the machine, pushing find
and kswapd to >40% CPU (according to top), and occasionally bringing
kupdated into the mix.
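
One way to check whether a particular filesystem triggers this is to
traverse each one separately with find's -xdev option, which keeps find
from crossing mount points.  This is a rough sketch: scan_fs is a
hypothetical helper, and the mount points passed to it are whatever the
system actually has:

```shell
# scan_fs: traverse a single filesystem (no crossing of mount points)
# and print "<mount point> <elapsed seconds>".
scan_fs() {
    start=$(date +%s)
    # -xdev keeps find on the filesystem that "$1" lives on;
    # errors such as unreadable directories are ignored.
    find "$1" -xdev > /dev/null 2>&1 || true
    end=$(date +%s)
    echo "$1 $((end - start))"
}

# Example use (not run here):
#   scan_fs /
```

Running this once per mount point while watching top might show whether
the stall follows one filesystem or the machine as a whole.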

I'm currently trying to figure out if the problem is with LVM, if I need
to double the swap space to 16GB, or if I need to find a new driver for
the RAID card.  As it is, any intrusive testing on this system will have
to wait until I'm physically at the location with this server, which
will likely happen on or about Sat. Oct. 19, so I can have the machine
to myself on a weekend.

I'm starting to get desperate for a solution...

-- 
Gregory K. Ade <gkade at bigbrother.net>
http://bigbrother.net/~gkade
OpenPGP Key ID: EAF4844B  keyserver: pgpkeys.mit.edu