[linux-lvm] access through LVM causes D state lock up

Ray Morris support at bettercgi.com
Tue Dec 13 20:10:40 UTC 2011


On Tue, 13 Dec 2011 13:35:46 -0500
"Peter M. Petrakis" <peter.petrakis at canonical.com> wrote:


> Do you by any chance have active LVM snapshots? If so how many and
> how long have they been provisioned for?

I forgot to mention that. There are now three snapshots, one on each of
three LVs, that have been provisioned for a few hours. These LVs aren't 
in active use, but are backups, synced daily. So basically the only 
activity is rsync once daily, bandwidth limited to be fairly slow. One 
logical volume that locked up when trying to write to it had a snapshot.
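
For reference, a rough sketch of how snapshots per origin LV could be tallied. It assumes the origin column comes from `lvs --noheadings -o origin`; `count_snapshots` is a hypothetical helper name, not part of LVM.

```shell
# Count snapshots per origin LV from `lvs --noheadings -o origin` output
# read on stdin. Non-snapshot LVs print an empty origin field, so blank
# lines are skipped.
# count_snapshots is a hypothetical helper name, not an LVM command.
count_snapshots() {
    awk 'NF { n[$1]++ } END { for (o in n) print o, n[o] }' | sort
}
```

It would be fed with something like `lvs --noheadings -o origin newvg | count_snapshots`, printing one "origin count" pair per line.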

Prior to this most recent rebuild, there were a lot of snapshots - 
three on each of fifteen LVs. I replaced that VG with a fresh one 
and it seemed to work for a while. I thought the problem was likely 
related to lots of long-lived snapshots, but after I deleted all 
snapshots and completely rebuilt the VG, the problem recurred very 
quickly, before there were many snapshots and before there was a lot 
of IO to the snapshots.

I realize I'm somewhat abusing snapshots - they weren't designed to 
be long-lived. Therefore my "torture test" usage may reveal problems 
that wouldn't happen often with very short-lived snapshots. 

Another similar server has more snapshots on more LVs running the 
same rsyncs without obvious trouble.

I should also have mentioned that sequential writes to one LV at a 
time don't seem to trigger the problem. I copied the whole VG one LV 
at a time with:
dd if=/dev/oldvg/lv1 of=/dev/newvg/lv1
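
Repeating that command for every LV could be scripted roughly as below. This is only a sketch: `copy_lvs` is a hypothetical helper name and `bs=1M` is an assumption (the command above used dd's default block size).

```shell
# copy_lvs SRC_DIR DST_DIR: copy every device node (or file) in SRC_DIR to
# the same name in DST_DIR, strictly one at a time.
# copy_lvs and bs=1M are assumptions; the original command used dd defaults.
copy_lvs() {
    src_dir=$1
    dst_dir=$2
    for src in "$src_dir"/*; do
        [ -e "$src" ] || continue          # skip if the glob matched nothing
        name=$(basename "$src")
        dd if="$src" of="$dst_dir/$name" bs=1M || return 1
    done
}
```

Usage would be `copy_lvs /dev/oldvg /dev/newvg`. The destination LVs have to exist already and be at least as large as the sources, and the source directory also lists any snapshot LVs, so those would be copied too.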

Copying entire LVs sequentially caused no problems. Later, when I 
tried to rsync to the LVs, the problem showed itself.
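
When the lock up happens, the stuck processes can be picked out of `ps` output with a small filter along these lines. `list_d_state` is a hypothetical helper name, and the column layout assumes the `ps -eo pid,stat,wchan:32,comm` format of procps.

```shell
# list_d_state: read `ps -eo pid,stat,wchan:32,comm` output on stdin and
# keep the header line plus every process whose STAT field starts with D
# (uninterruptible sleep).
# list_d_state is a hypothetical helper name.
list_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}
```

Run as `ps -eo pid,stat,wchan:32,comm | list_d_state`; the WCHAN column gives a hint as to which kernel function each process is waiting in.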

> >>    filter = [ "a|^/dev/md.*|", "a|^/dev/sd.*|",
> >> "a|^/dev/etherd/.*|","r|^/dev/ram.*|", "r|block|", "r/.*/" ]
> >
> Is it intentional to include sd devices? Just because the MD uses
> them doesn't mean you have to make allowances for them here.


Some /dev/sdX devices were used in the past, but they no longer are, 
so I have now removed the sd.* and etherd entries from the filter.


> > <     locking_dir = "/var/lock/lvm"
> > ---
> >>     locking_dir = "/dev/shm"
> 
> Why?

This was changed AFTER the problem started, because the comment in 
the file says:

  # Local non-LV directory that holds file-based locks while commands
  # are in progress.  

Because /var/lock is on an LV, I tried switching it to a directory that 
will never be on an LV. That didn't seem to have any effect.
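
A quick way to see what a directory actually sits on is to ask `df` for its backing device. The sketch below assumes POSIX `df -P` output; `backing_device` is a hypothetical helper name.

```shell
# backing_device DIR: print the device (or pseudo filesystem name) that
# holds DIR, taken from the first column of the `df -P` data line.
# backing_device is a hypothetical helper name.
backing_device() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}
```

For example, `backing_device /var/lock` printing a device-mapper path would suggest the directory is on an LV, while `backing_device /dev/shm` would normally print something like tmpfs.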
-- 
Ray Morris
support at bettercgi.com

Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/

Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/

Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php
