Re: [linux-lvm] Missing error handling in lv_snapshot_remove

On Wed, Aug 07, 2013 at 11:13:53AM +0200, Zdenek Kabelac wrote:
> On 6.8.2013 19:37, Bastian Blank wrote:
> >I hold the cow device open so it will run into the error condition:
> >| $ sleep 100 < /dev/mapper/vg-test_snap-cow&
> You are breaking the lvm2 logic, thus pushing the code through an
> unexpected error code path - the user is never supposed to open
> so-called 'private' /dev/mapper/ devices.

I'm a developer and use it to trigger an error condition. Please don't
start with that crap about what a user should or should not do.

> >Then try to remove the LV:
> >| $ lvremove vg/test_snap
> With upstream lvm2 code there is an embedded 'retry' loop, so the
> removal should be retried a couple of times (controllable via lvm.conf).

Please show that the retry actually does anything in this case. This is
not a transient condition that goes away on its own, but a logic bug.
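
To illustrate why retrying cannot help here, a minimal sketch in plain
shell. It is a simulation, not lvm2 code: open_count, try_remove and
remove_with_retry are hypothetical names, and the open count is a plain
variable standing in for the holder of the -cow device.

```shell
# Hypothetical simulation, not lvm2 code.
# open_count stands in for the open count of the private -cow device;
# 1 means something (like the "sleep" above) still holds it open.
open_count=1

# Hypothetical stand-in for a single removal attempt:
# succeeds only when nobody holds the device open.
try_remove() {
    [ "$open_count" -eq 0 ]
}

# Retry the removal up to $1 times.
# Returns 0 on the first successful attempt, 1 if all attempts fail.
remove_with_retry() {
    attempts=$1
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if try_remove; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}
```

As long as open_count stays at 1, remove_with_retry fails no matter how
many attempts it gets. A retry only helps against a holder that goes
away between attempts.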

> That's because a udev WATCH rule might be fired basically anytime
> after the close of a device opened in write mode - so it may happen
> that lvm2 checks that the device is not open and could be removed, but
> the udev WATCH rule temporarily opens the device and lvm2 then fails
> to remove the device, which was previously detected as unused.

There is no udevd running! Please explain how udev can be a problem in
this case.

> There was a bug affecting cluster usage of exclusive snapshots in
> pre-.99 versions - the order of taking locks for devices was not
> correct, and if there was a clvmd restart during a snapshot, it
> caused some problems.

Did you actually read the code? At least I can clearly see that the
error logic is broken.

> But for the current (.99) code - in the normal case the operation
> should work properly. For any unpredictable errors, the lvm2 command
> should print an error message, and it's up to the admin to fix
> dangling device and table entries.

It is up to LVM to not break the system with suspended devices.
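
What I expect from the error path, as a minimal sketch in plain shell.
Again this is a simulation and not lvm2 code: suspend_dev, resume_all
and remove_snapshot are hypothetical names, and the "suspended" list
only models which devices are currently suspended.

```shell
# Hypothetical simulation, not lvm2 code.
# "suspended" is a space-separated list of currently suspended devices.
suspended=""

# Record a device as suspended.
suspend_dev() {
    suspended="$suspended $1"
}

# Resume every device recorded as suspended and clear the list.
resume_all() {
    for dev in $suspended; do
        echo "resuming $dev"
    done
    suspended=""
}

# $1 = "yes" makes the removal step fail, like a -cow device held open.
# Either way, nothing is left suspended when the function returns.
remove_snapshot() {
    fail_step=$1
    suspend_dev origin
    suspend_dev snap
    if [ "$fail_step" = yes ]; then
        resume_all      # undo the suspends before reporting the error
        return 1
    fi
    resume_all
    return 0
}
```

The point is only the shape of the failure branch: whatever was
suspended before the failing step is resumed before the error is
returned, so the admin gets an error message but not a hung device.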


Insufficient facts always invite danger.
		-- Spock, "Space Seed", stardate 3141.9

