[linux-lvm] DM / LVM hangs if snapshot present on kernel v3.0.3
Mike Snitzer
snitzer at redhat.com
Mon Feb 20 13:51:04 UTC 2012
On Sat, Feb 18 2012 at 9:27pm -0500,
Spelic <spelic at shiftmail.org> wrote:
> Hello lists,
>
> Do you have any information about a bug in Linux v3.0.3 where an LVM
> snapshot makes a mess at a (clean!) reboot?
>
> Symptoms are: message at boot:
> [ 15.668799] device-mapper: table: 252:3: snapshot: Snapshot cow pairing for exception table handover failed
> [ 15.668934] device-mapper: ioctl: error adding target to table
> [ 19.388627] device-mapper: table: 252:3: snapshot: Snapshot cow pairing for exception table handover failed
> [ 19.388786] device-mapper: ioctl: error adding target to table
>
>
> and then the volume origin and snapshot come out inactive
> lvVM_TP1_d1 vgVM owc-i- 500.00g
> ...
> tp1d1-snap1 vgVM swi-i- 600.00g lvVM_TP1_d1 100.00 (*)
> (other volumes not having snapshot are active and working)
>
> (*) please note the size occupied in the snapshot is WRONG, it
> should be 4.56% and not 100%.
>
> At this point I did:
>
> # lvchange --refresh vgVM/tp1d1-snap1
> Couldn't find snapshot origin uuid LVM-WUPTe8bqp25OSeRsFcLpC228A6U0r84T22tfFj4EkWbuB6pP5UDTA7nVRfGSCZW7-real.
> # lvs
> ... *everything hangs* ..!!
>
> It hangs in DM code (too bad I lost the stack trace, sorry).
> I think the ssh session hung in uninterruptible sleep; there was
> no kernel panic, and I could indeed log in again. However, the DM devices
> were hung badly, so AFAIR I had to force a reboot without syncing or
> it would not complete the shutdown process.
>
>
> At reboot the situation at lvs is unchanged, with the two LVM
> devices (origin and snapshot) still inactive.
>
> This time I try refresh on the *origin*:
>
> # lvchange --refresh vgVM/lvVM_TP1_d1
> (no output)
> #
>
> and magically everything starts working!
> I can do lvs, dmsetup table is all filled, etc.
> Size occupied in snapshot shown in lvs is back to correct value 4.56%
>
> Then I reboot (clean!) again to check that the problems are solved now...
> Surprise!! The problems are back. The two devices, origin and
> snapshot, are again inactive.
>
> This time I think I learned the lesson and I refresh *the origin* again
> (I am SURE I used the origin, I triple checked that, I gave
> *exactly* the same command as the previous time)
>
> # lvchange --refresh vgVM/lvVM_TP1_d1
>
> Surprise!! everything hangs!!
>
> Like before, no kernel panic, however ssh session hangs and DM is
> unresponsive so I had to force a reboot without sync or it would not
> complete.
>
>
> At reboot again devices are inactive.
>
> At this point I am really fed up with LVM snapshots and I fear for our
> data, so I remove the snapshot with lvremove (I don't remember if I
> had to do lvchange --refresh on the origin before lvremove or not)
>
> As soon as I removed the snapshot everything started working flawlessly.
>
>
> I am very worried about this bug...
> We need snapshots at work for performing live backups, but in
> this situation I don't know if I am risking more with snapshots or
> by not performing backups.
> Do you have any information on this bug, e.g. has this been fixed
> since 3.0.3?

I've never seen this.

Which distro are you using?

The "Snapshot cow pairing for exception table handover failed" is the
error path most commonly associated with the snapshot-merge feature.
Are you using snapshot-merge for the root LV (e.g. lvconvert --merge ...)?
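
One quick way to check is the Attr column of lvs: per lvs(8), the first
character is "O" for an origin with a merging snapshot and "S" for a merging
snapshot, and the fifth character is the device state. Below is a minimal
Python sketch of that decoding, not an LVM tool. The function name
decode_lv_attr and the two lookup tables are made up for illustration, and
only a subset of the attribute values from the man page is listed.

```python
# Decode the volume-type and state characters of an lvs "Attr" string.
# Meanings are taken from the lvs(8) man page; this table is a partial,
# illustrative subset and decode_lv_attr is a hypothetical helper name.

VOLUME_TYPE = {
    "o": "origin",
    "O": "origin with merging snapshot",
    "s": "snapshot",
    "S": "merging snapshot",
    "-": "plain volume",
}

STATE = {
    "a": "active",
    "s": "suspended",
    "I": "invalid snapshot",
    "S": "invalid suspended snapshot",
    "d": "mapped device present without tables",
    "i": "mapped device present with inactive table",
    "-": "no mapped device",
}


def decode_lv_attr(attr):
    """Return the volume type, state, and merge flag for an Attr string."""
    if len(attr) < 6:
        raise ValueError("expected at least 6 attribute characters: %r" % attr)
    vtype, state = attr[0], attr[4]
    return {
        "type": VOLUME_TYPE.get(vtype, "other (%s)" % vtype),
        "state": STATE.get(state, "other (%s)" % state),
        # Only the capital O/S volume types indicate a merge in progress.
        "merging": vtype in ("O", "S"),
    }


if __name__ == "__main__":
    # The two Attr values quoted in the report above.
    for attr in ("owc-i-", "swi-i-"):
        print(attr, decode_lv_attr(attr))
```

Fed the two values from your lvs output, this reports an ordinary origin and
snapshot (no merge flag), both with an inactive table loaded. If that holds,
it would suggest snapshot-merge is not in use, but please confirm.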

Mike