[linux-lvm] LVM snapshots overflow due to initial "overhead"?
J. Javier Maestro
jjmaestro at ieee.org
Sun Nov 30 16:06:48 UTC 2008
Hi there,
I was thinking about using snapshots to build a backup system when I came
across this:
http://www.tldp.org/HOWTO/LVM-HOWTO/snapshots_backup.html
Full snapshot are automatically disabled
If the snapshot logical volume becomes full it will be dropped (become
unusable) so it is vitally important to allocate enough space. The amount
of space necessary is dependent on the usage of the snapshot, so there is
no set recipe to follow for this.
**If the snapshot size equals the origin size, it will never overflow.**
(** emphasis mine)
When I read that, I set up a testbed and did the following:
root@eden:~# lvcreate --name testing --size 4M vm
  Logical volume "testing" created
root@eden:~# lvcreate --snapshot --name testing-snapshot --size 4M vm/testing
  Logical volume "testing-snapshot" created
root@eden:~# lvs --units b /dev/vm/testing*
  LV               VG   Attr   LSize    Origin  Snap%  Move Log Copy%
  testing          vm   owi-a- 4194304B
  testing-snapshot vm   swi-a- 4194304B testing   0.39
I thought it was very weird that, just by creating the snapshot, I would
already be using 0.39% of it :-? That would mean that I could only snapshot
99.61% of the actual LV! I thought, "Nonsense, the HOWTO clearly says that by
creating a snapshot the size of the LV, things cannot go wrong". So I tried:
root@eden:~# dd if=/dev/zero of=/dev/vm/testing
dd: writing to `/dev/vm/testing': No space left on device
8193+0 records in
8192+0 records out
4194304 bytes (4.2 MB) copied, 0.518928 s, 8.1 MB/s
root@eden:~# lvs --units b /dev/vm/testing*
  /dev/vm/testing-snapshot: read failed after 0 of 4096 at 4128768: Input/output error
  /dev/vm/testing-snapshot: read failed after 0 of 4096 at 0: Input/output error
  LV               VG   Attr   LSize    Origin  Snap%  Move Log Copy%
  testing          vm   owi-a- 4194304B
  testing-snapshot vm   Swi-I- 4194304B testing 100.00
So, it actually broke :-(
I wanted to see what the real usage was, because percentages are not really
useful when I want to see what is going on. So I tweaked the lvm tools,
adding just one line to the percent calculation:
--- 8< -----------------------------------------------------------------------
--- dev_manager.c 2007-06-22 13:19:15.000000000 +0200
+++ dev_manager.jjmaestro.c 2008-11-22 20:31:59.000000000 +0100
@@ -391,6 +391,7 @@
     else if (*percent < 0)
         *percent = 100;
+    log_verbose("LV raw extent usage: %" PRIu64 " of %" PRIu64 " used", total_numerator, total_denominator);
     log_debug("LV percent: %f", *percent);
     r = 1;
--- 8< -----------------------------------------------------------------------
Now, when doing an lvdisplay -v, I could see that right after creating a
snapshot, total_numerator was always 32, which in my case is 32*512 =
16384 bytes, or 2 chunks of 8KB. That is, 2 chunks of the snapshot were
always used just by creating it!
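
The arithmetic above can be checked with a small shell sketch. It assumes,
as the figures in this post suggest, that total_numerator and
total_denominator are counted in 512-byte sectors and that the snapshot chunk
size is 8KB. The function names are made up for illustration and are not part
of the lvm tools.

```shell
#!/bin/sh
# Assumptions: counters are in 512-byte sectors, chunk size is 8 KiB.
SECTOR_BYTES=512
CHUNK_BYTES=8192

# Convert a sector count to bytes.
sectors_to_bytes() {
    echo $(( $1 * SECTOR_BYTES ))
}

# Convert a sector count to whole snapshot chunks.
sectors_to_chunks() {
    echo $(( $1 * SECTOR_BYTES / CHUNK_BYTES ))
}

# Usage percentage, truncated to two decimals, using integer math only.
# $1 = sectors used, $2 = total sectors in the snapshot.
snap_percent() {
    hundredths=$(( $1 * 10000 / $2 ))
    printf '%d.%02d\n' $(( hundredths / 100 )) $(( hundredths % 100 ))
}

sectors_to_bytes 32      # 16384
sectors_to_chunks 32     # 2
snap_percent 32 8192     # 0.39 (a 4M snapshot is 8192 sectors)
```

So the 0.39 that lvs reported right after creation matches 32 sectors out of
the 8192 that make up the 4M snapshot.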
Thus, the HOWTO is **very** wrong and the only way to be sure that a snapshot
won't overflow is to make it bigger than the LV; in fact, I believe 2 chunks
bigger!
Adding exactly 2 chunks is impossible, though, since apparently in my case
LVs can only be allocated in 4M extents. This is not too bad, since the extra
extent becomes marginal as the LV gets bigger, but it is certainly annoying
to have to check what size the LV is and add 4M to be sure that nothing will
ever break...
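
As a rough sketch of that workaround, the size to request for the snapshot
can be computed by adding the observed overhead to the origin size and
rounding up to a whole extent. The 2-chunk overhead, the 8KB chunk size and
the 4M extent size are the values seen in this test, not general constants,
and the function name is hypothetical. Whether a snapshot needs more metadata
as it fills up is not established here, so treat the result as a lower bound.

```shell
#!/bin/sh
# Assumptions taken from the test in this post, not universal values:
CHUNK_BYTES=8192                      # snapshot chunk size (8 KiB)
OVERHEAD_CHUNKS=2                     # chunks in use right after creation
EXTENT_BYTES=$(( 4 * 1024 * 1024 ))   # allocation granularity (4 MiB)

# Print the smallest extent-aligned size, in bytes, that holds the
# whole origin plus the initial overhead.
# $1 = origin LV size in bytes.
min_snapshot_bytes() {
    needed=$(( $1 + OVERHEAD_CHUNKS * CHUNK_BYTES ))
    # Round up to a multiple of the extent size.
    extents=$(( (needed + EXTENT_BYTES - 1) / EXTENT_BYTES ))
    echo $(( extents * EXTENT_BYTES ))
}

min_snapshot_bytes 4194304            # 8388608
```

For the 4M test volume this gives 8388608 bytes, i.e. an 8M snapshot: one
extra extent on top of the origin size.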
In any case, why do snapshots use those 2 chunks (16K) upfront? (my guess is
that they hold a pointer translation table or something like that)
Could whatever info is stored there not be kept somewhere else?
Why doesn't lvcreate --snapshot, when the --size flag is missing, create a
snapshot "big enough" to avoid overflow? Is this an easy patch to work on?
Thanks in advance,
Cheers,
--
.''`.
: :' : J. Javier Maestro
`. `'` <jjmaestro at ieee.org>
`-