[libvirt] [PATCH] storage: use btrfs file clone ioctl when possible

Daniel P. Berrange berrange at redhat.com
Mon Sep 30 08:45:57 UTC 2013


On Mon, Sep 30, 2013 at 01:21:18AM +0300, Oskari Saarenmaa wrote:
> On Fri, Sep 27, 2013 at 03:19:06PM +0100, Daniel P. Berrange wrote:
> > On Fri, Sep 27, 2013 at 05:02:53PM +0300, Oskari Saarenmaa wrote:
> > > Btrfs provides a copy-on-write clone ioctl so let's try to use it instead
> > > of copying files block by block.  The ioctl is executed unconditionally if
> > > it's available and we fall back to block copying if it fails, similarly to
> > > cp --reflink=auto.
> > 
> > Currently the virStorageVolCreateXMLFrom method does a full allocation
> > of storage when cloning volumes. This means applications can rely on
> > the image having enough space when the clone completes and won't get ENOSPC
> > in the VM. AFAICT, this change to do copy-on-write changes the API to do
> > thin provisioning of the storage during clone, so any future write on
> > either the new or old volume may generate ENOSPC when btrfs finally copies
> > the sector. I don't think this is a good thing. I think applications
> > should have to explicitly request copy-on-write behaviour for the clone
> > so they know the implications.
> 
> That's a good point.  However, it looks like this change would only change
> the behavior for the old volumes; new volumes are always created sparsely
> and they may already get ENOSPC on write if they contained zero blocks. This
> should probably be fixed by calling fallocate instead of lseek when noticing
> empty blocks (safezero should probably be used instead, but it's currently
> rather unsafe if posix_fallocate isn't available.)
> 
> I was wondering if we could reuse the allocation and capacity fields to
> decide whether or not to try to do a cow-clone (or sparse allocation of the
> cloned bits)?  Currently a cloned volume's allocation is always set to at
> least the original volume's capacity and the original client-requested
> allocation value is not passed on to the code doing the cloning, but we
> could pass it on and allow copy-on-write clones if allocation is set to zero
> (no space is guaranteed to be available for writing) and also change sparse
> cloning to only happen if allocation is lower than capacity.
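
For reference, the fallocate idea quoted above could look roughly like the
sketch below. This is only an illustration under assumptions: the helper
names (is_all_zero, write_block_allocated) are made up for this mail and are
not existing libvirt functions, and the destination is assumed to be a
freshly created file so that the reserved range reads back as zeros.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the buffer holds only zero bytes. */
static int is_all_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}

/* Hypothetical helper: store one block at the current offset of fd.
 * A zero block is not skipped with a bare lseek (which would leave a
 * hole); instead the range is reserved with posix_fallocate and the
 * offset is then moved past it.  Returns 0 on success, -1 on error. */
static int write_block_allocated(int fd, const char *buf, size_t len)
{
    if (len == 0)
        return 0;

    if (is_all_zero(buf, len)) {
        off_t pos = lseek(fd, 0, SEEK_CUR);
        if (pos < 0)
            return -1;
        /* posix_fallocate returns an error number, it does not set errno */
        if (posix_fallocate(fd, pos, (off_t)len) != 0)
            return -1;
        return lseek(fd, pos + (off_t)len, SEEK_SET) < 0 ? -1 : 0;
    }

    while (len > 0) {
        ssize_t put = write(fd, buf, len);
        if (put < 0)
            return -1;
        buf += put;
        len -= (size_t)put;
    }
    return 0;
}
```

Whether reserving space this way is enough to rule out a later ENOSPC
depends on the filesystem, so treat it as a starting point only.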

I think just having a VIR_STORAGE_VOL_CLONE_COPY_ON_WRITE flag for the
API would suffice.
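
As a rough sketch of what an opt-in flag could mean for the copy path (not
the actual patch): the clone ioctl is tried only when the caller asked for
it, and a plain block copy is used otherwise. The names CLONE_COPY_ON_WRITE,
copy_blocks and clone_volume are placeholders for this mail, not existing
libvirt symbols, and falling back to a full copy when the ioctl fails is an
assumption borrowed from the cp --reflink=auto behaviour mentioned in the
original patch.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Same request number as BTRFS_IOC_CLONE in <linux/btrfs.h>; defined here
 * only so the sketch builds without the btrfs headers installed. */
#ifndef BTRFS_IOC_CLONE
# define BTRFS_IOC_CLONE _IOW(0x94, 9, int)
#endif

/* Placeholder for the proposed VIR_STORAGE_VOL_CLONE_COPY_ON_WRITE flag. */
#define CLONE_COPY_ON_WRITE (1u << 0)

/* Plain copy: every byte is written, so the destination is fully allocated. */
static int copy_blocks(int src_fd, int dst_fd)
{
    char buf[65536];
    ssize_t got;

    while ((got = read(src_fd, buf, sizeof(buf))) != 0) {
        if (got < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        char *p = buf;
        while (got > 0) {
            ssize_t put = write(dst_fd, p, (size_t)got);
            if (put < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            p += put;
            got -= put;
        }
    }
    return 0;
}

/* Returns 1 if the volume was cloned copy-on-write, 0 if it was copied
 * in full, -1 on error.  Without the flag the ioctl is never attempted,
 * so existing callers keep the full-allocation behaviour. */
static int clone_volume(int src_fd, int dst_fd, unsigned int flags)
{
    if ((flags & CLONE_COPY_ON_WRITE) &&
        ioctl(dst_fd, BTRFS_IOC_CLONE, src_fd) == 0)
        return 1;

    return copy_blocks(src_fd, dst_fd) < 0 ? -1 : 0;
}
```

An application that passes the flag accepts that later writes to either
volume may hit ENOSPC; one that does not pass it sees no change.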

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
