[libvirt-users] Source of Qcow2 Image Corruption

Andrew Martin amartin at xes-inc.com
Tue Aug 14 20:54:05 UTC 2012


Some additional information: 

Cloning the VM disk image with qemu-img convert results in an image that appears to be error free and can be mounted successfully: 
# qemu-img convert myvm.qcow2 -O qcow2 test.qcow2 
# qemu-img check test.qcow2 
No errors were found on the image. 
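
For repeatability, the convert-and-check sequence above could be wrapped in a small shell function. This is only a sketch: the function name clone_and_check and the QEMU_IMG override variable are illustrative and not part of the original commands. Note that a clean result only says the clone's qcow2 metadata is consistent, not that the guest data inside it is intact.

```shell
# Sketch: clone a qcow2 image and verify the clone.
# clone_and_check and QEMU_IMG are hypothetical names used for illustration.
clone_and_check() {
    # $1 = source image, $2 = destination image
    # Rewrite the source into a new qcow2 file with freshly built metadata.
    "${QEMU_IMG:-qemu-img}" convert -O qcow2 "$1" "$2" || return 1
    # Run the consistency check on the clone; its exit status is returned.
    "${QEMU_IMG:-qemu-img}" check "$2"
}
```

Usage would be along the lines of: clone_and_check myvm.qcow2 test.qcow2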


Also, the following error is logged in /var/log/libvirt/qemu/myvm.log when attempting to start the VM: 
# tail /var/log/libvirt/qemu/myvm.log 
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 8192 -smp 6 -name myvm -uuid 14a9dd6b-7a80-b286-8558-8c0c1f0324dc -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/myvm.monitor,server,nowait -monitor chardev:monitor -boot c -drive file=/mnt/storage/vmstore/disks/myvm.qcow2,if=virtio,index=0,boot=on,format=qcow2,cache=none -drive file=/dev/drbd1,if=virtio,index=1,format=raw -drive file=/dev/drbd2,if=virtio,index=2,format=raw -net nic,macaddr=00:16:3e:32:35:82,vlan=0,model=virtio,name=virtio.0 -net tap,fd=55,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus 
char device redirected to /dev/pts/0 
pci_add_option_rom: failed to find romfile "pxe-virtio.bin" 
qcow2_free_clusters failed: Invalid argument 


These systems are running the stock Ubuntu 10.04 version of qemu-common, qemu-kvm, and kvm (0.12.3+noroms-0ubuntu9.19). 


Thanks, 


Andrew
 
----- Original Message -----

From: "Andrew Martin" <amartin at xes-inc.com> 
To: libvirt-users at redhat.com 
Sent: Tuesday, August 14, 2012 2:23:45 PM 
Subject: [libvirt-users] Source of Qcow2 Image Corruption 

Hello, 

I have two KVM virtual machine nodes in a high-availability cluster using Pacemaker + Heartbeat on Ubuntu 10.04 Server amd64. This cluster hosts a single Ubuntu 10.04 VM which uses a qcow2 image file, myvm.qcow2, with a backing file, backingfile.qcow2. This morning, the VM suddenly powered off. I attempted to start it again with virsh start domain, but it would only start briefly and then power off again. I checked the qcow2 disk image and found a large number of corruption errors (only the first ten are shown below):

root@vmhost:/mnt/storage/vmstore/disks# qemu-img info myvm.qcow2
image: myvm.qcow2 
file format: qcow2 
virtual size: 9.8G (10485760000 bytes) 
disk size: 13G 
cluster_size: 65536 
backing file: backingfile.qcow2 (actual path: backingfile.qcow2) 
Snapshot list: 
ID TAG VM SIZE DATE VM CLOCK 
1.5G 2056-05-05 21:01:212795663:45:42.642 
/archive/1006/20100627000/2il_root/save/archive/1002/20100204005/1 743M 1995-08-16 12:47:352289751:06:20.183 
root@vmhost:/mnt/storage/vmstore/disks# qemu-img check myvm.qcow2 2>&1 | head
ERROR OFLAG_COPIED: offset=80000002047d0000 refcount=0 
ERROR OFLAG_COPIED: offset=8000000212e50000 refcount=0 
ERROR OFLAG_COPIED: offset=80000001ffde0000 refcount=0 
ERROR OFLAG_COPIED: offset=80000001ff710000 refcount=0 
ERROR OFLAG_COPIED: offset=8000000216ec0000 refcount=0 
ERROR OFLAG_COPIED: offset=8000000206db0000 refcount=0 
ERROR OFLAG_COPIED: offset=80000001ff720000 refcount=0 
ERROR OFLAG_COPIED: offset=80000001ffdf0000 refcount=0 
ERROR OFLAG_COPIED: offset=8000000212e60000 refcount=0 
ERROR OFLAG_COPIED: offset=8000000212e70000 refcount=0 
root@vmhost:/mnt/storage/vmstore/disks# qemu-img info backingfile.qcow2
image: backingfile.qcow2 
file format: qcow2 
virtual size: 9.8G (10485760000 bytes) 
disk size: 4.8G 
cluster_size: 65536 
root@vmhost:/mnt/storage/vmstore/disks# qemu-img check backingfile.qcow2
No errors were found on the image. 
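
Since the full check output is too long to read by eye, a small filter could total the errors by type. The following is a minimal sketch, assuming every error line starts with the word ERROR followed by a type label, as in the output above; summarize_check is a hypothetical helper name.

```shell
# Sketch: summarise `qemu-img check` output read from stdin.
# summarize_check is a hypothetical helper, not part of qemu-img.
summarize_check() {
    # For each line whose first word is ERROR, take the second word as the
    # error type (dropping a trailing colon) and count occurrences per type.
    # Prints one "<count> <type>" line per type found.
    awk '$1 == "ERROR" { t = $2; sub(/:$/, "", t); n[t]++ }
         END { for (t in n) print n[t], t }'
}
```

It would be fed the same way as head above, for example: qemu-img check myvm.qcow2 2>&1 | summarize_check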

If I use qemu-img to convert the image, the resulting image is "clean": 
# qemu-img convert myvm.qcow2 -O qcow2 /tmp/test.qcow2
# qemu-img check /tmp/test.qcow2 
No errors were found on the image. 


I had this corruption happen a month ago to a different VM on the same machine but a different physical drive, so I do not believe it to be a physical disk failure. I can find nothing in /var/log that gives any more information related to this corruption. What other debug information can I provide to diagnose why these images are getting corrupted and taking these running VMs offline? 


Thanks, 


Andrew Martin 
