[libvirt] Reproducible VM start bug which affects libvirt 5.1.0 and 5.2.0

Frank Schreuder fschreuder at transip.nl
Wed Apr 3 10:52:57 UTC 2019


Hello,

I am currently running into a reproducible libvirt bug which affects libvirt 5.1.0 and 5.2.0.

There seems to be a race condition between the nwfilter-define and virsh start commands. Several times a day I am unable to start a VM and get the following error message:
error: Failed to start domain test
error: internal error: Failed to apply firewall rules /sbin/iptables -w -I FORWARD 1 -j libvirt-in: iptables v1.6.0: Couldn't load target `libvirt-in':No such file or directory

To recover from this I have to restart libvirt. Some iptables chains are missing (libvirt-in in the error above), which is probably caused by a nwfilter-define operation.
I'm able to reproduce this bug within 2 hours by running two loops concurrently: one loop keeps defining nwfilters, while the second loop destroys and starts multiple VMs.
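
A minimal sketch of the two loops described above, written in Python. It is not the exact reproducer: the helper names (virsh, define_filters, cycle_domains, reproduce) and the run parameter are made up for illustration, and the filter XML file and domain names passed in are placeholders. It only relies on the virsh subcommands nwfilter-define, destroy and start.

```python
import subprocess
import threading


def virsh(*args, run=subprocess.run):
    """Run one virsh command and return its exit code (non-zero on failure)."""
    return run(["virsh", *args], capture_output=True, text=True).returncode


def define_filters(filter_xml, stop, run=subprocess.run):
    # Loop 1: keep (re)defining the nwfilter from the same XML file
    # until the other loop signals that it is done.
    while not stop.is_set():
        virsh("nwfilter-define", filter_xml, run=run)


def cycle_domains(domains, stop, run=subprocess.run):
    # Loop 2: destroy and start every VM in turn.
    # Returns the name of the first domain that fails to start.
    while not stop.is_set():
        for name in domains:
            virsh("destroy", name, run=run)
            if virsh("start", name, run=run) != 0:
                stop.set()
                return name
    return None


def reproduce(filter_xml, domains, run=subprocess.run):
    # Run both loops concurrently; stop everything at the first failed start.
    stop = threading.Event()
    definer = threading.Thread(target=define_filters, args=(filter_xml, stop, run))
    definer.start()
    failed = cycle_domains(domains, stop, run=run)
    stop.set()
    definer.join()
    return failed
```

Calling something like reproduce("filter.xml", ["test"]) on an affected host would keep running until a start fails, which per the timing above may take up to a couple of hours. The run parameter exists only so the loops can be exercised without a real libvirt.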

I found an entry in the libvirt 5.1.0 changelog that seems related to this bug:
Create private chains for virtual network firewall rules
Historically firewall rules for virtual networks were added straight into the base chains. This works but has a number of bugs and design limitations. To address them, libvirt now puts firewall rules into its own chains. Note that with this change the filter, nat and mangle tables are required for both IPv4 and IPv6. 
 
So far I have not been able to reproduce this bug on libvirt 5.0.0.

Is there any information I can provide to the mailing list to help debug and/or fix this bug? I am also willing to test patches.

With kind regards,
Frank Schreuder




