[Libvir] Network blocking issue

Daniel Veillard veillard at redhat.com
Mon Sep 17 14:04:03 UTC 2007


On Thu, Sep 13, 2007 at 07:34:52PM +0100, Richard W.M. Jones wrote:
> Shuveb Hussain wrote:
> >Hi,
> >
> >I observed this while using the python bindings and accessing a remote
> >host with libvirt:
> >
> >>>>import libvirt
> >>>>c = libvirt.open('xen://veetee/')
> >>>>c.getInfo()
> >['i686', 2021, 2, 1864, 1, 1, 2, 1]
> >>>>c.getInfo()
> >['i686', 2021, 2, 1864, 1, 1, 2, 1]
> >
> ># remove network cable from remote machine now
> >>>>c.getInfo()
> ># blocks forever....
> >
> >What is the problem here and is there a solution to this? I am running
> >FC7 and here is the version info from virsh:
> >virsh # version
> >Compiled against library: libvir 0.3.2
> >Using library: libvir 0.3.2
> >Using API: Xen 3.0.1
> >Running hypervisor: Xen 3.1.0
> >
> >I observed this for more than 10 mins, it was still hung.
> 
> This is simply a TCP issue, and nothing to do with libvirt or the remote 
> protocol.
> 
> I repeated your experiment using a virsh shell and the nodeinfo command, 
> which essentially does the same thing.  After yanking the network cable 
> I observed that the sendto(2) syscall succeeded and the recvfrom(2) 
> syscall never returned:
> 
> sendto(4, 
> "\27\3\1\1\20\246\325\207<\320\0230E<\352\4x\310E\1O*g\204!\254\n\234O
> N\23\310"..., 277, 0, NULL, 0) = 277
> recvfrom(4, [... strace hangs here ...]
> 
> On the wire I could see using tcpdump that TCP was repeatedly trying to 
> send the request packet and getting no response:
> 
> 19:25:17.108067 IP oirase.55065 > amd.16514: P 1474:1623(149) ack 1082 
> win 107 <nop,nop,timestamp 703462318 117574265>
> 19:25:17.108360 IP oirase.55065 > amd.16514: P 1623:1900(277) ack 1082 
> win 107 <nop,nop,timestamp 703462319 117574265>
> 19:25:17.308306 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703462519 117574265>
> 19:25:17.710212 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703462921 117574265>
> 19:25:18.514030 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703463725 117574265>
> 19:25:20.121667 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703465333 117574265>
> 19:25:23.336940 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703468549 117574265>
> 19:25:29.766483 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703474981 117574265>
> 19:25:42.625568 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703487845 117574265>
> 19:26:08.344739 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703513573 117574265>
> 19:32:42.572441 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 
> win 107 <nop,nop,timestamp 703907941 117574265>
> [etc]
> 
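
The gaps between the sends in the trace above can be checked directly. Below is a small sketch in Python that takes the timestamps of the 277-byte send and the eight retransmissions that follow it (copied from the tcpdump output; the last 19:32:42 line is left out) and prints each gap and its ratio to the previous one. The variable names are only for illustration.

```python
from datetime import datetime

# Timestamps copied from the tcpdump output: the send of bytes 1623:1900
# followed by the first eight retransmissions of 1474:1900.
stamps = [
    "19:25:17.108360",
    "19:25:17.308306",
    "19:25:17.710212",
    "19:25:18.514030",
    "19:25:20.121667",
    "19:25:23.336940",
    "19:25:29.766483",
    "19:25:42.625568",
    "19:26:08.344739",
]

times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]

# Seconds between consecutive packets
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

# How much longer each gap is than the one before it
ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]

for i, gap in enumerate(gaps):
    ratio = f"x{ratios[i - 1]:.2f}" if i > 0 else ""
    print(f"{gap:8.3f} s  {ratio}")
```

Each wait comes out at about twice the previous one, starting near 0.2 s and reaching about 25.7 s by the eighth retransmission, so the sender backs off further with every unanswered packet while the caller stays blocked in recvfrom(2).
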
> On the broader issue, libvirt calls are synchronous -- this is done to 
> reduce the complexity of the interface and implementation.  If you need 
> them to be asynchronous, use a separate thread (or process) to make the 
> calls.

  Is there a configuration knob in the RPC layer to lower the
timeout delay? Some calls are slow, but we should not reach a 2 minute
timeout; that seems far too long.
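
  On the caller side, the separate-thread approach suggested above can at least bound how long the application waits. Here is a minimal sketch in Python. The helper call_with_timeout and the CallTimedOut exception are hypothetical names, not part of libvirt, and slow_call only stands in for a call stuck on a dead connection.

```python
import queue
import threading
import time


class CallTimedOut(Exception):
    """Raised when the wrapped call does not finish in time."""


def call_with_timeout(fn, timeout, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread, waiting at most `timeout` seconds.

    Hypothetical helper, not a libvirt API. On timeout the worker thread is
    not cancelled; it stays blocked until the underlying call returns.
    """
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put((True, fn(*args, **kwargs)))
        except Exception as exc:
            # Hand the exception back to the caller's thread
            result.put((False, exc))

    threading.Thread(target=worker, daemon=True).start()

    try:
        ok, value = result.get(timeout=timeout)
    except queue.Empty:
        raise CallTimedOut(f"no reply within {timeout} s") from None
    if not ok:
        raise value
    return value


def slow_call():
    """Stand-in for a call that hangs on an unreachable host."""
    time.sleep(5)
    return "late"


if __name__ == "__main__":
    print(call_with_timeout(lambda: 42, 1.0))
    try:
        call_with_timeout(slow_call, 0.1)
    except CallTimedOut as exc:
        print("gave up:", exc)
    # With the connection from the original report this would be, e.g.:
    #   info = call_with_timeout(c.getInfo, 10.0)
```

This does not shorten the timeout itself. The abandoned thread remains blocked, and the connection object should probably not be reused after a timeout, so a real knob in the RPC layer would still be the cleaner fix.
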

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard at redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
