
[Libvir] Remote & the problem with qemu deadlock



As I wrote earlier (https://www.redhat.com/archives/libvir-list/2007-April/msg00114.html), remote access to qemu:/// URLs currently deadlocks. What follows is a short explanation of why this happens in the current architecture of the remote patch.

If you apply the remote patch right now, you get a modified libvirt_qemud server which handles both qemu and remote requests over the same socket. (Both qemu_internal.c and remote_internal.c connect to the same Unix domain socket; inside libvirt_qemud, messages are then dispatched according to the "program number" encoded in each RPC message.) The server is written to handle multiple connections at once using non-blocking poll[1], but once it has assembled a whole incoming message, it blocks while dealing with that message.
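To make the dispatch step concrete, here is a minimal sketch of routing by program number. It is an illustration only: the program numbers, the handler names and the assumption that the number is the first 4 big-endian bytes of the message are all hypothetical, not taken from the patch.

```python
import struct

# Hypothetical program numbers; the real values live in the patch.
QEMU_PROGRAM = 0x1001
REMOTE_PROGRAM = 0x1002


def handle_qemu(payload):
    # Stand-in for the qemu request handler.
    return ("qemu", payload)


def handle_remote(payload):
    # Stand-in for the remote request handler.
    return ("remote", payload)


HANDLERS = {
    QEMU_PROGRAM: handle_qemu,
    REMOTE_PROGRAM: handle_remote,
}


def dispatch_message(message):
    """Route one fully assembled message by its program number.

    Assumes (for illustration) a 4-byte big-endian program number
    followed by the payload.
    """
    if len(message) < 4:
        raise ValueError("message too short to contain a program number")
    (program,) = struct.unpack(">I", message[:4])
    handler = HANDLERS.get(program)
    if handler is None:
        raise ValueError(f"unknown program number {program:#x}")
    # The server blocks here until the handler returns.
    return handler(message[4:])
```

Note that nothing else is serviced while the handler runs, which is the property that matters for the deadlock.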

The problem occurs in the qemu case when the remote server issues a call into qemu_internal (now linked into the server), and qemu_internal then tries to connect out to the qemu daemon. Unfortunately the qemu daemon /is/ the remote server, which is blocked handling the current call. Thus it is unable to accept the new connection (from itself) and completely deadlocks.
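The effect can be reproduced in miniature. The sketch below is not libvirt code: it is a toy single-threaded "server" which, while handling a request, opens a client connection to its own listening Unix socket and waits for a reply. The function names and the request string are made up, and a timeout stands in for what would otherwise be an indefinite hang.

```python
import os
import socket
import tempfile


def call_back_into_self(path, timeout=0.5):
    """Runs in the server's only thread, in the middle of a request.

    Returns the reply bytes, or None if no reply arrived in time.
    """
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        # The kernel may queue the connection in the listen backlog,
        # so connect() and the send can appear to succeed...
        client.connect(path)
        client.sendall(b"list-domains")
        # ...but the only code that could accept() and answer is the
        # code currently waiting right here.
        return client.recv(1024)
    except socket.timeout:
        return None
    finally:
        client.close()


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "daemon.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)
        try:
            # Pretend a request arrived and its handler needs the daemon.
            return call_back_into_self(path)
        finally:
            server.close()
```

Without the timeout the call would simply never return, because the one thread that could service the new connection is the one blocked on it.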

One solution suggested was to have qemu_internal recognise when it is linked into the server and make local calls back to its counterpart in the server. However, I think this is pretty ugly and unnecessarily complicates qemu_internal, which shouldn't care how it is linked.

My thoughts on the long term solution
-------------------------------------

I'd really like to see the qemu case not require a daemon. At the moment (AFAICS) the qemu daemon serves two purposes: (a) it keeps track of the monitor file descriptors of the existing qemu processes, and (b) it handles the networking stuff, basically starting and stopping dnsmasq[2]. (a) is easily handled, I think, by having qemu processes put their monitor sockets into a well-known directory. This has the added advantage that qemu processes can survive and continue after qemu programs restart. As for (b), I don't think networking, which is a general service, should be mixed up with qemu code.
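As a rough sketch of what (a) could look like without a daemon: any client can rediscover the running qemu processes by listing the sockets in the agreed directory. The naming scheme (one socket per guest, named after the guest) and the function name are assumptions made for the example; nothing like this exists in the patch.

```python
import os
import stat


def find_monitor_sockets(directory):
    """Return {entry name: path} for every Unix socket in `directory`.

    Assumes, for illustration, that each qemu process leaves exactly
    one monitor socket there, named after its guest.
    """
    found = {}
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        try:
            mode = os.lstat(path).st_mode
        except FileNotFoundError:
            # The qemu process exited between listdir() and lstat().
            continue
        if stat.S_ISSOCK(mode):
            found[entry] = path
    return found
```

No state has to be held in a long-running process: the directory itself is the record of which monitors exist.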

I think (although I've probably missed some things) that this means that qemu_internal could be implemented entirely as a local service, requiring no daemon. (In the case where qemu_internal needs to control a root-owned, system-wide set of qemu processes, we'd still have to have a daemon, but it can be the remote daemon, in the same way that Xen will work.)

My suggestion for a short term solution
---------------------------------------

Since doing the above is way more than I'm going to do just to get remote working, my suggestion is that we have two daemons running, qemud and remote. Both would be built from the exact same codebase: the mode would be selected by a command-line flag such as 'libvirtd --remote' (or by some other means; the important thing is that they'd keep the same codebase).

This immediately fixes the deadlock problem[3] because qemu_internal linked to the server is now talking to a different daemon.
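A sketch of how one codebase could serve both roles follows. Only the '--remote' flag comes from the proposal; the mode names, the socket paths and the check_outgoing helper are hypothetical. The helper also shows one way to refuse the remote-on-remote case mentioned in [3].

```python
import argparse

# Hypothetical socket paths, one per daemon mode.
SOCKET_PATHS = {
    "qemud": "/tmp/example-qemud.sock",
    "remote": "/tmp/example-remote.sock",
}


def parse_mode(argv):
    """Select the daemon's mode from its command line."""
    parser = argparse.ArgumentParser(prog="libvirtd")
    parser.add_argument(
        "--remote",
        action="store_true",
        help="run as the remote daemon instead of the qemu daemon",
    )
    args = parser.parse_args(argv)
    return "remote" if args.remote else "qemud"


def check_outgoing(mode, target_mode):
    """Return the socket path to connect to, refusing remote-on-remote."""
    if mode == "remote" and target_mode == "remote":
        raise RuntimeError("remote-on-remote connections are not allowed")
    return SOCKET_PATHS[target_mode]
```

With this split, a request handled by the remote daemon that needs qemu connects to the other daemon's socket, so the process that is blocked is never the one that has to answer.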

Rich.

[1] As an aside, I don't particularly like this architecture. Linus has handed down to us an efficient operating system which is already capable of multiplexing different clients using a marvellous system called "multitasking". Hand-coding polling code inside a single process is really only necessary in very limited circumstances, such as highly tuned uniprocessor machines with enormous loads, and we are not likely to get into such a situation with libvirtd. If we ever did, we'd need to benchmark a range of possible solutions. Processes, on the other hand, have real advantages, such as security through isolation and controlled sharing, and easy scalability onto SMP machines.

[2] Although I don't really understand the networking stuff too well, so perhaps it does other things.

[3] Except in the "remote-on-remote" case, but we should disallow this explicitly because it seems like it would be nothing but trouble.

--
Emerging Technologies, Red Hat  http://et.redhat.com/~rjones/
64 Baker Street, London, W1U 7DF     Mobile: +44 7866 314 421

Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod
Street, Windsor, Berkshire, SL4 1TE, United Kingdom.
Registered in England and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Charlie Peters (USA) and David
Owens (Ireland)


