[libvirt] [PATCH] daemon: Dynamically create worker threads when some get stuck

Fri Jun 17 13:37:54 UTC 2011

On Fri, Jun 17, 2011 at 10:55:43AM +0100, Daniel P. Berrange wrote:
> On Thu, Jun 16, 2011 at 04:03:36PM -0400, Dave Allan wrote:
> > On Thu, Jun 16, 2011 at 06:29:09PM +0100, Daniel P. Berrange wrote:
> > > On Thu, Jun 16, 2011 at 04:29:55PM +0200, Michal Privoznik wrote:
> > > > Up to now, we've created new worker threads only on a new connection.
> > > > This patch monitors worker threads for liveness and dynamically creates
> > > > a new one if all are stuck waiting for the hypervisor to reply. This
> > > > situation can happen: all one needs to do is send a STOP signal to
> > > > qemu.
> > 
> > We will also need to consider the case of qemu processes in
> > uninterruptible sleep, e.g., in the case of a failed NFS mount.
> 
> That is no different as far as libvirt is concerned. The end result
> is always simply that libvirt sends a monitor command & does not
> get a response in an appropriate timeframe.

Except that locking behavior cannot depend on being able to kill the
process.

> > > > The time after which we consider a thread stuck is defined by the
> > > > WORKER_TIMEOUT macro.
> > > > 
> > > > With this approach we don't need to create a new worker thread on each
> > > > incoming connection. However, as the number of active worker threads
> > > > grows, we might need to grow the pool of worker threads and hence exceed
> > > > the max_workers configuration value.
> > > 
> > > This is really not desirable. The max_workers limit is in the
> > > configuration as a static limit, to prevent client applications
> > > from making libvirtd spawn an unlimited number of threads. We
> > > must *always* respect the max_workers limit.
> > > 
> > > I don't think automatically spawning workers is the right way
> > > to deal with the QEMU issue anyway. As mentioned before, we need
> > > to improve the QEMU monitor driver so that we can safely allow
> > > monitor commands to time out.
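
For reference, a minimal sketch of how a liveness check could coexist with
a hard cap. This is not the patch's code: demo_worker, demo_pool,
demo_pool_should_grow and the 30 second DEMO_WORKER_TIMEOUT are all
hypothetical names and values, and real code would need to hold the pool
lock while inspecting the workers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define DEMO_WORKER_TIMEOUT 30   /* seconds; hypothetical value */

/* Hypothetical per-worker bookkeeping, not libvirtd's real structures. */
typedef struct {
    bool busy;          /* worker is currently processing a job */
    time_t job_start;   /* when the current job was picked up */
} demo_worker;

typedef struct {
    demo_worker *workers;
    size_t nworkers;     /* workers currently in the pool */
    size_t max_workers;  /* static limit from the configuration */
} demo_pool;

/* A worker counts as stuck if it has been busy longer than the timeout. */
static bool demo_worker_is_stuck(const demo_worker *w, time_t now)
{
    return w->busy && difftime(now, w->job_start) > DEMO_WORKER_TIMEOUT;
}

/* Grow only if every worker is stuck AND the cap is not yet reached.
 * An empty pool below the cap also reports true. */
bool demo_pool_should_grow(const demo_pool *pool, time_t now)
{
    size_t i;

    assert(pool != NULL);

    /* The configured limit is checked first and is never exceeded. */
    if (pool->nworkers >= pool->max_workers)
        return false;

    for (i = 0; i < pool->nworkers; i++) {
        if (!demo_worker_is_stuck(&pool->workers[i], now))
            return false;   /* at least one worker can still make progress */
    }
    return true;
}
```

With the cap checked first, a pool that is already at max_workers never
grows, no matter how many workers are stuck. That still leaves the daemon
unresponsive in that case, which is why the monitor timeout matters.
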
> > 
> > Dan, can you suggest some possible strategies here?  I don't have a
> > strong opinion on the implementation, although I agree with your
> > concern about spawning unlimited numbers of threads.  
> 
> As I mentioned, we need to make the QEMU monitor time out after some
> period of time waiting, and ensure that the monitor for that VM cannot
> be used thereafter.
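
To make sure I understand the proposal, here is a rough sketch of that
behaviour using plain pthreads. It is not libvirt's monitor code:
demo_monitor, demo_monitor_init and demo_monitor_wait_reply are
hypothetical names, and I am assuming a separate I/O thread sets
reply_ready and signals reply_cond when a reply arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical monitor state; not libvirt's actual monitor structure. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t reply_cond;
    bool reply_ready;   /* set by the I/O thread when a reply arrives */
    bool broken;        /* set once any command has timed out */
} demo_monitor;

void demo_monitor_init(demo_monitor *mon)
{
    assert(mon != NULL);
    pthread_mutex_init(&mon->lock, NULL);
    pthread_cond_init(&mon->reply_cond, NULL);
    mon->reply_ready = false;
    mon->broken = false;
}

/* Wait for a reply for at most timeout_ms milliseconds.
 * Returns 0 on reply; -1 with errno ETIMEDOUT on timeout;
 * -1 with errno EIO if an earlier command already timed out. */
int demo_monitor_wait_reply(demo_monitor *mon, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    assert(mon != NULL);
    pthread_mutex_lock(&mon->lock);

    /* Once broken, the monitor is refused without waiting at all. */
    if (mon->broken) {
        pthread_mutex_unlock(&mon->lock);
        errno = EIO;
        return -1;
    }

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Loop to tolerate spurious wakeups; stop on timeout or error. */
    while (!mon->reply_ready && rc == 0)
        rc = pthread_cond_timedwait(&mon->reply_cond, &mon->lock, &deadline);

    if (!mon->reply_ready) {
        mon->broken = true;     /* never use this monitor again */
        pthread_mutex_unlock(&mon->lock);
        errno = ETIMEDOUT;
        return -1;
    }

    mon->reply_ready = false;   /* consume the reply */
    pthread_mutex_unlock(&mon->lock);
    return 0;
}
```

The worker thread gets back to the pool after at most timeout_ms, and
every later caller on the same VM fails immediately instead of blocking.
Note that nothing here depends on the qemu process being killable.
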
> 
> 
> Daniel