[libvirt] [PATCH 1/1] qemu: host NUMA hugepage policy without guest NUMA

Sam Bobroff sam.bobroff at au1.ibm.com
Tue Oct 25 02:10:23 UTC 2016


On Tue, Oct 18, 2016 at 10:43:31PM +0200, Martin Kletzander wrote:
> On Mon, Oct 17, 2016 at 03:45:09PM +1100, Sam Bobroff wrote:
> >On Fri, Oct 14, 2016 at 10:19:42AM +0200, Martin Kletzander wrote:
> >>On Fri, Oct 14, 2016 at 11:52:22AM +1100, Sam Bobroff wrote:
> >>>I did look at the libnuma and cgroups approaches, but I was concerned they
> >>>wouldn't work in this case, because of the way QEMU allocates memory when
> >>>mem-prealloc is used: the memory is allocated in the main process, before the
> >>>CPU threads are created. (This is based only on a bit of hacking and debugging
> >>>in QEMU, but it does seem to explain the behaviour I've seen so far.)
> >>>
> >>
> >>But we use numactl before QEMU is exec()'d.
> >
> >Sorry, I jumped ahead a bit. I'll try to explain what I mean:
> >
> >I think the problem with using this method would be that the NUMA policy is
> >applied to all allocations by QEMU, not just ones related to the memory
> >backing. I'm not sure if that would cause a serious problem but it seems untidy,
> >and it doesn't happen in other situations (i.e. with separate memory backend
> >objects, QEMU sets up the policy specifically for each one and other
> >allocations aren't affected, AFAIK).  Presumably, if memory were very
> >restricted it could prevent the guest from starting.
> >
> 
> Yes, it is, that's what <numatune><memory/> does if you don't have any
> other (<memnode/>) specifics set.
> 
> >>>I think QEMU could be altered to move the preallocations into the VCPU
> >>>threads but it didn't seem trivial and I suspected the QEMU community would
> >>>point out that there was already a way to do it using backend objects.  Another
> >>>option would be to add a -host-nodes parameter to QEMU so that the policy can
> >>>be given without adding a memory backend object. (That seems like a more
> >>>reasonable change to QEMU.)
> >>>
> >>
> >>I think upstream won't like that, mostly because there is already a
> >>way.  And that is using a memory-backend object.  I think we could just
> >>use that and disable changing it live.  But upstream will probably want
> >>that to be configurable or something.
> >
> >Right, but isn't this already an issue in the cases where libvirt is already
> >using memory backend objects and NUMA policy? (Or does libvirt already disable
> >changing it live in those situations?)
> >
> 
> It is.  I'm not trying to say libvirt is perfect.  There are bugs,
> like this one.  The problem is that we tried to do *everything*,
> but it's not currently possible.  I'm trying to explain how stuff works
> now.  It definitely needs some fixing, though.

OK :-)

Well, given our discussion, do you think it's worth a v2 of my original patch
or would it be better to drop it in favour of some broader change?
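
For reference, the two approaches contrasted in the quoted discussion can be
sketched roughly as below. This is only an illustration, not what libvirt
actually generates: the helper names, the binary name, the object id and the
hugepage mount point are all made up for the example, and the process-wide
case is shown with the numactl CLI purely for readability. How the backend
object is then attached to the guest is deliberately left out.

```python
"""Sketch only: two ways of applying a host NUMA policy to a QEMU guest.

All helper names and default values here are hypothetical.
"""


def process_wide_argv(qemu_binary, host_node, mem_mb, hugepage_path):
    """Policy set before exec(): it covers every allocation QEMU makes,
    not just the guest RAM backing."""
    return [
        "numactl", f"--membind={host_node}",
        qemu_binary,
        "-m", str(mem_mb),
        "-mem-prealloc",
        "-mem-path", hugepage_path,
    ]


def backend_object_arg(obj_id, host_node, size_mb, hugepage_path):
    """Policy set on one memory backend object: only that object's
    memory is bound, other QEMU allocations are unaffected."""
    props = [
        ("id", obj_id),
        ("mem-path", hugepage_path),
        ("size", f"{size_mb}M"),
        ("prealloc", "yes"),
        ("host-nodes", str(host_node)),
        ("policy", "bind"),
    ]
    return "memory-backend-file," + ",".join(f"{k}={v}" for k, v in props)


if __name__ == "__main__":
    # Placeholder values, chosen only for the illustration.
    print(" ".join(process_wide_argv("qemu-system-x86_64", 0, 1024,
                                     "/dev/hugepages")))
    print("-object " + backend_object_arg("ram0", 0, 1024, "/dev/hugepages"))
```

The first form is the one where a very restricted node could, presumably,
stop the guest from starting at all, since nothing QEMU allocates escapes
the policy.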

Cheers,
Sam.



