[libvirt] [PATCH 4/5] qemu: Build command line for NUMA tuning

Bill Gray bgray at redhat.com
Fri May 6 21:04:07 UTC 2011


It looks like there is only a single call-back function -- 
qemudSecurityHook() -- which already has some cgroup and CPU affinity 
code in it.

Perhaps a good approach would be to add an invocation of a new function 
-- qemudInitMemAffinity() -- as a peer to the already present invocation 
of qemudInitCpuAffinity().  The qemudInitMemAffinity() function could 
use set_mempolicy() to bind or prefer local/specific memory (depending 
on whether the user specifies the explicit memory node list as mandatory 
or just advisory).  Advisory / preferred won't work correctly for large, 
multi-node guests until multiple nodes can be preferred (presumably 
choosing among them by the amount of free memory on each node).  It 
would also be helpful to have an additional attribute to specify 
interleaved memory.
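
A minimal sketch of what such a function could do, assuming Linux and the
raw set_mempolicy(2) system call (so that neither libnuma nor an
intermediate binary is needed). The names build_nodemask(),
apply_mem_policy() and spawn_with_policy() are hypothetical and are not
libvirt functions; the fixed 16-word mask and the "+ 1" passed as maxnode
are assumptions of this sketch, not requirements taken from libvirt.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Mode values as defined by the Linux set_mempolicy(2) ABI. */
enum mem_mode {
    MEM_PREFERRED  = 1,  /* MPOL_PREFERRED: advisory; the kernel uses the
                            first node in the mask */
    MEM_BIND       = 2,  /* MPOL_BIND: mandatory, allocate only on these nodes */
    MEM_INTERLEAVE = 3   /* MPOL_INTERLEAVE: spread pages across the nodes */
};

#define MASK_WORDS    16  /* assumption: enough room for 1024 nodes on 64-bit */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAX_NODES     (MASK_WORDS * BITS_PER_WORD)

/* Turn a list of node IDs into a bit mask.
 * Returns 0 on success, -1 with errno = EINVAL for an out-of-range node. */
int build_nodemask(const int *nodes, size_t n, unsigned long *mask)
{
    assert(mask != NULL && (nodes != NULL || n == 0));
    memset(mask, 0, MASK_WORDS * sizeof(unsigned long));
    for (size_t i = 0; i < n; i++) {
        if (nodes[i] < 0 || (size_t)nodes[i] >= MAX_NODES) {
            errno = EINVAL;
            return -1;
        }
        mask[nodes[i] / BITS_PER_WORD] |= 1UL << (nodes[i] % BITS_PER_WORD);
    }
    return 0;
}

/* Set the calling thread's memory policy. Returns 0, or -1 with errno set. */
int apply_mem_policy(enum mem_mode mode, const int *nodes, size_t n)
{
    unsigned long mask[MASK_WORDS];

    if (build_nodemask(nodes, n, mask) < 0)
        return -1;
    /* maxnode is passed as bit count + 1, as libnuma does. */
    return (int)syscall(SYS_set_mempolicy, (int)mode, mask,
                        (unsigned long)(MAX_NODES + 1));
}

/* Apply the policy in the child, after fork() and before exec, so the
 * new program starts with it and no extra binary sits in between. */
pid_t spawn_with_policy(char *const argv[], enum mem_mode mode,
                        const int *nodes, size_t n)
{
    pid_t pid = fork();

    if (pid == 0) {
        if (apply_mem_policy(mode, nodes, n) < 0)
            _exit(126);          /* policy could not be applied */
        execvp(argv[0], argv);
        _exit(127);              /* exec failed */
    }
    return pid;                  /* child pid in the parent, -1 on error */
}
```

The policy set in the child is preserved across execve(), which is what
lets it take effect before the new program allocates any memory. A
mandatory node list would map to MEM_BIND and an advisory one to
MEM_PREFERRED.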

How does this approach sound?


On 05/06/2011 09:24 AM, Daniel P. Berrange wrote:
> On Fri, May 06, 2011 at 09:20:18PM +0800, Osier Yang wrote:
>> On 2011-05-06 17:23, Daniel P. Berrange wrote:
>>> On Thu, May 05, 2011 at 04:30:30PM -0400, Bill Gray wrote:
>>>>
>>>> Hi Daniel,
>>>>
>>>> How can we get NUMA-aligned memory and CPUs if we apply binding APIs
>>>> after the process has already started?   Might not all the memory
>>>> already be allocated on the wrong nodes by then?
>>>
>>> The policy has to be set after fork'ing the new QEMU process, but
>>> before exec'ing QEMU. This is essentially what you're doing with
>>> numactl, but with the problem of an extra binary that screws up
>>> the SELinux domain transitions from libvirtd_t -> svirt_t.
>>>
>>>> For expert users, what are the problems with starting qemu with an
>>>> external numactl command (with --cpunodebind and --membind) to
>>>> guarantee optimal alignment?
>>>
>>> Adding an intermediate process will prevent the necessary SELinux
>>> domain transitions from working. We don't want to allow the
>>> numactl binary to be able to transition to svirt_t because that
>>> would be inappropriate for most users of numactl
>>
>> This makes sense. As you said in another mail, perhaps we need to do some
>> work on __virExec. I will make a v2 series. Thanks for the feedback.
>
> Not virExec, but rather in the QEMU exec hook function
>
>
> Daniel



