
Re: [libvirt] [RFC PATCH 8/8] qemu: Set cpuset.mems even if the numatune mode is not strict



On Thu, May 09, 2013 at 06:22:17PM +0800, Osier Yang wrote:
> When the numatune memory mode is not "strict", the cpuset.mems
> inherits the parent's setting, which causes problems like:
> 
> % virsh dumpxml rhel6_local | grep interleave -2
>   <vcpu placement='static'>2</vcpu>
>   <numatune>
>     <memory mode='interleave' nodeset='1-2'/>
>   </numatune>
>   <os>
> 
> % cat /proc/3713/status | grep Mems_allowed_list
>   Mems_allowed_list:	0-3
> 
> % virsh numatune rhel6_local
>   numa_mode      : interleave
>   numa_nodeset   : 0-3

Yes, the information is misleading.
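
To check this outside of virsh, the Mems_allowed_list field can be pulled
straight out of the status text. A minimal sketch, assuming the contents of
/proc/<pid>/status have already been read into a string; status_field() is a
hypothetical helper for illustration, not something in libvirt:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of "key:" from /proc/<pid>/status-style text into out.
 * Returns 0 on success, -1 if the key is missing or out is too small.
 * status_field() is a hypothetical helper, not a libvirt function.
 */
int status_field(const char *text, const char *key,
                 char *out, size_t outlen)
{
    assert(text != NULL && key != NULL && out != NULL);
    size_t keylen = strlen(key);
    const char *line = text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

        /* A matching line starts with the key immediately followed by ':' */
        if (linelen > keylen && strncmp(line, key, keylen) == 0 &&
            line[keylen] == ':') {
            const char *val = line + keylen + 1;
            const char *end = line + linelen;

            /* Skip the tab or spaces between ':' and the value */
            while (val < end && (*val == ' ' || *val == '\t'))
                val++;

            size_t vallen = (size_t)(end - val);
            if (vallen + 1 > outlen)
                return -1;
            memcpy(out, val, vallen);
            out[vallen] = '\0';
            return 0;
        }
        if (eol == NULL)
            break;
        line = eol + 1;
    }
    return -1;
}
```

With the status text from the example above, looking up "Mems_allowed_list"
would give "0-3", which is the parent's setting rather than the configured
nodeset '1-2'.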

> 
> Though the domain process's memory binding is set with libnuma
> after the cgroup setting.
> 
> The reason the current code only handles "strict" mode is that
> cpuset.mems doesn't understand the memory policy modes (interleave,
> preferred, strict); it effectively behaves like "strict" mode ("strict"
> means the allocation will fail if the memory cannot be allocated on
> the target node. Default operation is to fall back to other nodes.

Default is localalloc.

> From man numa(3)). However, writing the cpuset.mems even if the
> numatune memory mode is not strict should be better than the blind
> inheritance anyway.

This is OK for interleave mode, combined with cpuset.memory_spread_xxx.
But what about preferred mode? Compare:

strict:  Strict means the allocation will fail if the memory cannot be
         allocated on the target node.

preferred: The system will attempt to allocate memory from the
           preferred node, but will fall back to other nodes if no
           memory is available on the preferred node.
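
Since preferred mode names a single node, the patch below rejects the
setting when strlen(mem_mask) != 1. Note that a string length of 1 only
identifies single-digit nodes: "10" is one node but two characters. A rough
sketch of counting the nodes instead, assuming mem_mask uses the usual
"0,2-3" style list format; nodeset_count() is a hypothetical helper, not an
existing libvirt function:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Count the nodes named by a nodeset string such as "1-2" or "0,2-3".
 * Returns -1 on malformed or empty input.
 * nodeset_count() is a hypothetical helper for illustration only.
 */
int nodeset_count(const char *s)
{
    assert(s != NULL);
    int count = 0;

    while (*s != '\0') {
        char *end;
        long lo, hi;

        /* Every element must start with a node number */
        if (!isdigit((unsigned char)*s))
            return -1;
        lo = strtol(s, &end, 10);
        hi = lo;
        s = end;

        /* Optional "-N" turns the element into an inclusive range */
        if (*s == '-') {
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtol(s, &end, 10);
            s = end;
            if (hi < lo)
                return -1;
        }
        count += (int)(hi - lo + 1);

        if (*s == ',') {
            s++;
            if (*s == '\0')
                return -1;      /* trailing comma */
        } else if (*s != '\0') {
            return -1;          /* unexpected character */
        }
    }
    return count > 0 ? count : -1;
}
```

A check along the lines of nodeset_count(mem_mask) != 1 would then accept
"10" and still reject "1-2". This does not answer the question of whether
cpuset.mems can express the fallback behaviour of preferred mode at all.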

> 
> ---
> However, I'm not comfortable with the solution, since anyway the
> modes except "strict" are not meaningful for cpuset.mems.
> 
> Another thing I'm not sure about is whether cpuset.cpus will
> affect the libnuma setting. Assuming without this patch, domain
> process's cpuset.mems will be set as '0-7' (8 NUMA nodes, each has 8
> CPUs). And the numatune memory mode is "interleave", and libnuma set
> the memory binding as "1-2". Even with this patch applied, setting
> cpuset.mems as "1-2", any potential problem?
> 
> So this patch is mainly meant to raise the problem and to see if
> anyone has opinions. @hutao, since this code is from you, any
> opinions/ideas? Thanks.
> ---
>  src/qemu/qemu_cgroup.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
> index 33eebd7..22fe25b 100644
> --- a/src/qemu/qemu_cgroup.c
> +++ b/src/qemu/qemu_cgroup.c
> @@ -597,11 +597,9 @@ qemuSetupCpusetCgroup(virDomainObjPtr vm,
>      if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET))
>          return 0;
>  
> -    if ((vm->def->numatune.memory.nodemask ||
> -         (vm->def->numatune.memory.placement_mode ==
> -          VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)) &&
> -        vm->def->numatune.memory.mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT) {
> -
> +    if (vm->def->numatune.memory.nodemask ||
> +        (vm->def->numatune.memory.placement_mode ==
> +         VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)) {
>          if (vm->def->numatune.memory.placement_mode ==
>              VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)
>              mem_mask = virBitmapFormat(nodemask);
> @@ -614,6 +612,16 @@ qemuSetupCpusetCgroup(virDomainObjPtr vm,
>              goto cleanup;
>          }
>  
> +        if (vm->def->numatune.memory.mode ==
> +            VIR_DOMAIN_NUMATUNE_MEM_PREFERRED &&
> +            strlen(mem_mask) != 1) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                           _("NUMA memory tuning in 'preferred' mode "
> +                             "only supports single node"));
> +            goto cleanup;
> +
> +        }
> +
>          rc = virCgroupSetCpusetMems(priv->cgroup, mem_mask);
>  
>          if (rc != 0) {
> -- 
> 1.8.1.4
> 
> --
> libvir-list mailing list
> libvir-list redhat com
> https://www.redhat.com/mailman/listinfo/libvir-list

