
Re: [libvirt] [RFC][PATCHv2 00/11] add numatune command

On Thu, Nov 17, 2011 at 05:44:10PM +0800, Hu Tao wrote:
> This series does mainly two things:
>   1. use cgroup cpuset to manage numa parameters
>   2. add a virsh command numatune to allow user to change numa parameters
>      from command line
> Current numa parameters include nodeset and mode, but these cgroup cpuset
> provides don't completely match with them, details:
>  params           cpuset
>  ------------------------------------------------------
>  nodeset          cpuset provides cpuset.mems
>  mode strict      cpuset provides cpuset.mem_hardwall
>  mode interleave  cpuset provides cpuset.memory_spread_*
>  mode preferred   no equivalent. !spread to preferred?

This isn't right. There are only 3 existing configs in the
XML currently; the current 'strict' does not map to mem_hardwall,
nor does interleave map to memory_spread, AFAICT.

Currently we have three different configurations possible
for memory, with the following semantics:

  mode=strict        - allocation is from designated nodes, or fails
  mode=preferred     - allocation is from designated nodes, or falls back to other nodes
  mode=interleave    - allocation is interleaved across designated nodes
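
To make the difference between the three modes concrete, here is a toy
model in Python. It is only a sketch of the semantics listed above, not
of how the kernel or libnuma actually places pages. The function name
and the free-page dictionary are made up for illustration, and the
behaviour when interleaved nodes run out of memory is an assumption,
since it is not covered above.

```python
def allocate_pages(mode, nodeset, free, npages):
    """Toy model: return the node chosen for each of npages pages.

    free maps node id -> number of free pages and is updated in place.
    """
    designated = sorted(nodeset)
    others = sorted(n for n in free if n not in nodeset)
    placement = []
    rr = 0  # round-robin cursor, only used by interleave
    for _ in range(npages):
        if mode == "strict":
            # Designated nodes only; no fallback.
            candidates = designated
        elif mode == "preferred":
            # Designated nodes first, then any other node.
            candidates = designated + others
        elif mode == "interleave":
            # Rotate the starting node on every page.
            # Assumption: no fallback outside the designated nodes.
            candidates = designated[rr:] + designated[:rr]
            rr = (rr + 1) % len(designated)
        else:
            raise ValueError("unknown mode: %s" % mode)
        node = next((n for n in candidates if free[n] > 0), None)
        if node is None:
            raise MemoryError("no free page on permitted nodes")
        free[node] -= 1
        placement.append(node)
    return placement
```

With nodes 0 and 1 designated and only one free page on each, a request
for four pages fails under strict but spills onto node 2 under preferred.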

In cgroups cpuset controller you can set

   cpuset.mems - memory is allocated from designated nodes, or fails
   cpuset.mem_exclusive - no other cgroups, except parents or children,
                          can allocate from nodes listed in cpuset.mems
   cpuset.mem_hardwall - no other cgroups are allowed to allocate from
                         the nodes listed in cpuset.mems
   cpuset.memory_spread* - control allocations of internal kernel data structures

IMHO, the last three are not really required for libvirt per-VM
usage - the management application can trivially decide whether
to allow overlapping allocation between VMs without needing to
set these kernel tunables.

So, if using the cgroups cpuset controller for NUMA, the *only*
policy we can implement is mode=strict.  We cannot implement
mode=preferred or mode=interleave, given the currently available
cpuset controls.
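
For illustration, applying a strict nodeset through the cpuset
controller amounts to writing a node list into the group's cpuset.mems
file. The Python sketch below assumes the caller already knows the
cgroup directory for the VM; that directory and both helper names are
hypothetical, and this is not how libvirt itself is implemented.

```python
import os


def format_nodeset(nodes):
    """Format node ids in the list syntax cpuset.mems takes, e.g. "0-1,3"."""
    ranges = []
    start = prev = None
    for n in sorted(set(nodes)):
        if start is None:
            start = prev = n
        elif n == prev + 1:
            prev = n
        else:
            ranges.append((start, prev))
            start = prev = n
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else "%d-%d" % (a, b) for a, b in ranges)


def set_cpuset_mems(cgroup_dir, nodes):
    """Write the nodeset into <cgroup_dir>/cpuset.mems and return the path.

    cgroup_dir is a hypothetical per-VM cgroup directory; where it lives
    depends on how the cpuset controller is mounted on the host.
    """
    path = os.path.join(cgroup_dir, "cpuset.mems")
    with open(path, "w") as f:
        f.write(format_nodeset(nodes))
    return path
```

Nothing comparable exists for preferred or interleave, which is why
only strict can be expressed this way.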

IMHO, we should thus continue to use libnuma for specifying *all*
the policies. However, if mode=strict, then we should *also* apply
the policy in the cgroups using cpuset.mems, since this will at
least allow later tuning of nodemask on the fly.
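
Roughly, the startup logic I have in mind looks like the following
Python sketch. The function and the step tuples are purely
illustrative names, not existing libvirt code.

```python
def plan_numa_setup(mode, nodeset):
    """Return the steps to apply a NUMA memory policy at VM startup.

    Every mode goes through libnuma; strict is additionally mirrored
    into the cgroup so the nodeset can be adjusted later.
    """
    if mode not in ("strict", "preferred", "interleave"):
        raise ValueError("unknown mode: %s" % mode)
    # Hypothetical step records: (mechanism, setting, nodeset).
    steps = [("libnuma", mode, nodeset)]
    if mode == "strict":
        steps.append(("cgroup", "cpuset.mems", nodeset))
    return steps
```

So preferred and interleave yield a single libnuma step, while strict
yields the libnuma step plus a cpuset.mems write.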

We will have to refuse any attempt to switch between different modes
on the fly. Only the nodemask, with mode=strict, will be dynamically
changeable.
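
As a sketch of the check a live update would need (Python, with a
made-up function name and error type, not actual libvirt code):

```python
class LiveUpdateRefused(Exception):
    """Raised when a running VM's NUMA tuning cannot be changed as asked."""


def check_live_update(current_mode, new_mode, new_nodeset):
    """Validate a NUMA tuning change requested for a running VM.

    new_mode or new_nodeset may be None when the caller leaves that
    parameter untouched. Returns the nodeset to apply, or None.
    """
    if new_mode is not None and new_mode != current_mode:
        # Switching modes on the fly is never allowed.
        raise LiveUpdateRefused("cannot change mode of a running VM")
    if new_nodeset is not None and current_mode != "strict":
        # Only strict is mirrored in cpuset.mems, so only strict can
        # have its nodeset changed while the VM runs.
        raise LiveUpdateRefused("nodeset is only tunable with mode=strict")
    return new_nodeset
```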

|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
