[libvirt] [PATCH v2 RESEND 00/17] Introduce RDT memory bandwidth allocation support

John Ferlan jferlan at redhat.com
Mon Jul 30 22:14:57 UTC 2018



On 07/29/2018 11:12 PM, bing.niu at intel.com wrote:
> From: Bing Niu <bing.niu at intel.com>
> 
> This series introduces RDT memory bandwidth allocation support by extending the
> current virresctrl implementation.
> 
> The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate
> control over the memory bandwidth available per core. It provides a method to
> control applications that may be over-utilizing bandwidth relative to their priority
> in environments such as the data center. The details can be found in Intel's SDM 17.19.7.
> The kernel supports MBA through the resctrl file system, the same as CAT. Each resctrl group has an
> MB parameter that controls how much memory bandwidth it can utilize, expressed as a percentage.
> 
> In this series, MBA is enabled by enhancing the existing virresctrl implementation. The
> policy employed for MBA is similar to that for CAT: the sum of all MBA groups' bandwidth
> does not exceed 100%.
> 
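As an illustration only (this is not code from the series), the 100% policy described above could be sketched as follows. The function name, the dict-based representation of a group, and the per-node summation are all assumptions made for this sketch; the actual check lives in virresctrl.c and may differ in detail.

```python
def fits_policy(existing, requested, limit=100):
    """Return True if adding `requested` keeps every node within `limit`.

    existing:  list of dicts, one per already-created group, each mapping
               a node id to the bandwidth percentage that group holds
               (hypothetical representation, chosen for this sketch)
    requested: dict mapping node id -> percentage for the new group
    limit:     total bandwidth available per node, in percent
    """
    for node, percent in requested.items():
        # Sum what the existing groups already hold on this node.
        used = sum(group.get(node, 0) for group in existing)
        if used + percent > limit:
            return False
    return True
```

With two existing groups holding 20% and 30% on node 0, a request for another 50% on node 0 still fits, while a request for 60% does not.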
> The enhancement of virresctrl includes two main parts:
> 
> Part 1:  Add two new structures, virResctrlInfoMemMB and virResctrlAllocMemBW, for collecting
>          the host system's MBA capability and the domain's memory bandwidth allocation. These two
>          structures extend the existing virResctrlInfo and virResctrlAlloc. With
>          them, the virresctrl framework can support MBA and CAT concurrently. Each virResctrlAlloc
>          represents a resource allocation that includes CAT, MBA, or both. The MBA policy is
>          that the total memory bandwidth of the resctrl groups created by virresctrl does not
>          exceed 100%.
> 
> Part 2:  On the XML side, add new elements to the host capabilities query and the domain allocation to support
>          memory bandwidth allocation.
>      ---------------------------------------------------------------------------------------------
>          For host capabilities XML, the new format looks like the example below:
>            <host>
>             .....
>              <memory_bandwidth>
>                <node id='0' cpus='0-19'>
>                  <control granularity='10' min='10' maxAllocs='8'/>
>                </node>
>              </memory_bandwidth>
>            </host>
> 
>            granularity --- memory bandwidth granularity
>            min         --- minimum memory bandwidth allowed
>            maxAllocs   --- maximum concurrent memory bandwidth allocation allowed.
> 
>      ---------------------------------------------------------------------------------------------
>          For domain XML, the new format looks like the example below:
>            <domain type='kvm' id='2'>
>              ......
>              <cputune>
>                ......
>                <shares>1024</shares>
>                <memorytune vcpus='0-1'>
>                  <node id='0' bandwidth='20'/>
>                </memorytune>
>              </cputune>
>              ......
>            </domain>
> 
>           id         --- node where memory bandwidth allocation will happen
>           bandwidth  --- bandwidth allocated in percentage
> 
>      ----------------------------------------------------------------------------------------------
> 
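To make the two XML fragments above concrete, here is a minimal sketch that reads the host `<memory_bandwidth>` controls and the domain `<memorytune>` requests and checks one against the other. The helper names are hypothetical, and the validation rules (not below `min`, not above 100, a multiple of `granularity`) are assumptions inferred from the attribute descriptions rather than a statement of what the patches enforce.

```python
import xml.etree.ElementTree as ET

# Fragments copied from the examples above, trimmed to the relevant elements.
HOST_XML = """
<memory_bandwidth>
  <node id='0' cpus='0-19'>
    <control granularity='10' min='10' maxAllocs='8'/>
  </node>
</memory_bandwidth>
"""

DOMAIN_XML = """
<cputune>
  <memorytune vcpus='0-1'>
    <node id='0' bandwidth='20'/>
  </memorytune>
</cputune>
"""


def host_controls(xml_text):
    """Map node id -> control attributes from a <memory_bandwidth> element."""
    controls = {}
    for node in ET.fromstring(xml_text).findall("node"):
        ctl = node.find("control")
        controls[int(node.get("id"))] = {
            "granularity": int(ctl.get("granularity")),
            "min": int(ctl.get("min")),
            "maxAllocs": int(ctl.get("maxAllocs")),
        }
    return controls


def domain_requests(xml_text):
    """List (vcpus, node id, bandwidth) for each <memorytune>/<node>."""
    requests = []
    for tune in ET.fromstring(xml_text).findall("memorytune"):
        for node in tune.findall("node"):
            requests.append(
                (tune.get("vcpus"), int(node.get("id")), int(node.get("bandwidth")))
            )
    return requests


def check_request(controls, node_id, bandwidth):
    """Return None if the request looks valid, else an error string.

    The three rules below are assumptions made for this sketch.
    """
    ctl = controls.get(node_id)
    if ctl is None:
        return "node %d has no memory bandwidth control" % node_id
    if bandwidth < ctl["min"]:
        return "bandwidth %d is below the minimum %d" % (bandwidth, ctl["min"])
    if bandwidth > 100:
        return "bandwidth %d exceeds 100" % bandwidth
    if bandwidth % ctl["granularity"] != 0:
        return "bandwidth %d is not a multiple of granularity %d" % (
            bandwidth, ctl["granularity"])
    return None
```

Under these assumptions, the 20% request on node 0 from the domain example passes against the host example, while 25% (not a multiple of 10) or 5% (below the minimum) would be rejected.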
> With this extension of virresctrl, the overall workflow of CAT and MBA is shown in the picture
> below. The XML parser aggregates the MBA and CAT configuration and represents it in one virresctrl object.
> The methods of the virresctrl class manipulate the resctrl interface to allocate the corresponding resources.
> 
> 
>      <memorytune vcpus='0-3'>
>                     +---------+
>                               |    <cachetune vcpus='0-3'>
>         XML                   |           +
>        parser                 +-----------+
>                               |
>                               |
>                  +------------------------------+
>                               |
>                               |
> internal object        +------v--------------+  
> virResctrlAlloc        |   backing object    |
>                        +------+--------------+
>                               |
>                               |
>                  +------------------------------+
>                               |
>                            +--v-------+
>                            |          |
>                            | schemata |
>  /sys/fs/resctrl           | tasks    |
>                            |   .      |
>                            |   .      |
>                            |          |
>                            +----------+
> ---------------------------------------------------------------------
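For readers unfamiliar with the last box in the picture: the kernel's resctrl `schemata` file carries memory bandwidth as a line of the form `MB:0=20;1=100`, one `id=percent` pair per node. A minimal sketch of formatting and parsing such a line follows; the function names are hypothetical and this is not the parser added by the series.

```python
def format_mb_schemata(allocs):
    """Format {node id: percent} as a resctrl MB schemata line.

    Returns an empty string when there is nothing to allocate.
    """
    if not allocs:
        return ""
    parts = ["%d=%d" % (node, allocs[node]) for node in sorted(allocs)]
    return "MB:" + ";".join(parts)


def parse_mb_schemata(line):
    """Parse an 'MB:<id>=<percent>;...' line into {node id: percent}."""
    resource, _, body = line.strip().partition(":")
    if resource.strip() != "MB":
        raise ValueError("not an MB schemata line: %r" % line)
    result = {}
    for item in body.split(";"):
        node, _, percent = item.partition("=")
        result[int(node)] = int(percent)
    return result
```

A group limited to 20% on node 0 and unrestricted on node 1 would be written as `MB:0=20;1=100`, and parsing that line gives back the same mapping.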
> 
> Previous versions and discussion can be found at
>     v1: https://www.redhat.com/archives/libvir-list/2018-July/msg01144.html
> RFC v2: https://www.redhat.com/archives/libvir-list/2018-June/msg01268.html
> RFC v1: https://www.redhat.com/archives/libvir-list/2018-May/msg02101.html
> 
> Changelog:
>        v1 -> this: John's comment: 1. Split the calculation of the number of memory bandwidth controls
>                                       into its own patch.
>                                    2. Split the virResctrlAllocMemBW related methods into 5 patches, each
>                                       providing one kind of function, e.g. schemata processing, memory
>                                       bandwidth calculation.....
>                                    3. Use resctrl to replace cachetune in domain conf.
>                                    4. Split the refactoring of virDomainCachetuneDefParse into 3 patches, and
>                                       adjust some logic, e.g. use the %s format for error logs, rename
>                                       functions.....
>                                    5. Complete the doc description, e.g. update the cachetune part about vcpus
>                                       overlapping with memorytune, update libvirt version info for memory
>                                       bandwidth control availability.
>                                    6. Some coding style fixes.
> 
>        RFC_v2->v1: John's comment: 1. use the name MemBW to replace MB for a clearer description.
>                                    2. split the rename patch and put the function refactoring part separately.
>                                    3. split virResctrlInfoMemMB and virResctrlAllocMemBW into different
>                                       patches.
>                                    4. add docs/schemas/*.rng for XML related patches.
>                                    5. some cleanup for coding conventions.
>        RFC_v1->RFC_v2:
>             Jano's comment: 1. put the renaming parts into separate patches.
>                             2. set the initial return value to -1.
>                             3. use full names in structure definitions.
>                             4. do not use VIR_CACHE_TYPE_LAST for memory bandwidth allocation formatting.
> 
>             Pavel's comment: 1. add host capabilities XML for memory bandwidth allocation.
>                              2. do not reuse the cachetune section of the domain XML for memory bandwidth
>                                 allocation; define a dedicated one for memory bandwidth allocation.
> 
> Bing Niu (17):
>   util: Rename some functions of virresctrl
>   util: Refactor virResctrlGetInfo in virresctrl
>   util: Refactor virResctrlAllocFormat of virresctrl
>   util: Add MBA capability information query to resctrl
>   util: Add MBA check to virResctrlInfoGetCache
>   util: Add MBA allocation to virresctrl
>   util: Add MBA schemata parse and format methods
>   util: Add support to calculate MBA utilization
>   util: Introduce virResctrlAllocForeachMemory
>   util: Introduce virResctrlAllocSetMemoryBandwidth
>   conf: Rename cachetune to resctrl
>   conf: Factor out vcpus parsing part from virDomainCachetuneDefParse
>   conf: Factor out vcpus overlapping from virDomainCachetuneDefParse
>   conf: Factor out virDomainResctrlDef update from
>     virDomainCachetuneDefParse
>   conf: Add support for memorytune XML processing for resctrl MBA
>   conf: Add return value check to virResctrlAllocForeachCache
>   conf: Add memory bandwidth allocation capability of host
> 
>  docs/formatdomain.html.in                          |  39 +-
>  docs/schemas/capability.rng                        |  33 ++
>  docs/schemas/domaincommon.rng                      |  17 +
>  src/conf/capabilities.c                            | 107 ++++
>  src/conf/capabilities.h                            |  11 +
>  src/conf/domain_conf.c                             | 428 ++++++++++++---
>  src/conf/domain_conf.h                             |  10 +-
>  src/libvirt_private.syms                           |   6 +-
>  src/qemu/qemu_domain.c                             |   2 +-
>  src/qemu/qemu_process.c                            |  18 +-
>  src/util/virresctrl.c                              | 611 +++++++++++++++++++--
>  src/util/virresctrl.h                              |  55 +-
>  .../memorytune-colliding-allocs.xml                |  30 +
>  .../memorytune-colliding-cachetune.xml             |  32 ++
>  tests/genericxml2xmlindata/memorytune.xml          |  33 ++
>  tests/genericxml2xmltest.c                         |   5 +
>  .../linux-resctrl/resctrl/info/MB/bandwidth_gran   |   1 +
>  .../linux-resctrl/resctrl/info/MB/min_bandwidth    |   1 +
>  .../linux-resctrl/resctrl/info/MB/num_closids      |   1 +
>  tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml   |   8 +
>  tests/virresctrldata/resctrl.schemata              |   1 +
>  21 files changed, 1280 insertions(+), 169 deletions(-)
>  create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml
>  create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml
>  create mode 100644 tests/genericxml2xmlindata/memorytune.xml
>  create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran
>  create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth
>  create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids
> 

Reviewed-by: John Ferlan <jferlan at redhat.com>
(series)

I'll push once the tree is open for 4.7.0 commits unless someone else
chimes in with other major issues that need to be addressed.

John



