[libvirt] [PATCHv5 13/19] conf: Add resctrl monitor configuration

John Ferlan jferlan at redhat.com
Fri Oct 12 22:29:41 UTC 2018



On 10/12/18 3:10 AM, Wang, Huaqiang wrote:
> 
>> -----Original Message-----
>> From: John Ferlan [mailto:jferlan at redhat.com]
>> Sent: Thursday, October 11, 2018 4:58 AM
>> To: Wang, Huaqiang <huaqiang.wang at intel.com>; libvir-list at redhat.com
>> Cc: Feng, Shaohe <shaohe.feng at intel.com>; Niu, Bing <bing.niu at intel.com>;
>> Ding, Jian-feng <jian-feng.ding at intel.com>; Zang, Rui <rui.zang at intel.com>
>> Subject: Re: [libvirt] [PATCHv5 13/19] conf: Add resctrl monitor configuration
>>
>>
>>
>> On 10/9/18 6:30 AM, Wang Huaqiang wrote:
>>> Introducing <monitor> element under <cachetune> to represent a cache
>>> monitor.
>>>
>>> Supports two kind of monitors, which are, monitor under default
>>> allocation or monitor under particular allocation.
>>
>> I still don't see what the big difference is with singling out default.
>> Does it really matter - what advantage is there. Having a monitor for 'n' vcpus is
>> a choice.
>>
>>
>>>
>>> Monitor supervises the cache or memory bandwidth usage for
>>
>> At this point memory bandwidth is not in play...
> 
> Yes. Will remove the memory BW related words.
> 
>>
>>> interested vcpu thread set, if the vcpu thread set is belong to some
>>> resctrl allocation, then the monitor will be created under this
>>> allocation, that is, creating a resctrl monitoring group directory
>>> under the directory of '@alloc->path/mon_group'. Otherwise, the
>>> monitor will be created under default allocation.
>>
>> This is becoming increasing difficult to describe/decipher - makes me wonder
>> who would really use it.
> 
> Monitor in default allocation could be commonly used when we want to
> observe the resource usage for VM vcpu threads that are not specified
> with any dedicated resource limitation through CAT or MBA technology.
> 
> And another common case is it will be used if CAT/MBA is not supported
> but CMT/MBM is enabled in libvirt.
> 
> The reason I have introduced the monitor in previous patch and further
> introducing the monitor in default allocation in this patch is the monitor
> n two different allocations has different behavior.
> 
> Let's focus on CAT and CMT technology in my below lines, since MBA and
> MBM are very similar cases to them.
> 
> As we know, before the CAT technology was introduced, any process in Linux
> is sharing the L3 cache along with all other active processes. After
> CAT is enabled in libvirt, it has the capability to apply cache isolation
> and assign dedicated amount of cache to some key VM vcpu threads.
> 
> If we want to observe the real L3 cache usage information, then we
> need the help of monitor.
> 
> == Monitor usage case 1: monitor in default allocation ==
> 
> If you want to get the cache utilization data before applying any cache
> isolation to the VM vcpu threads, you need to create a monitor in the
> default allocation, because you haven't create any cache allocation.
> 
> == Monitor usage case 2: monitor in non-default allocation ==
> 
> If you have created a cache allocation for specific VM vcpu treads
> and not sure how many cache-lines these VM vcpu threads are really used,
> you need to create a monitor under the this allocation to get real cache
> usage information.
> 
> If you find your VM vcpu threads only used a small part of cache that
> have allocated, you might think about to reduce its allocation.
> 
> == Usage for default monitor and non-default monitor ==
> 
> Since we have introduced the 'default monitor' and 'non-default monitor'
> concepts in previous patches, and now, you can monitor the cache usage
> for all VM vcpu threads that added to this allocation, and also you has
> the capability to monitor a subset of vcpu list of this allocation.
> 
> Without 'default monitor', it is impossible to get the cache usage for
> all vcpu threads in the allocation and at the same time get the cache
> usage for some highly interested vcpu threads of allocation.

"Default monitor" is in name only. It's just "a" monitor for all the
threads in cachetune (or memorytune eventually); whereas, a "non default
monitor" is "a" monitor for specific vcpus within the range in cachetune.

Thus your description is that a monitor can be a monitor all vcpus or a
monitor for some subset of of all vcpus. A <monitor> describes which
vcpus of a cachetune (or memorytune) can be monitored - there are 3
options - "all", "m-to-n", or "one-to-one".

What they relate to in the filesystem is the magic of the code of paths
to the data. What data structures are called or how this is described in
docs just doesn't seem to need to make a delineation.

> 
> == Monitor usage case 3: allocating dedicated cache and monitoring its
> usage of one VM, and getting its influence over another VM==
> 
> Think about the scenario that there are two VMs, it is a known information
> that one VM is very cache sensitive and don't want to share cache with
> other workloads, and for another VM, we have no knowledge about cache
> requirement, but it is required to monitor the cache usage for the two
> VMs.
> 
> With the concepts introduced until now. We need to create an allocation
> and for this VM, then create a default-monitor in this allocation for
> monitoring. For another VM, it is required to create a non-default
> monitor under default allocation.
> 
> 
> With introduced concepts, the allocation, the default allocation, the monitor
> and the default monitor, it is possible to fulfill requirement all scenarios.

I think it's far easier to describe 2 things rather than 4.

> 
> The creation of monitor does not have too much flexibility, as I stated
> in my reply of v5patch0's review comments, the monitors need to follow
> below rules:
> 
> 1. In each <cachetune> entry more than one monitors could be specified.
> 2. In each <cachetune> entry up to one allocation could be specified.
> 3. The allocation is using the vcpu list specified in <cachetune> attribute
> 'vcpus'.

An allocation is either from the list of vcpus for a cachetune or a
memorytune if cachetune for the listed vcpus in a memorytune doesn't
exist (and yes, doesn't conflict with any vcpus in any found cachetune).

If the domain was created with 4 vcpus, then is "theoretically" the
default allocation for 4 vcpus. Allocations beyond that allow one to
slice and dice *as long as* there is no overlap with the definitions.

I don't think "default" or "hidden" really needs to be described or
called out, it just is. Maybe I'm oversimplifying.

> 4. A monitor has the same vcpu list as allocation is allowed, and this
> monitor is allocation's default monitor.
> 5. A monitor has a subset vcpu list of allocation is allowed.
> 6. For non-default monitors, any vcpu list overlap is not permitted.
> And also, the number of monitors could be generated is subject to
> the hardware RMID numbers.
> 
> About the behavior of underlying '/sys/fs/resctrl' file system, you can
> get more detail from this link:
> https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
> 
>>
>>>
>>> For default allocation monitor, it will have such kind of XML layout:
>>>
>>>     <cachetune vcpus='1'>
>>>       <monitor level=3 vcpus='1'/>
>>>     </cachetune>
>>>
>>> For other type monitor, the XML layout will be something like:
>>>
>>>     <cachetune vcpus='2-4'>
>>>       <cache id='0' level='3' type='both' size='3' unit='MiB'/>
>>>       <cache id='1' level='3' type='both' size='3' unit='MiB'/>
>>>       <monitor level=3 vcpus='2'/>
>>>     </cachetune>
>>>
>>
>> Since all we get is the "cache_occupancy" that would seem to me to be most
>> important when there is a <cache> bank, right?  So what would the first (or your
>> default style) collect/display. Or does cache not really matter.
> 
> <cache> does not matter for monitor.
> Even in case of no CAT in system, the <cache> entry will never be shown
> under <cachetune>.
> 
> If valid <cache> entries exist, means allocating some part of cache-lines to
> VM vcpu threads specified in attribute 'vcpus'.
> If there is no <cache> entry existed in <cachetune>, it does not mean these
> vcpu threads does not use any cache resource, it means it will use the
> cache resource specified in default allocation. Normally the default
> allocation's cache resource is shared by a lot of cache insensitive workloads.
> 

Right it's not overriding anything. It's just using the values for all
vcpus present, but that's an implementation detail of the underlying
technology which it feels like could be "papered over".

>>
>> IOW: What is cache_occupancy measuring? Each cache? The entire thing? If
>> there's no cache elements, then what?
> 
> cache_occupancy is measuring based on cache bank. For Intel 2 socket xeon CPU,
> it is considered as two cache banks, one cache bank per socket. The typical
> output for each monitor of this case is:
> 
>      cpu.cache.0.name=vcpus_1
>      cpu.cache.0.vcpus=1
>      cpu.cache.0.bank.count=2          <--- 2 cache banks
>      cpu.cache.0.bank.0.id=0           <--- bank.id.0 cache_occypancy
>      cpu.cache.0.bank.0.bytes=9371648    _|
>      cpu.cache.0.bank.1.id=1           <--- bank.id.1 cache_occypancy
>      cpu.cache.0.bank.1.bytes=1081344    _|
> 
> If you want to know the total cache occupancy for VM vcpu threads of this
> monitor, you need to add them up.
> 

So if you have:

   <monitor... vcpus=0-1>

what do you get in output for cache_occupancy? 0 + 1?

>>
>> I honestly think this just needs to be simplified as much as possible.
> 
>> When you monitor specific vcpus within a cachetune, then you get what?
> 
> In this case, the monitor you created only monitors the specific vcpus
> you added for monitor.
> 
> Following two configurations satisfy your scenario, and the only monitor
> will detect the cache usage of thread of vcpu 2.
> 
>      <cachetune vcpus='2-4'>
>        <cache id='0' level='3' type='both' size='3' unit='MiB'/>
>        <cache id='1' level='3' type='both' size='3' unit='MiB'/>
>        <monitor level=3 vcpus='2'/>
>      </cachetune>
> 
>      <cachetune vcpus='2-4'>
>        <monitor level=3 vcpus='2'/>
>      </cachetune>
> 

Perhaps my question was mistyped or misinterpreted. In the above top
example, if we have <monitor ... vcpus='2-4'>, then do the values in
<cache> have any impact on the calculation as opposed to if they weren't
there?

> 
> 
>> If the cachetune has no specific cache entries, you get what? 
> 
> If no cache entry in cachetune, it will also get vcpu threads' cache
> utilization information based on cache bank.
> No cache entry specified for the cachetune, means it will use the cache
> allocating policy of default cache allocation, which is file
> /sys/fs/resctrl/schemata.
> 
> If valid cache entries are provided in cachetune, then an allocation will
> be created for the threads of vcpu listed in <cachetune> 'vcpus'
> attribute. Supposing the allocation is the directory /sys/fs/resctrl/p0,
> then the cache resource limitation was applied on these threads.
> 
> For monitor, it does not care if vcpu threads are allowed or not alloowed to
> access a limit amount of cache-lines. Monitor only reports the amount of
> cache has been accesses.
> 
>> If you monitor
>> multiple vcpus within a cachetune then you get what? (?An aggregation of all?).
> 
> Yes.
> supposing you have this vcpus setting for <cachetune>
>     <cachetune vcpus='0-4,8' ..../>
> 
> and you choose to monitor the cache usage for vcpu 0,3,8, then you create
> following monitor entry inside the cachetune entry, with the output of
> monitor, you will get an aggregative cache occupancy information for threads
> of vcpu 0,3,8.
> 
>     <cachetune vcpus='0-4,8'/>
>       <monitor level='3' vcpus='0,3,8'/>
>     </cachetune>
> 
>> This whole default and specific description doesn't make sense.
> 
> Sorry for make you confused, I'll try to refine the descriptions.
> 

In this last case if you also had

   <monitor level='3', vcpus='4'/>
   <monitor level='3', vcpus='0-4,8'/>

then I'd expect that the values output in "0-4,8" to match those that I
could add by myself with "4" and "0-3,8".  True?

Is it apparent yet why I'm saying mentioning default just confuses
things?  If so, I'm not sure what else I can do to explain.

>>
>>> Signed-off-by: Wang Huaqiang <huaqiang.wang at intel.com>
>>> ---
>>>  docs/formatdomain.html.in                          |  26 +++
>>>  docs/schemas/domaincommon.rng                      |  10 +
>>>  src/conf/domain_conf.c                             | 217 ++++++++++++++++++++-
>>>  src/conf/domain_conf.h                             |  11 ++
>>>  tests/genericxml2xmlindata/cachetune-cdp.xml       |   3 +
>>>  .../cachetune-colliding-monitor.xml                |  30 +++
>>>  tests/genericxml2xmlindata/cachetune-small.xml     |   7 +
>>>  tests/genericxml2xmltest.c                         |   2 +
>>>  8 files changed, 301 insertions(+), 5 deletions(-)  create mode
>>> 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>>
>>> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
>>> index b1651e3..2fd665c 100644
>>> --- a/docs/formatdomain.html.in
>>> +++ b/docs/formatdomain.html.in
>>> @@ -759,6 +759,12 @@
>>>      <cachetune vcpus='0-3'>
>>>        <cache id='0' level='3' type='both' size='3' unit='MiB'/>
>>>        <cache id='1' level='3' type='both' size='3' unit='MiB'/>
>>> +      <monitor level='3' vcpus='1'/>
>>> +      <monitor level='3' vcpus='0-3'/>
>>> +    </cachetune>
>>> +    <cachetune vcpus='4-5'>
>>> +      <monitor level='3' vcpus='4'/>
>>> +      <monitor level='3' vcpus='5'/>
>>>      </cachetune>
>>>      <memorytune vcpus='0-3'>
>>>        <node id='0' bandwidth='60'/> @@ -978,6 +984,26 @@
>>>                </dd>
>>>              </dl>
>>>            </dd>
>>> +          <dt><code>monitor</code></dt>
>>> +          <dd>
>>> +            The optional element <code>monitor</code> creates the cache
>>> +            monitor(s) for current cache allocation and has the following
>>> +            required attributes:
>>> +            <dl>
>>> +              <dt><code>level</code></dt>
>>> +              <dd>
>>> +                Host cache level the monitor belongs to.
>>> +              </dd>
>>> +              <dt><code>vcpus</code></dt>
>>> +              <dd>
>>> +                vCPU list the monitor applies to. A monitor's vCPU list
>>> +                can only be the member(s) of the vCPU list of associating
>>> +                allocation. The default monitor has the same vCPU list as the
>>> +                associating allocation. For non-default monitors, there
>>> +                are no vCPU overlap permitted.
>>> +              </dd>
>>> +            </dl>
>>> +          </dd>
>>>          </dl>
>>>        </dd>
>>>
>>> diff --git a/docs/schemas/domaincommon.rng
>>> b/docs/schemas/domaincommon.rng index 5c533d6..7ce49d3 100644
>>> --- a/docs/schemas/domaincommon.rng
>>> +++ b/docs/schemas/domaincommon.rng
>>> @@ -981,6 +981,16 @@
>>>                  </optional>
>>>                </element>
>>>              </zeroOrMore>
>>> +            <zeroOrMore>
>>> +              <element name="monitor">
>>> +                <attribute name="level">
>>> +                  <ref name='unsignedInt'/>
>>> +                </attribute>
>>> +                <attribute name="vcpus">
>>> +                  <ref name='cpuset'/>
>>> +                </attribute>
>>> +              </element>
>>> +            </zeroOrMore>
>>>            </element>
>>>          </zeroOrMore>
>>>          <zeroOrMore>
>>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index
>>> 9a514a6..4f4604f 100644
>>> --- a/src/conf/domain_conf.c
>>> +++ b/src/conf/domain_conf.c
>>> @@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr
>>> loader)
>>>
>>>
>>>  static void
>>> +virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon) {
>>> +    if (!domresmon)
>>> +        return;
>>> +
>>> +    virBitmapFree(domresmon->vcpus);
>>> +    virObjectUnref(domresmon->instance);
>>
>>     VIR_FREE(domresmon);
> 
> I forget to free monitor itself. Will be fixed.
> 

Coverity found this!

>>
>>> +}
>>> +
>>> +
>>> +static void
>>>  virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl)  {
>>> +    size_t i = 0;
>>> +
>>>      if (!resctrl)
>>>          return;
>>>
>>> +    for (i = 0; i < resctrl->nmonitors; i++)
>>> +        virDomainResctrlMonDefFree(resctrl->monitors[i]);
>>> +
>>>      virObjectUnref(resctrl->alloc);
>>>      virBitmapFree(resctrl->vcpus);
>>> +    VIR_FREE(resctrl->monitors);
>>>      VIR_FREE(resctrl);
>>>  }
>>>
>>> @@ -18919,6 +18936,154 @@
>> virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt,
>>>      return ret;
>>>  }
>>>
>>
>> Two blank lines
> 
> OK.
> 
>>
>>> +/* Checking if the monitor's vcpus is conflicted with existing
>>> +allocation
>>> + * and monitors.
>>> + *
>>> + * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a
>>> +default
>>> + * monitor. Returns - 1 if a conflict found. Returns 0 if no conflict
>>> +and
>>> + * @vcpus is not equal to @resctrl->vcpus.
>>> + * */
>>> +static int
>>> +virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl,
>>> +                                virBitmapPtr vcpus)
>>
>> This should be *ValidateVcpus.
> 
> OK.
> 
>>
>>> +{
>>> +    size_t i = 0;
>>> +    int vcpu = -1;
>>> +
>>> +    if (virBitmapIsAllClear(vcpus)) {
>>> +        virReportError(VIR_ERR_INVALID_ARG, "%s",
>>> +                       _("vcpus is empty"));
>>> +        return -1;
>>> +    }
>>> +
>>> +    while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) {
>>> +        if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) {
>>> +            virReportError(VIR_ERR_INVALID_ARG, "%s",
>>> +                           _("Monitor vcpus conflicts with allocation"));
>>> +            return -1;
>>> +        }
>>> +    }
>>> +
>>> +    if (resctrl->alloc && virBitmapEqual(vcpus, resctrl->vcpus))
>>
>> The ->alloc check is confusing in light of having a monitor as a child of
>> cachetune.
> 
> I am looking cachetune as an entity with the capability to do the performance
> tuning tasks by a better cache resource arrangement.
> 
> In a general performance tuning taks, first we need to get to know the
> resource utilization information and then applies some resource arrangement,
> such as cache isolation or allocation. @resctrl->alloc represents the
> role for resource arranging, @resctrl->monitors are for the performance
> detecting/monitoring part.
> 
> Performance monitoring belongs to the scope of performance tuning, just like
> the part doing the resource limitation. Based on this understanding, I
> combined @alloc and @monitors, and let them as a child of @resctrl.
> 
> If you still think it is strange to put @monitors under @resctrl, then I
> have to change it, creating a data structure at the level of @def->resctrls,
> then the new _virDomainDef structure would be:

I don't believe this is what I was saying.

> 
>     struct _virDomainDef {
>         ...
>         virDomainResctrlDefPtr *resctrls;
>         size_t nresctrls;
> 
>         virDomainResctrlMonDefPtr *monitors;
>         size_t nmonitors;
>         ...
>     }
> 
> It seems the "virDomainResctrlDefPtr" should be refactored by reducing its
> scope that the name implies.
> 
> Further, even have refactored like this, I have to add a lot of code to maintain
> the relationship between @resctrls->vcpus and each one of @monitors->vcpus.
> 
>>
>>> +        return 1;
>>> +
>>> +    for (i = 0; i < resctrl->nmonitors; i++) {
>>> +        if (virBitmapEqual(resctrl->vcpus, resctrl->monitors[i]->vcpus))
>>> +            continue;
>>> +
>>> +        if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) {
>>> +            virReportError(VIR_ERR_INVALID_ARG, "%s",
>>> +                           _("Monitor vcpus conflicts with
>>> + monitors"));
>>> +
>>> +            return -1;
>>> +        }
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +
>>> +static int
>>> +virDomainResctrlMonDefParse(virDomainDefPtr def,
>>> +                            xmlXPathContextPtr ctxt,
>>> +                            xmlNodePtr node,
>>> +                            virResctrlMonitorType tag,
>>> +                            virDomainResctrlDefPtr resctrl) {
>>> +    virDomainResctrlMonDefPtr domresmon = NULL;
>>> +    xmlNodePtr oldnode = ctxt->node;
>>> +    xmlNodePtr *nodes = NULL;
>>> +    unsigned int level = 0;
>>> +    char * tmp = NULL;
>>> +    char * id = NULL;
>>> +    size_t i = 0;
>>> +    int n = 0;
>>> +    int rv = -1;
>>> +    int ret = -1;
>>> +
>>> +    ctxt->node = node;
>>> +
>>> +    if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) < 0) {
>>> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>>> +                       _("Cannot extract monitor nodes"));
>>> +        goto cleanup;
>>> +    }
>>> +
>>> +    for (i = 0; i < n; i++) {
>>> +
>>> +        if (VIR_ALLOC(domresmon) < 0)
>>> +            goto cleanup;
>>> +
>>> +        domresmon->tag = tag;
>>> +
>>> +        domresmon->instance = virResctrlMonitorNew();
>>> +        if (!domresmon->instance) {
>>> +            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>>> +                           _("Could not create monitor"));
>>> +            goto cleanup;
>>> +        }
>>> +
>>> +        if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
>>> +            tmp = virXMLPropString(nodes[i], "level");
>>> +            if (!tmp) {
>>> +                virReportError(VIR_ERR_XML_ERROR, "%s",
>>> +                               _("Missing monitor attribute 'level'"));
>>> +                goto cleanup;
>>> +            }
>>> +
>>> +            if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) {
>>> +                virReportError(VIR_ERR_XML_ERROR,
>>> +                               _("Invalid monitor attribute 'level' value '%s'"),
>>> +                               tmp);
>>> +                goto cleanup;
>>> +            }
>>> +
>>> +            if (virResctrlMonitorSetCacheLevel(domresmon->instance, level) < 0)
>>> +                goto cleanup;
>>> +
>>> +            VIR_FREE(tmp);
>>> +        }
>>> +
>>> +        if (virDomainResctrlParseVcpus(def, nodes[i], &domresmon->vcpus) < 0)
>>> +            goto cleanup;
>>> +
>>> +        rv = virDomainResctrlMonValidateVcpu(resctrl,
>>> + domresmon->vcpus);
>>> +
>>> +        /* If monitor's vcpu list is identical to allocation's vcpu list,
>>> +         * set as default monitor */
>>> +        if (rv == 1 && resctrl->alloc)
>>
>> I'm still not seeing the need for default... 
> 
> As I stated, default monitor, monitor, and monitor in default allocation has
> different character and behavior in resctrl, they are necessary.
> If we don't support default monitor, we could not monitor cache usage of whole
> allocation and also monitor some the cache usage of a subset of allocation
> vcpus at the same time.
> If we don't support default allocation, we will have to create many
> duplicated allocations that has same cache settings, that is sharing same
> cache resources.
> In that case, we'll lose the flexibility that kernel resctrl fs provided.
> 
>> FWIW: The resctrl->alloc check is
>> unnecessary since the only way rv == 1 is if it's there.
> 
> You are right, it is double checked, changes to 
> 
>     if (rv == 1)
>         virResctrlMonitorSetDefault(domresmon->instance);                        
> 
>>
>>> +            virResctrlMonitorSetDefault(domresmon->instance);
>>> +        else if (rv < 0)
>>> +            goto cleanup;
>>> +
>>> +        if (!(tmp = virBitmapFormat(domresmon->vcpus)))
>>> +            goto cleanup;
>>> +
>>> +        if (virAsprintf(&id, "vcpus_%s", tmp) < 0)
>>> +            goto cleanup;
>>> +
>>> +        if (virResctrlMonitorSetID(domresmon->instance, id) < 0)
>>> +            goto cleanup;
>>> +
>>> +        if (VIR_APPEND_ELEMENT(resctrl->monitors,
>>> +                               resctrl->nmonitors,
>>> +                               domresmon) < 0)
>>> +            goto cleanup;
>>> +
>>> +        VIR_FREE(id);
>>> +        VIR_FREE(tmp);
>>> +        domresmon = NULL;
>>> +    }
>>> +
>>> +    ret = 0;
>>> + cleanup:
>>> +    ctxt->node = oldnode;
>>> +    VIR_FREE(id);
>>> +    VIR_FREE(tmp);
>>> +    virDomainResctrlMonDefFree(domresmon);
>>
>>      VIR_FREE(nodes);
> 
> I forget to free it. Will be added.
> 

Again, Coverity

>>
>>> +    return ret;
>>> +}
>>> +
>>>
>>>  static virDomainResctrlDefPtr
>>>  virDomainResctrlNew(virResctrlAllocPtr alloc, @@ -19041,15 +19206,20
>>> @@ virDomainCachetuneDefParse(virDomainDefPtr def,
>>>          }
>>>      }
>>>
>>> -    if (virResctrlAllocIsEmpty(alloc)) {
>>> -        ret = 0;
>>> -        goto cleanup;
>>> -    }
>>> -
>>>      resctrl = virDomainResctrlNew(alloc, vcpus);
>>
>> So if @alloc == NULL (which is one of the reasons AllocIsEmpty is true) means
>> that we'll be virObjectRef(NULL) in ResctrlNew(). In this case @alloc could be
>> NULL if there were no <cache> entries, IIUC.
> 
> Your understanding is right.
> 
>>
>> I'm starting to see a real downside to a resctrl->alloc == NULL. I really don't
>> want to continue to see churn on the internal hierarchy though.
> 
> The downside, what you mean is the 'churn on the internal hierarchy though'?
> 

The downside is it makes no sense to have resctrl->alloc == NULL if a
<cachetune> or <memorytune> element exists.

If there's a <cachetune> or a <memorytune>, then a resctrl->alloc is
created for the listed vcpus. Everything builds off of that.

The <monitor> elements are children of cachetune (and I assume
eventually) memorytune. The fact that the filesystem has this other
"default" thing means what when a cachetune or memorytune element is
found - it's a child of the default thing

>>
>> However, if I look at how it could be filled in within the context of this function,
>> the virDomainResctrlVcpuMatch call and subsequent possible allocation of
>> @alloc would seemingly be possible outside the context of whether specific
>> <cache> entries existed.
> 
> If no <cache> entry, then 'n = virXPathNodeSet("./cache", ctxt, &nodes)'
> statement will assign n to 0. If n is 0, then virDomainResctrlVcpuMatch will
> not be called and its subsequent steps for creating @alloc will not be
> executed.
> 
>> Probably could just do away with the term default
>> allocation - all it seems to be is an allocation without <cache> elements, but it
>> can have <monitor> elements.
> 
> Correct, this is how code works.
> 
>> If someone places a <cachetune> without <cache> and without <monitor>, so
>> what - who cares. Probably doesn't do much other than limit other <cachetune>
>> (and perhaps <memorytune>) elements.
> 
> Yes.
> 
>>
>>>      if (!resctrl)
>>>          goto cleanup;
>>>
>>> +    if (virDomainResctrlMonDefParse(def, ctxt, node,
>>> +                                    VIR_RESCTRL_MONITOR_TYPE_CACHE,
>>> +                                    resctrl) < 0)
>>> +        goto cleanup;
>>> +
>>> +    if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) {
>>> +        ret = 0;
>>> +        goto cleanup;
>>> +    }
>>> +
>>
>> Moving the AllocIsEmpty check should be separate.
> 
> Got. Will be done.
> 
>>
>> I'm losing steam, but the next couple of patches had Coverity issues, so I figured
>> I'll note that before going back to read all the comments you've posted today
>> while I was reviewing without trying to go back.
>>
>> John
> 
> I use a lot of paragraph in introducing the usage of 'default monitor',
> 'default allocation' and 'monitor in non-default allocation', and states that
> these 'default' are necessary for purpose of saving RMIDs, removing
> allocation duplications and keeping the flexibility of monitor that kernel
> interface provided. Hope I tell them clearly.

Yes, lots of text, perhaps way too much. Still I'm no closer to figuring
out what the need for this default wording/mechnism really is.

John


> 
> Thanks for review.
> Huaqiang
> 
> 
>>
>>>      if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
>>>          goto cleanup;
>>>
>>> @@ -27085,12 +27255,42 @@
>> virDomainCachetuneDefFormatHelper(unsigned
>>> int level,
>>>
>>>
>>>  static int
>>> +virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr
>> domresmon,
>>> +                                   virResctrlMonitorType tag,
>>> +                                   virBufferPtr buf) {
>>> +IIUC.    char *vcpus = NULL;
>>> +    unsigned int level = 0;
>>> +
>>> +    if (domresmon->tag != tag)
>>> +        return 0;
>>> +
>>> +    virBufferAddLit(buf, "<monitor ");
>>> +
>>> +    if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
>>> +        level = virResctrlMonitorGetCacheLevel(domresmon->instance);
>>> +        virBufferAsprintf(buf, "level='%u' ", level);
>>> +    }
>>> +
>>> +    vcpus = virBitmapFormat(domresmon->vcpus);
>>> +    if (!vcpus)
>>> +        return -1;
>>> +
>>> +    virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus);
>>> +
>>> +    VIR_FREE(vcpus);
>>> +    return 0;
>>> +}
>>> +
>>> +
>>> +static int
>>>  virDomainCachetuneDefFormat(virBufferPtr buf,
>>>                              virDomainResctrlDefPtr resctrl,
>>>                              unsigned int flags)  {
>>>      virBuffer childrenBuf = VIR_BUFFER_INITIALIZER;
>>>      char *vcpus = NULL;
>>> +    size_t i = 0;
>>>      int ret = -1;
>>>
>>>      virBufferSetChildIndent(&childrenBuf, buf); @@ -27099,6 +27299,13
>>> @@ virDomainCachetuneDefFormat(virBufferPtr buf,
>>>                                      &childrenBuf) < 0)
>>>          goto cleanup;
>>>
>>> +    for (i = 0; i < resctrl->nmonitors; i ++) {
>>> +        if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i],
>>> +                                               VIR_RESCTRL_MONITOR_TYPE_CACHE,
>>> +                                               &childrenBuf) < 0)
>>> +            goto cleanup;
>>> +    }
>>> +
>>>      if (virBufferCheckError(&childrenBuf) < 0)
>>>          goto cleanup;
>>>
>>> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index
>>> e30a4b2..60f6464 100644
>>> --- a/src/conf/domain_conf.h
>>> +++ b/src/conf/domain_conf.h
>>> @@ -2236,12 +2236,23 @@ struct _virDomainCputune {  };
>>>
>>>
>>> +typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef;
>>> +typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr; struct
>>> +_virDomainResctrlMonDef {
>>> +    virBitmapPtr vcpus;
>>> +    virResctrlMonitorType tag;
>>> +    virResctrlMonitorPtr instance;
>>> +};
>>> +
>>>  typedef struct _virDomainResctrlDef virDomainResctrlDef;  typedef
>>> virDomainResctrlDef *virDomainResctrlDefPtr;
>>>
>>>  struct _virDomainResctrlDef {
>>>      virBitmapPtr vcpus;
>>>      virResctrlAllocPtr alloc;
>>> +
>>> +    virDomainResctrlMonDefPtr *monitors;
>>> +    size_t nmonitors;
>>>  };
>>>
>>>
>>> diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> b/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> index 9718f06..9f4c139 100644
>>> --- a/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> @@ -8,9 +8,12 @@
>>>      <cachetune vcpus='0-1'>
>>>        <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
>>>        <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
>>> +      <monitor level='3' vcpus='0'/>
>>> +      <monitor level='3' vcpus='1'/>
>>>      </cachetune>
>>>      <cachetune vcpus='2'>
>>>        <cache id='1' level='3' type='code' size='6' unit='MiB'/>
>>> +      <monitor level='3' vcpus='2'/>
>>>      </cachetune>
>>>      <cachetune vcpus='3'>
>>>        <cache id='1' level='3' type='data' size='6912' unit='KiB'/>
>>> diff --git
>>> a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>> b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>> new file mode 100644
>>> index 0000000..d481fb5
>>> --- /dev/null
>>> +++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>> @@ -0,0 +1,30 @@
>>> +<domain type='qemu'>
>>> +  <name>QEMUGuest1</name>
>>> +  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
>>> +  <memory unit='KiB'>219136</memory>
>>> +  <currentMemory unit='KiB'>219136</currentMemory>
>>> +  <vcpu placement='static'>4</vcpu>
>>> +  <cputune>
>>> +    <cachetune vcpus='0-1'>
>>> +      <cache id='0' level='3' type='both' size='768' unit='KiB'/>
>>> +      <monitor level='3' vcpus='2'/>
>>> +    </cachetune>
>>> +  </cputune>
>>> +  <os>
>>> +    <type arch='i686' machine='pc'>hvm</type>
>>> +    <boot dev='hd'/>
>>> +  </os>
>>> +  <clock offset='utc'/>
>>> +  <on_poweroff>destroy</on_poweroff>
>>> +  <on_reboot>restart</on_reboot>
>>> +  <on_crash>destroy</on_crash>
>>> +  <devices>
>>> +    <emulator>/usr/bin/qemu-system-i686</emulator>
>>> +    <controller type='usb' index='0'/>
>>> +    <controller type='ide' index='0'/>
>>> +    <controller type='pci' index='0' model='pci-root'/>
>>> +    <input type='mouse' bus='ps2'/>
>>> +    <input type='keyboard' bus='ps2'/>
>>> +    <memballoon model='virtio'/>
>>> +  </devices>
>>> +</domain>
>>> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml
>>> b/tests/genericxml2xmlindata/cachetune-small.xml
>>> index ab2d9cf..748be08 100644
>>> --- a/tests/genericxml2xmlindata/cachetune-small.xml
>>> +++ b/tests/genericxml2xmlindata/cachetune-small.xml
>>> @@ -7,6 +7,13 @@
>>>    <cputune>
>>>      <cachetune vcpus='0-1'>
>>>        <cache id='0' level='3' type='both' size='768' unit='KiB'/>
>>> +      <monitor level='3' vcpus='0'/>
>>> +      <monitor level='3' vcpus='1'/>
>>> +      <monitor level='3' vcpus='0-1'/>
>>> +    </cachetune>
>>> +    <cachetune vcpus='2-3'>
>>> +      <monitor level='3' vcpus='2'/>
>>> +      <monitor level='3' vcpus='3'/>
>>>      </cachetune>
>>>    </cputune>
>>>    <os>
>>> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c
>>> index fa941f0..4393d44 100644
>>> --- a/tests/genericxml2xmltest.c
>>> +++ b/tests/genericxml2xmltest.c
>>> @@ -137,6 +137,8 @@ mymain(void)
>>>                   TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>>      DO_TEST_FULL("cachetune-colliding-types", false, true,
>>>                   TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>> +    DO_TEST_FULL("cachetune-colliding-monitor", false, true,
>>> +                 TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>>      DO_TEST("memorytune");
>>>      DO_TEST_FULL("memorytune-colliding-allocs", false, true,
>>>                   TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>>
> 




More information about the libvir-list mailing list