[libvirt] [RFC PATCHv2 00/10] x86 RDT Cache Monitoring Technology (CMT)

Wang, Huaqiang huaqiang.wang at intel.com
Wed Jul 18 02:29:32 UTC 2018



> -----Original Message-----
> From: Martin Kletzander [mailto:mkletzan at redhat.com]
> Sent: Tuesday, July 17, 2018 5:11 PM
> To: Wang, Huaqiang <huaqiang.wang at intel.com>
> Cc: libvir-list at redhat.com; Feng, Shaohe <shaohe.feng at intel.com>; Niu, Bing
> <bing.niu at intel.com>; Ding, Jian-feng <jian-feng.ding at intel.com>; Zang, Rui
> <rui.zang at intel.com>
> Subject: Re: [libvirt] [RFC PATCHv2 00/10] x86 RDT Cache Monitoring
> Technology (CMT)
> 
> On Tue, Jul 17, 2018 at 07:19:41AM +0000, Wang, Huaqiang wrote:
> >Hi Martin,
> >
> >Thanks for your comments. Please see my reply inline.
> >
> >> -----Original Message-----
> >> From: Martin Kletzander [mailto:mkletzan at redhat.com]
> >> Sent: Tuesday, July 17, 2018 2:27 PM
> >> To: Wang, Huaqiang <huaqiang.wang at intel.com>
> >> Cc: libvir-list at redhat.com; Feng, Shaohe <shaohe.feng at intel.com>;
> >> Niu, Bing <bing.niu at intel.com>; Ding, Jian-feng
> >> <jian-feng.ding at intel.com>; Zang, Rui <rui.zang at intel.com>
> >> Subject: Re: [libvirt] [RFC PATCHv2 00/10] x86 RDT Cache Monitoring
> >> Technology (CMT)
> >>
> >> On Mon, Jul 09, 2018 at 03:00:48PM +0800, Wang Huaqiang wrote:
> >> >
> >> >This is the V2 of RFC and the POC source code for introducing x86
> >> >RDT CMT feature, thanks Martin Kletzander for his review and
> >> >constructive suggestion for V1.
> >> >
> >> >This series is trying to provide the similar functions of the perf
> >> >event based CMT, MBMT and MBML features in reporting cache
> >> >occupancy, total memory bandwidth utilization and local memory
> >> >bandwidth utilization information in libvirt. Firstly we focus on CMT.
> >> >
> >> >x86 RDT Cache Monitoring Technology (CMT) provides a method to
> >> >track the cache occupancy information per CPU thread. We are
> >> >leveraging the implementation of kernel resctrl filesystem and
> >> >create our patches on top of that.
> >> >
> >> >Describing the functionality from a high level:
> >> >
> >> >1. Extend the output of 'domstats' and report CMT information.
> >> >
> >> >Comparing with perf event based CMT implementation in libvirt, this
> >> >series extends the output of command 'domstats' and reports cache
> >> >occupancy information like these:
> >> ><pre>
> >> >[root at dl-c200 libvirt]# virsh domstats vm3 --cpu-resource
> >> >Domain: 'vm3'
> >> >  cpu.cacheoccupancy.vcpus_2.value=4415488
> >> >  cpu.cacheoccupancy.vcpus_2.vcpus=2
> >> >  cpu.cacheoccupancy.vcpus_1.value=7839744
> >> >  cpu.cacheoccupancy.vcpus_1.vcpus=1
> >> >  cpu.cacheoccupancy.vcpus_0,3.value=53796864
> >> >  cpu.cacheoccupancy.vcpus_0,3.vcpus=0,3
> >> ></pre>
> >> >The vcpus have been arranged into three monitoring groups, these
> >> >three groups cover vcpu 1, vcpu 2 and vcpus 0,3 respectively. Take
> >> >an example, the 'cpu.cacheoccupancy.vcpus_0,3.value' reports the
> >> >cache occupancy information for vcpu 0 and vcpu 3, the
> >> >'cpu.cacheoccupancy.vcpus_0,3.vcpus'
> >> >represents the vcpu group information.
> >> >
> >> >To address Martin's suggestion "beware as 1-4 is something else than
> >> >1,4 so you need to differentiate that.", the content of 'vcpus'
> >> >(cpu.cacheoccupancy.<groupname>.vcpus=xxx) has been specially
> >> >processed, if vcpus is a continuous range, e.g. 0-2, then the output
> >> >of cpu.cacheoccupancy.vcpus_0-2.vcpus will be like
> >> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0,1,2'
> >> >instead of
> >> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0-2'.
> >> >Please note that 'vcpus_0-2' is the name of this monitoring group;
> >> >it can be set to any other word in the XML configuration file or
> >> >changed at runtime with the command introduced in the following part.
> >> >
> >>
> >> One small nit according to the naming (but it shouldn't block any
> >> reviewers from reviewing, just keep this in mind for next version for
> >> example) is that this is still inconsistent.
> >
> >OK.  I'll try to use words such as 'cache', 'cpu resource' and avoid
> >using 'RDT', 'CMT'.
> >
> 
> Oh, you misunderstood, I meant the naming in the domstats output =)
> 
> >> The way domstats are structured when there is something like an
> >> array could shed some light into this.  What you suggested is really
> >> kind of hard to parse (although looks better).  What would you say to
> >> something like this:
> >>
> >>   cpu.cacheoccupancy.count=3
> >>   cpu.cacheoccupancy.0.value=4415488
> >>   cpu.cacheoccupancy.0.vcpus=2
> >>   cpu.cacheoccupancy.0.name=vcpus_2
> >>   cpu.cacheoccupancy.1.value=7839744
> >>   cpu.cacheoccupancy.1.vcpus=1
> >>   cpu.cacheoccupancy.1.name=vcpus_1
> >>   cpu.cacheoccupancy.2.value=53796864
> >>   cpu.cacheoccupancy.2.vcpus=0,3
> >>   cpu.cacheoccupancy.2.name=0,3
> >>
> >
> >Your arrangement looks more reasonable, thanks for your advice.
> >However, as I mentioned in another email that I sent to libvirt-list
> >hours ago, the kernel resctrl interface provides cache occupancy
> >information for each cache block for every resource group.
> >Maybe we need to expose the cache occupancy for each cache block.
> >If you agree, we need to refine the 'domstats' output message, how
> >about this:
> >
> >  cpu.cacheoccupancy.count=3
> >  cpu.cacheoccupancy.0.name=vcpus_2
> >  cpu.cacheoccupancy.0.vcpus=2
> >  cpu.cacheoccupancy.0.block.count=2
> >  cpu.cacheoccupancy.0.block.0.bytes=5488
> >  cpu.cacheoccupancy.0.block.1.bytes=4410000
> >  cpu.cacheoccupancy.1.name=vcpus_1
> >  cpu.cacheoccupancy.1.vcpus=1
> >  cpu.cacheoccupancy.1.block.count=2
> >  cpu.cacheoccupancy.1.block.0.bytes=7839744
> >  cpu.cacheoccupancy.1.block.1.bytes=0
> >  cpu.cacheoccupancy.2.name=0,3
> >  cpu.cacheoccupancy.2.vcpus=0,3
> >  cpu.cacheoccupancy.2.block.count=2
> >  cpu.cacheoccupancy.2.block.0.bytes=53796864
> >  cpu.cacheoccupancy.2.block.1.bytes=0
> >
> 
> What do you mean by cache block?  Is that (cache_size / granularity)?  In that
> case it looks fine, I guess (without putting too much thought into it).

No. By 'cache block' I mean a cache instance identified by its 'cache id',
the number stored in '/sys/devices/system/cpu/cpu*/cache/index*/id'.

A typical two-socket server node (with E5-2680 v4 CPUs, for example) has
one L3 cache per socket. If a resctrl monitoring group is created
(/sys/fs/resctrl/p0, for example), the cache occupancy of each of the two
L3 caches can be read separately from the file
/sys/fs/resctrl/p0/mon_data/mon_L3_00/llc_occupancy
and the file
/sys/fs/resctrl/p0/mon_data/mon_L3_01/llc_occupancy
Per-socket cache information is useful for detecting performance issues
such as unbalanced workloads, so we had better expose these details to
libvirt users.
'Cache block' is only my own term for the CPU cache identified by the
number found in '/sys/devices/system/cpu/cpu*/cache/index*/id'.
I welcome suggestions for a better name.
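
To make the layout above concrete, here is a minimal Python sketch (not
the libvirt implementation, which is in C) that reads the per-cache
llc_occupancy files of one monitoring group and prints them in the
'domstats' shape proposed earlier in this thread. The function names
read_llc_occupancy and format_group_stats are hypothetical, and the
choice to number blocks by position in sorted cache-id order is an
assumption, not something the patches define.

```python
import os
import re


def read_llc_occupancy(group_dir):
    """Return {cache_id: occupancy_in_bytes} for one resctrl monitoring group.

    group_dir is a monitoring group directory such as /sys/fs/resctrl/p0.
    Only mon_data/mon_L3_<id>/llc_occupancy files are considered.
    """
    mon_data = os.path.join(group_dir, "mon_data")
    occupancy = {}
    for entry in sorted(os.listdir(mon_data)):
        match = re.fullmatch(r"mon_L3_(\d+)", entry)
        if match is None:
            continue  # skip anything that is not an L3 monitoring directory
        path = os.path.join(mon_data, entry, "llc_occupancy")
        with open(path) as f:
            occupancy[int(match.group(1))] = int(f.read().strip())
    return occupancy


def format_group_stats(index, name, vcpus, occupancy):
    """Format one group in the proposed cpu.cacheoccupancy.* layout.

    Assumption: block N is the N-th cache id in ascending order.
    """
    prefix = f"cpu.cacheoccupancy.{index}"
    lines = [
        f"{prefix}.name={name}",
        f"{prefix}.vcpus={vcpus}",
        f"{prefix}.block.count={len(occupancy)}",
    ]
    for block, cache_id in enumerate(sorted(occupancy)):
        lines.append(f"{prefix}.block.{block}.bytes={occupancy[cache_id]}")
    return lines


if __name__ == "__main__":
    group = "/sys/fs/resctrl/p0"  # example group path from this mail
    if os.path.isdir(os.path.join(group, "mon_data")):
        stats = read_llc_occupancy(group)
        print("\n".join(format_group_stats(0, "vcpus_2", "2", stats)))
```

On a real host this needs the resctrl filesystem mounted and the group
already created; otherwise the script prints nothing.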

> 
> Martin



