[libvirt] [PATCH RESEND RFC v4 1/6] Introduce the function virCgroupForVcpu

Adam Litke agl at us.ibm.com
Thu Jul 21 13:49:28 UTC 2011


On 07/21/2011 08:34 AM, Daniel P. Berrange wrote:
> On Thu, Jul 21, 2011 at 07:54:05AM -0500, Adam Litke wrote:
>> Added Anthony to give him the opportunity to address the finer points of
>> this one especially with respect to the qemu IO thread(s).
>>
>> This feature is really about capping the compute performance of a VM
>> such that we get consistent top end performance.  Yes, qemu has non-VCPU
>> threads that this patch set doesn't govern, but that's the point.  We
>> are not attempting to throttle IO or device emulation with this feature.
>>  It's true that an IO-intensive guest may consume more host resources
>> than a compute intensive guest, but they should still have equal top-end
>> CPU performance when viewed from the guest's perspective.
> 
> I could be misunderstanding what you're trying to achieve here, so
> perhaps we should consider an example.

From your example, it's clear to me that you understand the use case well.

>  - A machine has 4 physical CPUs
>  - There are 4 guests on the machine
>  - Each guest has 2 virtual CPUs
> 
> So we've overcommitted the host CPU resources 2x here.
> 
> Lets say that we want to use this feature to ensure consistent
> top end performance of every guest, splitting the host pCPUs
> resources evenly across all guests, so each guest is ensured
> 1 pCPU worth of CPU time overall.
> 
> This patch lets you do this by assigning caps per VCPU. So
> in this example, each VCPU cgroup would have to be configured
> to cap the VCPUs at 50% of a single pCPU.
> 
> This leaves the other QEMU threads uncapped / unaccounted
> for. If any one guest causes non-trivial compute load in
> a non-VCPU thread, this can/will impact the top-end compute
> performance of all the other guests on the machine.
> 
> If we did caps per VM, then you could set the VM cgroup
> such that the VM as a whole had 100% of a single pCPU.
> 
> If a guest is 100% compute bound, it can use its full
> 100% of a pCPU allocation in vCPU threads. If any other
> guest is causing CPU time in a non-VCPU thread, it cannot
> impact the top end compute performance of VCPU threads in
> the other guests.
> 
> A per-VM cap would, however, mean a guest with 2 vCPUs
> could have unequal scheduling, where one vCPU claimed 75%
> of the pCPU and the other vCPU got left with only 25%.
> 
> So AFAICT, per-VM cgroups is better for ensuring top
> end compute performance of a guest as a whole, but
> per-VCPU cgroups can ensure consistent top end performance
> across vCPUs within a guest.
> 
> IMHO, per-VM cgroups is the more useful because it is the
> only way to stop guests impacting each other, but there
> could be additional benefits of *also* having per-VCPU cgroups
> if you want to ensure fairness of top-end performance across
> vCPUs inside a single VM.

What this says to me is that per-VM cgroups _in_addition_to_ per-vcpu
cgroups is the _most_ useful situation.  Since I can't think of any
cases where someone would want per-VM and not per-vcpu, how about we
always do both when supported?  We can still use one pair of tunables
(<period> and <quota>) and try to do the right thing.  For example:

<vcpus>2</vcpus>
<cputune>
  <period>500000</period>
  <quota>250000</quota>
</cputune>

This would have the following behavior for qemu-kvm (vcpu threads):

Global VM cgroup: cfs_period:500000 cfs_quota:500000
Each vcpu cgroup: cfs_period:500000 cfs_quota:250000

and this behavior for qemu without vcpu threads:

Global VM cgroup: cfs_period:500000 cfs_quota:500000
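
The mapping itself is simple; here's a rough sketch of the logic I have
in mind (not actual libvirt code, the helper name is made up):

#include <stdio.h>

/* Hypothetical helper (not from this patch): expand one <period>/<quota>
 * pair into the two levels of cgroup settings described above.  Each
 * vcpu cgroup gets the raw quota, so no single vcpu can exceed
 * quota/period of a pCPU; the VM cgroup gets quota * nvcpus, so the
 * guest as a whole is capped at nvcpus times that. */
static void
computeCaps(unsigned int nvcpus, long long quota,
            long long *vmQuota, long long *vcpuQuota)
{
    if (quota < 0) {                     /* -1 means "no limit" in cfs */
        *vmQuota = *vcpuQuota = -1;
        return;
    }
    *vcpuQuota = quota;
    *vmQuota = quota * nvcpus;
}

int main(void)
{
    long long vmQuota, vcpuQuota;

    /* The example above: 2 vcpus, period 500000, quota 250000. */
    computeCaps(2, 250000, &vmQuota, &vcpuQuota);
    printf("Global VM cgroup: cfs_period:500000 cfs_quota:%lld\n", vmQuota);
    printf("Each vcpu cgroup: cfs_period:500000 cfs_quota:%lld\n", vcpuQuota);
    return 0;
}

cfs_period would stay the same at both levels; only the quota is scaled
by the vcpu count.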

It's true that IO could still throw off the scheduling balance somewhat
among vcpus _within_ a VM, but this effect would be confined to the VM
itself.

Best of both worlds?

-- 
Adam Litke
IBM Linux Technology Center
