[libvirt] [PATCH v2 0/4] Add cpu hotplug support to libvirt.

Tang Chen tangchen at cn.fujitsu.com
Thu Sep 6 03:27:32 UTC 2012


Hi Daniel,

On 09/05/2012 05:43 PM, Daniel P. Berrange wrote:
>
> Your patch appears to work in some limited scenarios, but more
> generally it will fail to work, and result in undesirable
> behaviour.
>
> Consider for example, if libvirtd is configured thus:
>
>    cd /sys/fs/cgroup/cpuset
>    mkdir demo
>    cd demo
>    echo 2-3 > cpuset.cpus
>    echo 0 > cpuset.mems
>    echo $$ > tasks
>    /usr/sbin/libvirtd
>
> ie, libvirtd is now running on cpus 2-3, in group 'demo'. VMs will
> be created in
>
>    /sys/fs/cgroup/cpuset/demo/libvirt/qemu/$VMNAME
>
> Your patch attempts to set the cpuset.cpus on 'libvirt/qemu/$VMNAME'
> but ignores the fact that there could be many higher directories
> (eg demo here) that need setting. libvirtd, however, should not be
> responsible for / allowed to change settings in parent cgroups from
> where it was started.  ie in this example, libvirtd should *not*
> touch the 'demo' cgroup.
>
Yes, I hadn't considered this situation. Thanks for reminding me. :)
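
To make the constraint concrete: in the cgroup v1 cpuset controller, a
child's cpuset.cpus has to be a subset of its parent's, so a write to
'libvirt/qemu/$VMNAME' is bounded by every ancestor, including 'demo'.
Below is a minimal Python sketch of that check. The helper names, the
8-CPU host and the masks in the hierarchy are hypothetical, chosen only
for illustration; this is not libvirt code.

```python
def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "0-1,4" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def first_blocking_ancestor(ancestor_masks, wanted):
    """Return the name of the first ancestor whose mask lacks a wanted CPU.

    ancestor_masks: list of (cgroup name, cpuset.cpus string), root first.
    Returns None if every ancestor already allows all wanted CPUs.
    """
    wanted_cpus = parse_cpulist(wanted)
    for name, mask in ancestor_masks:
        if not wanted_cpus <= parse_cpulist(mask):
            return name
    return None


# Hypothetical hierarchy modelled on the example: libvirtd started in 'demo'.
hierarchy = [
    ("/", "0-7"),
    ("demo", "2-3"),
    ("demo/libvirt", "2-3"),
    ("demo/libvirt/qemu", "2-3"),
]
print(first_blocking_ancestor(hierarchy, "2-5"))  # demo
print(first_blocking_ancestor(hierarchy, "3"))    # None
```

In the first call the request for cpus 2-5 is blocked by 'demo', which
is exactly the group libvirtd should not touch.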

> So consider systemd starting tasks, giving them custom cgroups.
> Now systemd also has to listen for netlink events and reset the
> cpuset masks.
>
> Things are even worse if the admin has temporarily offlined all the
> cpus that are associated with the current cpuset. When this happens
> the kernel throws libvirtd and all its VMs out of their current
> cgroups and dumps them up in a parent cgroup (potentially even the
> root group). This is really awful.
>
Agreed. :)

>
> IMHO, execution of those tasks should simply be paused (same way that
> the 'freezer' cgroup pauses tasks). The admin can then either move
> the tasks to an alternate cgroup, or change the cpuset mask to allow
> them to continue running.
>
> The kernel's current behaviour of pushing all tasks up into a parent
> cgroup is just crazy - it is just throwing away the user's requested
> cpu mask forever :-(
>
>> If I want to solve the start failure problem, what should I do?
>
> I maintain that the problems we see with the cpuset controller cannot be
> reasonably solved by libvirtd, or userspace in general. The kernel
> behaviour is just
> flawed. If the kernel won't fix it, then we should recommend people not
> to use the cpuset cgroup at all, and just rely on our sched_setaffinity
> support instead.

I like the sched_setaffinity idea. Let's just temporarily shut off the
cpuset cgroup in libvirt, shall we?

Since the cpuset cgroup was turned on while I was working on the
emulator-pin job, I will shut it off and rework all of this with
sched_setaffinity().
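
As a rough illustration of the direction only, not the actual patch:
the sketch below pins a task with the sched_setaffinity() system call
through Python's os module on Linux. The pin_to_cpus() helper is
hypothetical, and intersecting the request with the caller's own mask
is a conservative assumption made here, not something the kernel
requires.

```python
import os


def pin_to_cpus(pid, wanted):
    """Pin pid to the CPUs in `wanted` that the caller may currently use.

    pid 0 means the calling process. Returns the mask actually applied.
    """
    allowed = os.sched_getaffinity(0)   # caller's current affinity mask
    usable = set(wanted) & allowed      # assumption: stay inside that mask
    if not usable:
        raise ValueError("none of the requested CPUs is available")
    os.sched_setaffinity(pid, usable)
    return os.sched_getaffinity(pid)


# Pin ourselves to one CPU we are already allowed on, then restore.
original = os.sched_getaffinity(0)
one_cpu = {min(original)}
print(pin_to_cpus(0, one_cpu))
os.sched_setaffinity(0, original)
```

Unlike a cpuset.cpus write, this touches only the affinity of the one
task, so no cgroup in the hierarchy has to be modified.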

And I will send new patches soon. Thanks. :)

>
> Daniel



