[libvirt] [PATCH v2 0/4] Add cpu hotplug support to libvirt.
Tang Chen
tangchen at cn.fujitsu.com
Thu Sep 6 03:27:32 UTC 2012
Hi Daniel,
On 09/05/2012 05:43 PM, Daniel P. Berrange wrote:
>
> Your patch appears to work in some limited scenarios, but more
> generally it will fail to work, and result in undesirable
> behaviour.
>
> Consider for example, if libvirtd is configured thus:
>
> cd /sys/fs/cgroup/cpuset
> mkdir demo
> cd demo
> echo 2-3 > cpuset.cpus
> echo 0 > cpuset.mems
> echo $$ > tasks
> /usr/sbin/libvirtd
>
> ie, libvirtd is now running on cpus 2-3, in group 'demo'. VMs will
> be created in
>
> /sys/fs/cgroup/cpuset/demo/libvirt/qemu/$VMNAME
>
> Your patch attempts to set the cpuset.cpus on 'libvirt/qemu/$VMNAME'
> but ignores the fact that there could be many higher directories
> (eg demo here) that need setting. libvirtd, however, should not be
> responsible for / allowed to change settings in parent cgroups from
> where it was started. ie in this example, libvirtd should *not*
> touch the 'demo' cgroup.
>
Yes, I hadn't considered this situation. Thanks for reminding me. :)
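
To make the boundary described above concrete, here is a minimal Python
sketch (not libvirt code; libvirt itself is written in C). It assumes the
cpuset mount point and the 'demo' group from the quoted example. The helper
names vm_cgroup and may_modify are hypothetical, and the check does no
normalization of ".." components.

```python
from pathlib import PurePosixPath

# Mount point taken from the quoted example.
CPUSET_ROOT = PurePosixPath("/sys/fs/cgroup/cpuset")


def vm_cgroup(start_group, vm_name):
    """Return the cpuset directory a VM would get, following the example layout."""
    return CPUSET_ROOT / start_group.strip("/") / "libvirt" / "qemu" / vm_name


def may_modify(start_group, target):
    """True only for groups strictly below the group libvirtd was started in."""
    start = CPUSET_ROOT / start_group.strip("/")
    return target != start and start in target.parents
```

With start_group "demo", the VM directory and the intermediate 'libvirt'
and 'qemu' directories pass the check, while 'demo' itself and anything
above it do not.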
> So consider systemd starting tasks, giving them custom cgroups.
> Now systemd also has to listen for netlink events and reset the
> cpuset masks.
>
> Things are even worse if the admin has temporarily offlined all the
> cpus that are associated with the current cpuset. When this happens
> the kernel throws libvirtd and all its VMs out of their current
> cgroups and dumps them up in a parent cgroup (potentially even the
> root group). This is really awful.
>
Agreed. :)
>
> IMHO, execution of those tasks should simply be paused (same way that
> the 'freezer' cgroup pauses tasks). The admin can then either move
> the tasks to an alternate cgroup, or change the cpuset mask to allow
> them to continue running.
>
> The kernel's current behaviour of pushing all tasks up into a parent
> cgroup is just crazy - it is just throwing away the user's requested
> cpu mask forever :-(
>
>> If I want to solve the start failure problem, what should I do?
>
> I maintain the problems we see with the cpuset controller cannot be reasonably
> solved by libvirtd, or userspace in general. The kernel behaviour is just
> flawed. If the kernel won't fix it, then we should recommend people not
> to use the cpuset cgroup at all, and just rely on our sched_setaffinity
> support instead.
I like the sched_setaffinity idea. Let's temporarily turn off the
cpuset cgroup in libvirt, shall we?
Since the cpuset cgroup was turned on while I was working on the emulator-pin
job, I will shut it off and rework all of this with sched_setaffinity().
I will send new patches soon. Thanks. :)
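
As a rough illustration of the sched_setaffinity() approach, the sketch
below uses Python's os.sched_getaffinity and os.sched_setaffinity, which
wrap the corresponding Linux system calls. It is not the libvirt
implementation. The helper name pin_task is hypothetical, and restricting
the request to the task's current affinity is an assumption made here so
that the call does not fail on CPUs the task cannot use.

```python
import os


def pin_task(pid, wanted):
    """Pin pid (0 means the calling process) to the usable subset of wanted CPUs."""
    # CPUs the task is currently permitted to run on.
    allowed = os.sched_getaffinity(pid)
    usable = set(wanted) & allowed
    if not usable:
        # Assumption: refuse rather than silently fall back to another mask.
        raise ValueError("none of the requested CPUs are available to the task")
    os.sched_setaffinity(pid, usable)
    # Read the mask back so the caller sees what the kernel actually applied.
    return os.sched_getaffinity(pid)
```

For example, pin_task(0, {2, 3}) would restrict the calling process to
CPUs 2-3 when both are available to it. Unlike the cpuset cgroup, this
sets a per-task mask and does not move the task between groups.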
>
> Daniel