On Mon, May 11, 2015 at 07:24:47PM +1000, Tony Breeds wrote:
> On Mon, May 11, 2015 at 10:52:17AM +0200, Martin Kletzander wrote:
>> On Mon, May 11, 2015 at 10:52:46AM +1000, Tony Breeds wrote:
>>> Hello all,
>>>     This is with reference to OpenStack (nova) bug 1439280.
>>>
>>> The symptom is that when nova(-compute) tries to launch an
>>> instance/VM it errors out with:
>>>     "libvirtError: Requested operation is not valid: cpu affinity is not supported"
>>>
>>> This only happens when using qemu in TCG mode.
>>
>> I can see where the error might have come from. QEMU doesn't have
>> vcpu threads if run in TCG mode, so if libvirt tries to pin some of
>> them, this will happen. *But* upstream handles this perfectly. I
>> just tried and it works. I'll explain why a few lines down the road.
>
> Okay, that's good to know.
>
>>> After looking at the domain XML and the docs at
>>> It seems to me the problem is either:
>>> 1) the libvirt driver in nova is generating valid XML that is an
>>>    invalid domain description; or
>>> 2) NUMA support in qemu (TCG) mode is broken.
>>
>> I don't get what this has to do with NUMA, but anyway I just think
>> this is a bug in older libvirt actually.
>
> The NUMA references are badness on my part. They come from the fact
> that the code in nova that does the CPU pinning is tied up with NUMA
> support. So with that in mind I'd reword my summary as:
>     "CPU pinning in qemu (TCG) mode is broken."

Oh, OK, I thought there is some connection I must've missed :)

>> Yes, that's because we can't differentiate which threads do what
>> with TCG accel. Because of that you must only specify one pinning
>> for all threads (I/O threads, CPU threads and the emulator thread),
>> which is done exactly using the 'cpuset' attribute as that is valid
>> for the whole machine. Thanks to that we don't have to differentiate
>> anything and just use the cpuset for the whole machine -> no error
>> should occur there.
>
> Ok, I think I follow that.
>
>>> Have I understood the documentation correctly? If so it would seem
>>> that the correct fix is in nova to teach the libvirt driver to
>>> generate the correct XML for this virtualisation type.
>>
>> You understood the documentation correctly, it was just libvirt
>> acting up a bit.
>
> Okay, that's good to know. So we can work around this in nova by
> either:
> a) checking for libvirt 1.2.15 in the appropriate TCG code path and
>    eliding the cpuset; or
> b) including 'emulatorpin' in the cputune node.
>
> 'a' is easy to do and is basically what I have proposed in the bug.
> Having said that, which is the better way to work around libvirt
> acting up a bit in older versions?

Determining this by version might not be reliable, but more importantly, working around a bug in the underlying software is something that shouldn't be done at all, IMHO. Let the maintainers backport whatever needs to be done.
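
To make the trade-off concrete, option (a) amounts to something like the sketch below. The helper names are hypothetical (they are neither nova's nor libvirt's API), and the 1.2.15 threshold is simply the one mentioned in option (a). The only outside fact assumed is that libvirt reports its version as a single integer, major * 1,000,000 + minor * 1,000 + release.

```python
# Hypothetical helpers for illustration only; not nova's or libvirt's API.

def encode_libvirt_version(major, minor, release):
    """Pack a version the way libvirt reports it as one integer."""
    return major * 1000000 + minor * 1000 + release


# Threshold taken from option (a) in this thread.
FIXED_IN = encode_libvirt_version(1, 2, 15)


def should_elide_cpuset(virt_type, lib_version):
    """True for a TCG guest ('qemu') on a libvirt older than the threshold."""
    return virt_type == "qemu" and lib_version < FIXED_IN


if __name__ == "__main__":
    old = encode_libvirt_version(1, 2, 2)
    print(should_elide_cpuset("qemu", old))
    print(should_elide_cpuset("kvm", old))
```

A distro that back-ports the fix into an older release would still be caught by this check, which is exactly why version-based detection is unreliable here.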

> The nova side will be pretty easy regardless. I'd say the best
> solution would be to back-port the 'fix', but that seems like a lot
> of effort given the number of distros and libvirt versions
> potentially involved.

If you want the fix to be distro-agnostic, there's nothing easier than back-porting the fix into our upstream maintenance branches. Those should make the life of distro maintainers easy (although it looks like not many distros use them). Having said that, I'm not sure which commit(s) need to be back-ported. Knowing your libvirt version, it shouldn't be too hard to look through the differences and find the right commit. When a back-porting request is made on the list, it is usually acted upon. If you can't find the exact commit, let me know and I'll do my best to help.

>> I'm still not sure why you are bringing NUMA into the mix, as there
>> is nothing NUMA-related in the bug or this mail. But to complete my
>> answer: even though there is no practical benefit to using NUMA on a
>> TCG-accelerated QEMU machine, it should still work.
>
> Okay, so as I said above, NUMA as such is a mistake. Thank you for
> confirming that when the correct XML is generated the
> instance/guest/VM works just fine.
>
> Yours Tony.
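
For reference, the whole-machine pinning discussed in this thread can be sketched as a domain XML fragment. This is a minimal illustration, not what nova generates. It assumes the 'cpuset' attribute sits on the `<vcpu>` element and that option (b) means an `<emulatorpin>` child of `<cputune>`. The function name and the sample values are made up, and a real domain description needs many more elements (name, memory, os, devices).

```python
import xml.etree.ElementTree as ET


def build_tcg_domain_cpu_xml(vcpus, cpuset, with_emulatorpin=False):
    """Build a partial <domain type='qemu'> with whole-machine CPU pinning.

    Hypothetical helper: only the CPU-related elements are emitted.
    """
    # type='qemu' selects plain emulation (TCG) rather than KVM.
    domain = ET.Element("domain", {"type": "qemu"})

    # One cpuset for the whole machine, as described in the thread.
    vcpu = ET.SubElement(domain, "vcpu", {"cpuset": cpuset})
    vcpu.text = str(vcpus)

    # Option (b) from the thread: also state the emulator pinning.
    if with_emulatorpin:
        cputune = ET.SubElement(domain, "cputune")
        ET.SubElement(cputune, "emulatorpin", {"cpuset": cpuset})

    return ET.tostring(domain, encoding="unicode")


if __name__ == "__main__":
    print(build_tcg_domain_cpu_xml(2, "0-3"))
    print(build_tcg_domain_cpu_xml(2, "0-3", with_emulatorpin=True))
```

Note that no per-vcpu pinning is emitted: with TCG there are no separate vcpu threads to pin, which is what triggered the original error.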