[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH 2/2] HACK: qemu: aarch64: Use virtio-pci if user specifies PCI controller

On 03/09/2016 09:54 AM, Daniel P. Berrange wrote:
On Wed, Mar 09, 2016 at 01:40:36PM +0100, Andrea Bolognani wrote:
On Fri, 2016-03-04 at 17:05 -0500, Laine Stump wrote:
I'm not sure I fully understand all of the above, but I'll pitch
in with my own proposal regardless :)
First, we make sure that <controller type='pci' index='0' model='pcie-root'/> is always added automatically to the domain XML when using the
mach-virt machine type. Then, if
<controller type='pci' index='1' model='dmi-to-pci-bridge'/>
     <controller type='pci' index='2' model='pci-bridge'/>
is present as well we default to virtio-pci, otherwise we use
the current default of virtio-mmio. This should allow management
applications, based on knowledge about the guest OS, to easily
pick between the two address schemes.
Does this sound like a good idea?
... or a variation of that, anyway :-)
What I think: If there are *any* pci controllers *beyond* pcie-root, or
if there are any devices that already have a PCI address, then assign
PCI addresses, else use mmio.
This sounds reasonable.

However, I'm wondering if we shouldn't be even more explicit about
this... From libvirt's point of view we just need to agree on some
sort of "trigger" that causes it to allocate PCI addresses instead
of MMIO addresses, and for that purpose "any PCI controller or any
device with a PCI address" is perfectly fine; looking at that from
the point of view of an user, though? Not sure about it.

What about adding something like

     <preference name="defaultAddressType" value="pci"/>

to the domain XML? AFAICT libvirt doesn't have any element that
could be used for this purpose at the moment, but maybe having a
generic way to set domain-wide preferences could be useful in
other situations as well?

Looking at this mail and laine's before I really get the impression we
are over-thinking this all. The automatic address assignment by libvirt
was written such that it matched what QEMU would have done, so that we
could introduce the concept of device addressing without breaking
existing apps which didn't supply addresses. The automatic addressing
logic certainly wasn't ever intended to be able to implement arbitrary

As a general rule libvirt has always aimed to stay out of the policy
business, and focus on providing the mechanism only. So when we start
talking about adding  <preferences> to the XML this is a sure sign
that we've gone too far into trying to implement policy in libvirt.

>From the POV of being able to tell libvirt to assign PCI instead of
MMIO addresses for the virt machine, I think it suffices to have
libvirt accept a simple  <address type="pci"/> element (ie with no
bus/slot/func filled in).

That's just a more limited case of the same type of feature creep though - you're specifying a preference for what type of address you want, just in a different place. (The odd thing is that we're adding it only to remedy what is apparently a temporary problem, and even then it's a problem that wouldn't exist if we just said simply this: "If virtio-mmio is specified, then use virtio-mmio; If no address is specified, use pci". This puts the burden on people insisting on the old / soon-to-be-deprecated (?) virtio-mmio address type to add a little something to the XML (and a very simple little something at that!). For a short time, this could cause problems for people who create new domains using qemu binaries that don't have a pci bus on aarch64/virt yet, but that will be easily solvable (by adding "<address type='virtio-mmio'/>", which is nearly as simple as typing "<address type='pci'/>").

A downside of setting your preference with <address type="pci"/> vs. having the preference in a separate element is that you can no longer cause a "re-addressing" of all the devices by simply removing all the <address> elements (and it's really that that got me thinking about adding hints/preferences in the <target> element). I don't know how important that is; I suppose when the producer of the XML is some other software it's a non-issue, when the producer is a human using virsh edit it would make life much simpler once in awhile (and the suggestion in the previous paragraph would eliminate that problem anyway).

So is there a specific reason why we need to keep virtio-mmio as the default, and require explicitly saying "address type='pci'" in order to get a PCI address?

If applictions care about assigning devices to specify PCI buses,
ie to distinguish between PCI & PCIe, or to pick a different bus
when there is one bus per NUMA node, then really it is time for
the application to start specifying the full <address> element
with all details, and not try to invent ever more complex policies
inside libvirt which will never deal with all application use cases.

First, I agree with your vigilance against unnecessary feature creep. Both because keeping things simple makes it easier to maintain and use, and also because it's very difficult (or even impossible) to change something once it's gone in, in the case that you have second thoughts.

But, expanding beyond just the aarch64/virt mmio vs. pci problem (which I had already done in this thread, and which is what got us into this "feature creep" discussion in the first place :-)...

I remember when we were first adding support for Q35 that we said it should be as easy to create a domain with Q35 machinetype as it is for 440fx, i.e. you should be able to do it without manually specifying any PCI addresses (that's why we have the strange default bus hierarchy, with a dmi-to-pci-bridge and a pci-bridge, and all devices going onto the pci-bridge). But of course 440fx is very simplistic - all slots are standard PCI and all are hotpluggable. On Q35 there could be room for choices though, in particular PCI vs PCIe and hotpluggable vs. not. (and yeah, also what NUMA node the bus is on; I agree that one is "getting out there"). So since there are choices, libvirt has by definition begun implementing policy when it auto-assigns *any* PCI address (current policy is "require all auto-assigned device addresses to be hotpluggable standard PCI). And if libvirt doesn't provide an easy way to choose, it will have implemented a defacto mandatory policy (unless you force full specification of PCI addresses, which goes against the "make it easy" goal).

And the current policy is looking (for Q35 and aarch64/virt at least) less and less like the correct one - in the years since it was put in, 1) we've learned that qemu doesn't care, and will not ever care, that a PCI device has been plugged into a PCIe slot, 2) we now have support for several PCIe controllers we didn't previously have, 3) people have started complaining that their devices are in PCI slots rather than PCIe, 4) even though it was stated at the time I wrote the code to auto-add a pci-bridge to every Q35 domain that pci-bridge's failure to support hotplug was a temporary bug, I just learned yesterday that it apparently still doesn't work, and 5) some platforms (e.g. our favorite aarch64/virt) are emulating a hardware platform that has *never* had standard PCI slots, only PCIe.

Beyond that, there is no place that provides a simple encoding of which type of controller provides which type of slot, and what is allowed to plug into what. If you require the management application/human to manually specify all the PCI addresses as soon as they have a need for one of these basic characteristics, then not only has it become cumbersome to define a domain (because the management app has to maintain a data structure to keep track of which PCI addresses are in use and which aren't), but it means that the management application also needs to know all sorts of rules about which PCI controllers are actually pcie vs. pci, and which accept hotplug devices vs. which don't, as well as things like the min/max slot number for each controller, and which ones can plug into where, e.g. a pcie-root-port can only plug into pcie-root or pcie-expander-bus, and a pcie-switch-downstream-port can only plug into a pcie-switch-upstream-port, etc. Requiring a management app to get all of that right just so that they can pick between a hotpluggable and non-hotpluggable slot seems like an overly large burden (and prone to error).

In the end, if libvirt makes it simple for the management app to specify what kind of slot it wants, rather than requiring it to specify the exact slot, then libvirt isn't making any policy decisions, it's just making it easier for the management app to implement its own policy decisions, without requiring the management app to know all the details about which controller implements what kind of connection etc, and that does seem to fit within libvirt's purpose.

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]