[libvirt] How should libvirt apps enable virtio-pci for aarch64?

Laine Stump laine at laine.org
Mon Dec 7 17:26:50 UTC 2015


On 12/07/2015 10:33 AM, Cole Robinson wrote:
> On 12/07/2015 03:19 AM, Pavel Fedin wrote:
>>   Hello!
>>
>>> - The PCIe controller XML is:
>>>      <controller type='pci' index='0' model='pcie-root'/>
>>>      <controller type='pci' index='1' model='dmi-to-pci-bridge'/>
>>>      <controller type='pci' index='2' model='pci-bridge'/>
>>> I have no idea if that's always going to be the expected XML, maybe it's not
>>> wise to hardcode that in apps.
>>   Since we are discussing this, i have a question.
>>   Why do we construct exactly this thing? What is "dmi-to-pci-bridge" and why do we need it?

A dmi-to-pci-bridge plugs into a PCIe port (on real hardware at least,
it isn't allowed in a normal PCI slot) and provides standard PCI slots 
downstream. So it is a way to convert from PCIe to PCI. However, you 
can't hot-plug devices into these standard PCI slots, which is why a 
pci-bridge is plugged into one of the slots of the dmi-to-pci-bridge - 
to provide standard PCI slots that can accept hot-plugged devices (which 
is what most management apps expect to be available).
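
As a rough sketch of that chain (the <address> values are illustrative
assumptions, not something to hardcode, and the virtio interface at the
end is a made-up example device):

    <controller type='pci' index='0' model='pcie-root'/>
    <!-- sits in a slot of pcie-root (bus 0); its PCI slots
         cannot take hot-plugged devices -->
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e'
               function='0x0'/>
    </controller>
    <!-- sits in a slot of the dmi-to-pci-bridge (bus 1); its PCI
         slots do accept hot-plugged devices -->
    <controller type='pci' index='2' model='pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01'
               function='0x0'/>
    </controller>
    <!-- an endpoint device ends up on the pci-bridge (bus 2) -->
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01'
               function='0x0'/>
    </interface>

The bus= attribute of each <address> refers to the index= of the
controller the device is plugged into.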

There was considerable discussion about this when support for the Q35
chipset was added, and this bus structure was taken directly from the
advice I was given by qemu/pci people. (At the time I was concerned
about whether it should be allowed to plug standard PCI devices
into PCIe slots and vice versa, since that is physically not possible on
real hardware. We've since learned that qemu doesn't have much problem
with this in most cases, and I've loosened up the restrictions in
libvirt: auto-assign will match types, but you can force most endpoint
devices into any PCI or PCIe slot you like.)

For AARCH64, though... well, if you want to know why it's added for that 
machinetype, I guess you'd need to talk to the person who turned on 
addPCIeRoot for AARCH64 :-). I actually wondered about that recently 
when I was tinkering with auto-adding USB2 controllers when machinetype 
is Q35 (i.e. "why are we adding an Intel/x86-specific controller to 
AARCH64 machines?")

(BTW, the name "dmi-to-pci-bridge" was chosen specifically to *not* be
platform-specific. It happens that currently the only example of this
type of controller is i82801b11-bridge, as can be seen in the xml here:

    <model name='i82801b11-bridge'/>

but if some other pci controller in the future behaves in the same way,
it could also be classified as a dmi-to-pci-bridge (with a
correspondingly different <model name...).)

>> I guess this is something PC-specific,
>> may be this layout has been copied from some real PC model, but i don't see any practical sense in it.
> This is likely just a side effect of the libvirt code requesting PCIe for
> aarch64, but the original PCIe support was added for the x86 q35 layout.

Correct. The "addPCIeRoot" bool, created with only the Q35 bus
structure in mind, was overloaded to also create the other controllers,
and that detail was missed when support for pcie-root on aarch64 virt
machinetypes was added.

>
>>   Also, there are two problems with "pci-bridge":
>> 1. Live migration of this thing in qemu is broken. After migration the bridge screws up, lspci says "invalid header", and i don't
>> know whether it actually works because i never attach anything behind it, because of (2).
> I didn't even know aarch64 migration was working...

Is this only a problem on aarch64, or is there a migration problem with
pci-bridge on x86 as well? (It's possible there is one that hasn't been
noticed, because pci-bridge likely isn't used much outside of q35
machines, and q35 was prohibited from migrating until qemu 2.4 due to
the embedded SATA driver, which didn't support migration.)


>
>> 2. After pcie-root we have PCI-X, which supports MSI-X. And after pci-bridge we seem to have a plain PCI, which supports only plain
>> MSI. The problem here is that virtio seems to work only with MSI-X in any advanced mode (multiqueue, vhost, etc). If i place it
>> behind the bridge (and default libvirt's logic is to place the device there), MSI-X will not work.

libvirt's PCI address allocation logic can certainly be changed, as long
as it's done in a way that won't break auto-allocation for any older
machinetypes. For example, we could make virtio devices prefer to be
plugged into a PCIe port (probably a pcie-switch-downstream-port or
pcie-root-port).
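
For illustration only, this is roughly the layout that kind of
preference would produce. The index, bus and slot numbers are
assumptions picked for the example (index='3' assumes controllers 0-2
from the XML quoted at the top already exist), the disk and its path
are made up, and none of this is what auto-allocation does today:

    <controller type='pci' index='3' model='pcie-root-port'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
               function='0x0'/>
    </controller>
    <!-- the virtio disk goes into the root port's single slot (bus 3,
         slot 0) instead of a slot on the pci-bridge -->
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/path/to/guest.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00'
               function='0x0'/>
    </disk>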

BTW, does the aarch64 virt machinetype support any controller aside from 
the embedded pcie-root? Normally the ports on pcie-root don't support 
hotplug directly, but need a pcie-root-port (or a set of switch ports) 
plugged into them at boot time. The only examples of these types of 
controllers that libvirt knows about are based on Intel chips (ioh3420, 
x3130-whatever).
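
For reference, a sketch of how those Intel-based controllers chain
together in the XML. The indexes and addresses are assumptions for the
example, and whether any of these models work on the aarch64 virt
machinetype is exactly the open question:

    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='ioh3420'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
               function='0x0'/>
    </controller>
    <!-- the upstream port plugs into the root port (bus 3) -->
    <controller type='pci' index='4' model='pcie-switch-upstream-port'>
      <model name='x3130-upstream'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00'
               function='0x0'/>
    </controller>
    <!-- each downstream port plugs into the upstream port (bus 4)
         and provides one hotpluggable PCIe slot -->
    <controller type='pci' index='5' model='pcie-switch-downstream-port'>
      <model name='xio3130-downstream'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00'
               function='0x0'/>
    </controller>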


>>   The same applies to passthrough
>> VFIO devices. This is especially painful because on real-life ARM64 platforms builtin hardware seems to mandate MSI-X. For example
>> on ThunderX NIC driver simply does not support anything except MSI-X.
>>
> Maybe this is just a factor of libvirt specifying the wrong bits on the
> aarch64 command line. Do you have a working qemu commandline outside of libvirt?


Similar question - is this problem present on x86 as well?


>>> * Next idea: Users specify something like <address type='pci'/> and
>>> libvirt fills in the address for us.
>>   I like this one, and IMHO this would be nice to have regardless of the default. Manual assignment of PCI layout is a tedious
>> process, which is not always necessary. I think it is quite logical to allow the user just to say: "I want this device on the PCI
>> bus", and do the rest for him.
> Agreed, I'll look into it in addition to the user PCI controller bits.

Right now we will auto-add only a pci-bridge if no available slot is
found for a PCI device, but we will (or should, anyway) auto-assign a
slot on an *existing* PCIe controller if the device has PCIe as its
preferred slot type. It would be really cool if the "current
machinetype" had a "preferred controller type" for PCI devices with no
manually assigned address, and would auto-add the appropriate
controller according to that (with parents auto-added as necessary).
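
To make the idea concrete: with the syntax floated above, a device would
carry only a bare address element, and libvirt would pick the
controller, bus and slot (adding controllers if needed). This is a
sketch of the proposal under discussion, not something libvirt is
confirmed to accept as of this writing, and the interface is just an
example device:

    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <!-- no domain/bus/slot/function: left for libvirt to fill in -->
      <address type='pci'/>
    </interface>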




