[vfio-users] VFIO-PCI with AARCH64 QEMU

Haynal, Steve Steve_Haynal at mentor.com
Thu Oct 27 21:32:37 UTC 2016


Hi Laszlo,

Thanks for the detailed response. I'm learning quite a bit. Patching the kernel with an unaccepted patch is a little more bleeding edge than I can be for this project. Instead, I hacked the driver so that it no longer depends on resource0 under /sys/bus/pci/devices/0000:00:09.0 for mmap. I am now able to encode YUV video to compressed VP9 on the FPGA from an emulated aarch64 guest! The performance hit for emulated aarch64 wasn't as bad as I thought it would be. Depending on how I measure time (portions of the test are identical, since they run on the same FPGA whether driven from emulated aarch64 or KVM x86), the penalty for emulated aarch64 vs. KVM x86 is between 1.6X and 2.3X.
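For the archives, the workaround boils down to mmap()ing the BAR through the UIO device node instead of through resource0, along these lines (a minimal sketch; the /dev/uio0 name, the map index, and the 8 MB size are illustrative and assume the hacked driver registers the BARs as UIO memory maps -- the real values are listed under /sys/class/uio/uio0/maps/):

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(void)
  {
      int fd = open("/dev/uio0", O_RDWR);
      if (fd < 0) {
          perror("open /dev/uio0");
          return 1;
      }

      /* With UIO, mapping N is selected by passing N * page_size as the
       * mmap() offset, not a byte offset into the BAR itself.
       */
      long ps = sysconf(_SC_PAGESIZE);
      size_t len = 8 * 1024 * 1024;   /* illustrative: the 8 MB BAR 1 */
      void *bar = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                       fd, 1 * ps);   /* map index 1 */
      if (bar == MAP_FAILED) {
          perror("mmap");
          return 1;
      }

      /* ... talk to the FPGA registers through bar ... */

      munmap(bar, len);
      close(fd);
      return 0;
  }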

Best Regards,

Steve Haynal
  

-----Original Message-----
From: Laszlo Ersek [mailto:lersek at redhat.com] 
Sent: Thursday, October 27, 2016 12:28 AM
To: Haynal, Steve; Ard Biesheuvel
Cc: Alex Williamson; vfio-users at redhat.com; Eric Auger
Subject: Re: [vfio-users] VFIO-PCI with AARCH64 QEMU

On 10/27/16 02:24, Haynal, Steve wrote:
> Hi All,
> 
> I was able to enable both memory regions, but my test program did not
> work on aarch64 as it does on x86. The driver is a UIO driver, and it
> fails when it can't find resource0 in
> /sys/bus/pci/devices/0000:00:09.0. In the x86 guest, I see resource0
> and resource1 in that directory. In the aarch64 guest, there is no
> resourceN. Is this related?
> http://stackoverflow.com/questions/38921463/linux-kernel-4-7-arch-arm64-does-not-create-resource0-file-in-sys-bus-pci-d

It seems related, yes. This is the (partial) call stack that creates the resource%d files:

  pci_create_resource_files() [drivers/pci/pci-sysfs.c]
    pci_create_attr()         [drivers/pci/pci-sysfs.c]

However, if the platform doesn't define HAVE_PCI_MMAP, then
pci_create_resource_files() does nothing.
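The guard in "drivers/pci/pci-sysfs.c" looks roughly like this (abridged from memory of the 4.x sources, so treat the details as approximate):

  #ifdef HAVE_PCI_MMAP
  /* pci_mmap_resource() and pci_create_attr() live here;
   * pci_create_resource_files() creates one resource%d file per
   * populated BAR.
   */
  ...
  #else /* !HAVE_PCI_MMAP */
  int __weak pci_create_resource_files(struct pci_dev *dev) { return 0; }
  void __weak pci_remove_resource_files(struct pci_dev *dev) { return; }
  #endif /* HAVE_PCI_MMAP */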

See also in "Documentation/filesystems/sysfs-pci.txt" (rewrapped here):

> Supporting PCI access on new platforms
> --------------------------------------
>
> In order to support PCI resource mapping as described above, Linux 
> platform code must define HAVE_PCI_MMAP and provide a 
> pci_mmap_page_range function. Platforms are free to only support 
> subsets of the mmap functionality, but useful return codes should be 
> provided.

While "arch/arm/include/asm/pci.h" defines HAVE_PCI_MMAP, and declares the pci_mmap_page_range() function, "arch/arm64/include/asm/pci.h" does neither.

The patch linked in the stackoverflow question aimed to add this functionality (it seems), but apparently it hasn't been accepted.

You can find the discussion here:

[PATCH v2] arm64: pci: add support for pci_mmap_page_range
http://www.spinics.net/lists/arm-kernel/msg496915.html

The problem seems to be that defining HAVE_PCI_MMAP exposes two sets of pseudo-files, one set under sysfs, and another set under /proc/bus/pci/.
The latter is considered legacy / deprecated / ugly, and should be avoided (apparently), but for that, the generic PCI code will have to be
refactored:

http://www.spinics.net/lists/arm-kernel/msg498024.html

- From: Arnd Bergmann <arnd at xxxxxxxx>
- Date: Mon, 18 Apr 2016 17:00:49 +0200

> The problem is that once we allow mmap() on proc/bus/pci/*/*, it 
> becomes much harder to prove that we are able to remove it again 
> without breaking stuff that worked.
>
> We have to decouple the sysfs interface from the procfs interface 
> before we allow the former.

On 10/27/16 02:24, Haynal, Steve wrote:
> Any ideas on how I can have resource0 and resource1 populated? 

At the moment, the kernel feature looks incomplete on arm64.

> The memory regions still show up as disabled and I must enable them with setpci.

I think that's because neither the firmware nor the kernel has a driver for this device. In UEFI at least, it is the responsibility of the UEFI_DRIVER that binds the device to toggle mem/io decoding in the device's command register. No driver -- no decoding enabled. I assume the same holds for the probe functions of kernel device drivers.
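(For completeness: enabling the regions with setpci is a read-modify-write of the 16-bit command register at config space offset 0x04. A minimal userspace sketch of the same thing, assuming a little-endian host and the sysfs path from your log; it needs root, just like setpci:)

  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
      /* device path taken from the log quoted below */
      const char *cfg = "/sys/bus/pci/devices/0000:00:09.0/config";
      int fd = open(cfg, O_RDWR);
      if (fd < 0) {
          perror("open");
          return 1;
      }

      uint16_t cmd;            /* PCI config space is little-endian */
      if (pread(fd, &cmd, sizeof cmd, 0x04) != sizeof cmd) {
          perror("pread");
          return 1;
      }
      cmd |= 0x0002;           /* bit 1: Memory Space Enable */
      if (pwrite(fd, &cmd, sizeof cmd, 0x04) != sizeof cmd) {
          perror("pwrite");
          return 1;
      }
      close(fd);
      return 0;
  }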

> The pci-related portion of the kernel log is below as well as lspci 
> output. I updated to kernel 4.8.4-040804. I see the same behavior with 
> 4.4.0 or 4.8.4.
> 
> [    5.473692] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> [    5.473848] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
> [    5.475878] OF: PCI: host bridge /pcie at 10000000 ranges:
> [    5.476320] OF: PCI:    IO 0x3eff0000..0x3effffff -> 0x00000000
> [    5.476616] OF: PCI:   MEM 0x10000000..0x3efeffff -> 0x10000000
> [    5.476678] OF: PCI:   MEM 0x8000000000..0xffffffffff -> 0x8000000000
> [    5.477429] pci-host-generic 3f000000.pcie: ECAM at [mem 0x3f000000-0x3fffffff] for [bus 00-0f]
> [    5.479081] pci-host-generic 3f000000.pcie: PCI host bridge to bus 0000:00
> [    5.479354] pci_bus 0000:00: root bus resource [bus 00-0f]
> [    5.479460] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> [    5.479496] pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff]
> [    5.479524] pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff]
> [    5.480416] pci 0000:00:00.0: [1b36:0008] type 00 class 0x060000
> [    5.483773] pci 0000:00:09.0: [10ee:7022] type 00 class 0x058000
> [    5.484100] pci 0000:00:09.0: reg 0x10: [mem 0x10800000-0x10800fff]
> [    5.484163] pci 0000:00:09.0: reg 0x14: [mem 0x10000000-0x107fffff]
> [    5.488027] pci 0000:00:09.0: BAR 1: assigned [mem 0x10000000-0x107fffff]
> [    5.488274] pci 0000:00:09.0: BAR 0: assigned [mem 0x10800000-0x10800fff]

Looks good to me.

> 
> 
> lspci -vvv
> 00:00.0 Host bridge: Red Hat, Inc. Device 0008
> 	Subsystem: Red Hat, Inc Device 1100
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> 	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 
> 00:09.0 Memory controller: Xilinx Corporation Device 7022
> 	Subsystem: Xilinx Corporation Device 0007
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Interrupt: pin A routed to IRQ 47
> 	Region 0: Memory at 10800000 (32-bit, non-prefetchable) [disabled] [size=4K]
> 	Region 1: Memory at 10000000 (32-bit, non-prefetchable) [disabled] [size=8M]
> 	Capabilities: [80] Power Management version 3
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit+
> 		Address: 0000000000000000  Data: 0000
> 	Capabilities: [c0] Express (v2) Root Complex Integrated Endpoint, MSI 00
> 		DevCap:	MaxPayload 512 bytes, PhantFunc 0
> 			ExtTag- RBE+
> 		DevCtl:	Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> 			MaxPayload 256 bytes, MaxReadReq 512 bytes
> 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> 		DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR-, OBFF Not Supported
> 		DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
> 	Capabilities: [100 v2] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UESvrt:	DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> 		CEMsk:	RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
> 		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-

If there were an active matching kernel driver / module, I think lspci would list it here (as a "Kernel driver in use:" line).

Thanks
Laszlo



