[vfio-users] vfio-pci with amd gpu in kernel 4.14.10+

P. Pronk vfio at pronk.nl
Fri Jan 12 09:45:57 UTC 2018


Thanks for the tips. Swapping the cards is not an option: I use a uATX
board and need all the other PCIe slots, which the RX480 would otherwise
block. Adding the vfio-pci module to the initramfs doesn't make any
difference, unfortunately.
You can't choose which GPU is the primary on an Asus board. It just uses
the one in the first PCIe slot and you have no other choice; that's why I
said I made a mistake buying Asus ;)

But everything works as intended in v4.14.9; it 'just' doesn't anymore
since v4.14.10.
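
A quick way to see which driver actually claimed each function after
boot is to follow the `driver` symlink in sysfs. A minimal Python sketch,
assuming the standard /sys/bus/pci/devices/<address>/driver layout; the
helper name `bound_driver` and its `sysfs_root` parameter are made up for
illustration, and the addresses are the ones from the lspci output quoted
below:

```python
import os


def bound_driver(address, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    `address` is a full PCI address such as "0000:01:00.0".
    `sysfs_root` is a hypothetical parameter so the lookup can be pointed
    at a different tree (for example a test directory).
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", address, "driver")
    if not os.path.islink(link):
        return None  # device missing, or no driver bound to it
    # The symlink points at .../bus/pci/drivers/<name>; keep only <name>.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # RX480 VGA and audio functions, as listed by lspci in this thread.
    for addr in ("0000:01:00.0", "0000:01:00.1"):
        print(addr, bound_driver(addr))
```

Running it under 4.14.9 and again under 4.14.10+ should show directly
which function ends up on amdgpu instead of vfio-pci.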

@Alex, do you maybe have any suggestions? Unfortunately I am having some
difficulty understanding the changelog of kernel v4.14.10 and its
possible impact on my problem.

Cheers, Pim

On 11/01/18 21:43, Peter Maloney wrote:
> Ok, well they have unique ids, so you can use the cmdline like you
> were. (And my script runs in initramfs, and looks at the kernel
> cmdline, and doesn't care what motherboard you have.) And I don't know
> if it would really stop you because one is primary... I would think it
> would just go blank or stop updating the screen instead (and likely
> not work in a guest). I don't have much experience with trying to send
> through the primary gpu... I think that's a bad idea...why not just
> swap the cards?
>
> And I have a gigabyte board with a slight problem...it puts the
> bios/efi on the 3rd gpu and that gpu will hang the host if I put an r7
> in there, but works with an old HD. How do you set which one is primary?
>
> And regarding amdgpu loading before vfio... you could make sure
> your initramfs has vfio-pci in it, and that shouldn't happen. Or you
> can test by blacklisting amdgpu (and rebuilding initramfs) just to see
> if vfio-pci binds then.
>
> On 01/11/18 11:47, P. Pronk wrote:
>>
>>
>> Unfortunately I have to use the kernel cmdline syntax as I made the
>> mistake of buying an Asus motherboard. The RX480 is my primary/boot
>> GPU and you can't change that in the Asus BIOS like you can with e.g.
>> Gigabyte.
>>
>> Thinking about it, this is probably also the reason why vfio doesn't
>> bind anymore. Probably something changed in the kernel because of
>> which the primary/boot GPU can't be 'unloaded' anymore?
>>
>> Comparing the dmesg outputs, it looks like the amdgpu driver is loaded
>> before vfio in 4.14.13: dmesg still shows vfio adding the 67df
>> device, but the amdgpu driver is already loaded by then. See the
>> excerpts of dmesg in 4.14.13 below:
>>
>> [    0.208096] pci 0000:02:00.0: vgaarb: VGA device added:
>> decodes=io+mem,owns=none,locks=none
>> [    0.208096] pci 0000:01:00.0: vgaarb: setting as boot VGA device
>> [    0.208096] pci 0000:01:00.0: vgaarb: VGA device added:
>> decodes=io+mem,owns=io+mem,locks=none
>> [    0.208122] pci 0000:01:00.0: vgaarb: bridge control possible
>> [    0.208152] pci 0000:02:00.0: vgaarb: bridge control possible
>> [    0.208181] vgaarb: loaded
>> ...
>> [    0.276067] pci 0000:01:00.0: Video device with shadowed ROM at
>> [mem 0x000c0000-0x000dffff]
>> ...
>> [    4.584958] amdgpu 0000:01:00.0: Invalid PCI ROM header signature:
>> expecting 0xaa55, got 0xffff
>> [    4.585028] ATOM BIOS: 113-V34111-F1
>> [    4.585050] [drm] GPU post is not needed
>> [    4.585348] [drm] vm size is 64 GB, block size is 13-bit, fragment
>> size is 4-bit
>> [    4.585416] amdgpu 0000:01:00.0: VRAM: 8192M 0x000000F400000000 -
>> 0x000000F5FFFFFFFF (8192M used)
>> [    4.585461] amdgpu 0000:01:00.0: GTT: 256M 0x0000000000000000 -
>> 0x000000000FFFFFFF
>> [    4.585501] [drm] Detected VRAM RAM=8192M, BAR=256M
>> [    4.585525] [drm] RAM width 256bits GDDR5
>> [    4.585549] [drm] amdgpu: 8192M of VRAM memory ready
>> [    4.585572] [drm] amdgpu: 8192M of GTT memory ready.
>> [    4.585603] [drm] GART: num cpu pages 65536, num gpu pages 65536
>> [    4.585673] [drm] PCIE GART of 256M enabled (table at
>> 0x000000F400040000).
>> [    4.585719] [drm] Supports vblank timestamp caching Rev 2 (21.10.2
>> [    4.585799] amdgpu 0000:01:00.0: amdgpu: using MSI.
>> [    4.585832] [drm] amdgpu: irq initialized.
>> [    4.595990] usb 3-10: new full-speed USB device number 10 using
>> xhci_hcd
>> [    4.697313] amdgpu: [powerplay] amdgpu: powerplay sw initialized
>> [    4.697560] [drm] AMDGPU Display Connectors
>> [    4.697581] [drm] Connector 0:
>> ...
>> [    6.013088] amdgpu 0000:01:00.0: fb1: amdgpudrmfb frame buffer device
>> [    6.013235] amdgpu 0000:01:00.0: kfd not supported on this ASIC
>> [    6.013255] [drm] Initialized amdgpu 3.19.0 20150101 for
>> 0000:01:00.0 on minor 1
>> [    6.024444] scsi 0:0:0:0: Direct-Access     Generic  Ultra
>> HS-SD/MMC  1.82 PQ: 0 ANSI: 0
>> [    6.024627] sd 0:0:0:0: Attached scsi generic sg0 type 0
>> [    6.049487] sd 0:0:0:0: [sda] Attached SCSI removable disk
>> [    6.061022] VFIO - User Level meta-driver version: 0.3
>> [    6.064451] vfio_pci: add [1002:67df[ffff:ffff]] class
>> 0x000000/00000000
>> [    6.088016] vfio_pci: add [1002:aaf0[ffff:ffff]] class
>> 0x000000/00000000
>>
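
The ordering in the excerpt above can be checked mechanically by pulling
out the first timestamp at which each driver shows up. A rough Python
sketch, not a definitive tool: the function name `first_seen` and the
shortened `SAMPLE` text (a few lines from the excerpt, unwrapped) are
illustrative assumptions.

```python
import re

# Matches "[    4.584958] message" and captures the timestamp and message.
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def first_seen(dmesg_text, needles):
    """Map each needle to the timestamp of the first dmesg line containing it."""
    seen = {}
    for raw in dmesg_text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue  # skip lines without a leading timestamp
        stamp, message = float(m.group(1)), m.group(2)
        for needle in needles:
            if needle in message and needle not in seen:
                seen[needle] = stamp
    return seen


# A few unwrapped lines taken from the 4.14.13 excerpt in this thread.
SAMPLE = """\
[    4.584958] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[    6.013255] [drm] Initialized amdgpu 3.19.0 20150101 for 0000:01:00.0 on minor 1
[    6.061022] VFIO - User Level meta-driver version: 0.3
[    6.064451] vfio_pci: add [1002:67df[ffff:ffff]] class 0x000000/00000000
"""

if __name__ == "__main__":
    times = first_seen(SAMPLE, ["amdgpu 0000:01:00.0", "vfio_pci: add"])
    for needle, stamp in sorted(times.items(), key=lambda item: item[1]):
        print(f"{stamp:10.6f}  {needle}")
```

For the sample lines, amdgpu touches 0000:01:00.0 roughly 1.5 seconds
before vfio_pci registers the 67df ID, which matches the reading above.
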
>> As you requested, the output of lspci (in 4.14.13). The 67ef is an
>> RX460 which I use for the host; the RX480 is for the guest.
>>
>> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
>> Inc. [AMD/ATI] Device [1002:67df] (rev c7)
>>         Subsystem: Micro-Star International Co., Ltd. [MSI] Device
>> [1462:3413]
>>         Kernel driver in use: amdgpu
>>         Kernel modules: amdgpu
>> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI]
>> Device [1002:aaf0]
>>         Subsystem: Micro-Star International Co., Ltd. [MSI] Device
>> [1462:aaf0]
>>         Kernel driver in use: vfio-pci
>>         Kernel modules: snd_hda_intel
>> 02:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
>> Inc. [AMD/ATI] Device [1002:67ef] (rev cf)
>>         Subsystem: XFX Pine Group Inc. Device [1682:9460]
>>         Kernel driver in use: amdgpu
>>         Kernel modules: amdgpu
>> 02:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI]
>> Device [1002:aae0]
>>         Subsystem: XFX Pine Group Inc. Device [1682:aae0]
>>         Kernel driver in use: snd_hda_intel
>>         Kernel modules: snd_hda_intel
>>
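
To compare kernels without eyeballing the listing, the `lspci -knn`
output can be reduced to the devices that were meant for vfio-pci but
ended up on another driver. A minimal Python sketch under assumptions: it
expects unwrapped `lspci -knn` text (the listing above is line-wrapped by
the mail client), and `drivers_in_use`, `not_bound_to` and `SAMPLE` are
hypothetical names.

```python
import re

# Device lines start at column 0 with the slot, and end with [vendor:device].
DEVICE = re.compile(r"^(\S+) .*\[([0-9a-f]{4}:[0-9a-f]{4})\]")


def drivers_in_use(lspci_text):
    """Parse `lspci -knn` text into {slot: {"id": ..., "driver": ...}}."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        m = DEVICE.match(line)
        if m:
            current = m.group(1)
            result[current] = {"id": m.group(2), "driver": None}
        elif current and "Kernel driver in use:" in line:
            result[current]["driver"] = line.split(":", 1)[1].strip()
    return result


def not_bound_to(lspci_text, wanted_ids, driver="vfio-pci"):
    """List slots whose vendor:device ID is wanted but whose driver differs."""
    return [
        slot
        for slot, info in drivers_in_use(lspci_text).items()
        if info["id"] in wanted_ids and info["driver"] != driver
    ]


# The RX480 entries from this thread, with the wrapped lines joined.
SAMPLE = """\
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:67df] (rev c7)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3413]
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:aaf0]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:aaf0]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
"""

if __name__ == "__main__":
    print(not_bound_to(SAMPLE, {"1002:67df", "1002:aaf0"}))
```

With the sample text, only the VGA function at 01:00.0 is reported, since
the audio function at 01:00.1 is already on vfio-pci.
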
>>
>>
>> On 11/01/18 10:57, Peter Maloney wrote:
>>> Let's see an `lspci -knn` for each of those devices.
>>>
>>> I'm using 4.14.x and have 3 AMD gpus, one for the host, and 2 for
>>> VMs, and it works.
>>>
>>> But I don't use the pci-stub.ids or the vfio-pci.ids kernel cmdline
>>> syntax... I bind by PCI address instead of vendor:device since
>>> they're non-unique (using my script here
>>> https://github.com/petermaloney/misc/blob/master/mkinitcpio-vfio-pci/hooks/vfio-pci).
>>> PCI address can change on firmware updates or moving cards around,
>>> but stays the same otherwise in my experience.
>>>
>>> On 01/11/18 10:24, P. Pronk wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> Is anyone successfully using kernel version 4.14.10 or higher
>>>> with an AMD graphics card? It seems my RX480 VGA controller (67df)
>>>> won't use the vfio-pci driver in 4.14.10+ anymore, even though the
>>>> RX480 audio device (aaf0) will. I have both
>>>> 'pci-stub.ids=1002:67df,1002:aaf0' listed in my grub cmdline and
>>>> 'options vfio-pci ids=1002:67df,1002:aaf0' in modprobe.d/vfio-pci.conf.
>>>>
>>>> Did something change in 4.14.10? Checking the kernel changelogs
>>>> doesn't show anything (immediately) related to either vfio-pci or
>>>> the amdgpu kernel driver.
>>>>
>>>> To be clear, in kernel version 4.14.9 everything still works as
>>>> expected for me.
>>>>
>>>> Thanks, Pim
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> vfio-users mailing list
>>>> vfio-users at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/vfio-users
>>>
>>>
>>
>
