[vfio-users] Can't read device config space with pread

Alex Williamson alex.williamson at redhat.com
Thu Feb 23 17:52:15 UTC 2017


On Thu, 23 Feb 2017 13:15:54 +0000
Ingrid Ribeiro Galvez <inrigalvez at gmail.com> wrote:

> Hi guys,
> 
> I've been working with qemu kvm for a while and now I need to pass through
> PCI devices. I did all the required steps to make this work: enabled the
> iommu, modprobed the vfio module, bound the device to vfio and checked that
> the vfio group was indeed created, etc. But when I start qemu with any pci
> device I get the error message:
> 
> *vfio: Failed to read device config space*

This comes from here:

    /* Get a copy of config space */
    ret = pread(vdev->vbasedev.fd, vdev->pdev.config,
                MIN(pci_config_size(&vdev->pdev), vdev->config_size),
                vdev->config_offset);
    if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) {
        ret = ret < 0 ? -errno : -EFAULT;
        error_setg_errno(errp, -ret, "failed to read device config space");
        goto error;
    }

So we either got fewer bytes than expected or pread() failed with an
errno.  What does the device look like on the host (lspci -vvv)?  Can
you read the full config space for the device from sysfs
(xxd /sys/bus/pci/devices/0000:01:00.0/config)?
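
If xxd isn't available on the embedded image, the same check can be
done with a few lines of C.  This is only a sketch: read_config_file()
is a hypothetical helper name, and the path is whatever device you are
testing.  A return value of 256 would indicate conventional PCI config
space, 4096 the extended PCIe space.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes of a sysfs config file
 * into buf, looping over short reads.  Returns the total number of
 * bytes read, or -1 on error with errno set. */
ssize_t read_config_file(const char *path, unsigned char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t total = 0;
    while (total < len) {
        ssize_t n = read(fd, buf + total, len - total);
        if (n < 0) {
            int saved = errno;   /* keep errno across close() */
            close(fd);
            errno = saved;
            return -1;
        }
        if (n == 0)              /* end of file */
            break;
        total += (size_t)n;
    }
    close(fd);
    return (ssize_t)total;
}
```

Calling it with "/sys/bus/pci/devices/0000:01:00.0/config" and a 4096
byte buffer, then printing the return value, tells you how much config
space the host kernel exposes for the device.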
 
> By looking into qemu code I found out that the error was coming from a call
> to pread to read the pci device's file descriptor. It fails with errno
> '*Illegal seek*'. The offset being used is 0x70000000000, and this offset
> seems to be the same for all devices and also on different machines. I also
> wrote some code
> to test reading the pci device file descriptor from outside of the qemu
> code and the pread also fails with 'illegal seek' error. This was done on a
> generic linux kernel v4.7.8 compiled with uClibc for an embedded system.

The offset for each standard region of the device is fixed, PCI config
space is always exposed at the same offset.
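
For illustration, the 0x70000000000 value is consistent with vfio-pci
encoding the region index in the upper bits of the file offset.  The
sketch below assumes a shift of 40 bits and a config region index of 7;
both are driver implementation details, the names are made up for the
example, and real code should always use the offset returned by
VFIO_DEVICE_GET_REGION_INFO.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors vfio-pci's internal index-to-offset encoding.
 * These names are illustrative, not part of the VFIO uAPI. */
#define EXAMPLE_OFFSET_SHIFT        40
#define EXAMPLE_CONFIG_REGION_INDEX 7   /* assumed value of VFIO_PCI_CONFIG_REGION_INDEX */

/* File offset at which a region with the given index would start. */
uint64_t example_region_offset(unsigned int index)
{
    assert(index < (1u << 24));         /* keep the shift within 64 bits */
    return (uint64_t)index << EXAMPLE_OFFSET_SHIFT;
}

/* Inverse mapping: which region a file offset falls into. */
unsigned int example_offset_to_index(uint64_t offset)
{
    return (unsigned int)(offset >> EXAMPLE_OFFSET_SHIFT);
}
```

Note that the resulting offset does not fit in 32 bits, so whatever
type carries it into pread() has to be 64 bits wide.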
 
> If I install ubuntu 16.04 (kernel v4.4.0) on the same machine and repeat
> the steps, pci passthrough works fine and the pread on my test code also
> works perfectly.
> 
> This is the code I am using to test reading the device fd with pread:
> 
> 
> #include <unistd.h>
> #include <stdio.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <linux/vfio.h>
> #include <sys/ioctl.h>
> #include <sys/mman.h>
> 
> #define BUF_SIZE 4096

This presumes the device has a full PCIe config space.  Is the sysfs
config file above 4K in size?
 
> int main(){
>     char buf[BUF_SIZE], buf1[BUF_SIZE], buf2[BUF_SIZE];
> 
>     int ret,group_fd, fd, fd2;
>     size_t nbytes = BUF_SIZE;
>     ssize_t bytes_read;
>     int iommu1, iommu2;
>     unsigned long offset;
>     int container, group, device, i;
>     struct vfio_group_status group_status = { .argsz = sizeof(group_status) };
>     struct vfio_iommu_type1_info iommu_info = { .argsz = sizeof(iommu_info) };
>     struct vfio_iommu_type1_dma_map dma_map = { .argsz = sizeof(dma_map) };
>     struct vfio_device_info device_info = { .argsz = sizeof(device_info) };
>     struct vfio_region_info reg = { .argsz = sizeof(reg) };
> 
>     container = open("/dev/vfio/vfio",O_RDWR);
>     printf("Container = %d\n",container);
>     if(ioctl(container,VFIO_GET_API_VERSION)!=VFIO_API_VERSION){
>         printf("Unknown api version: %m\n");
>     }
>     group_fd = open("/dev/vfio/1",O_RDWR);
>     printf("Group fd = %d\n", group_fd);
>     ioctl(group_fd, VFIO_GROUP_GET_STATUS, &group_status);
>     if (!(group_status.flags & VFIO_GROUP_FLAGS_VIABLE)){
>         printf("Group not viable\n");
>         getchar();
>         return 1;
>     }
>     ret = ioctl(group_fd, VFIO_GROUP_SET_CONTAINER,&container);
>     ret = ioctl(container,VFIO_SET_IOMMU,VFIO_TYPE1_IOMMU);
> 
>     ioctl(container, VFIO_IOMMU_GET_INFO, &iommu_info);
> 
>     /* Allocate some space and setup a DMA mapping */
>     dma_map.vaddr = (unsigned long int) mmap(0, 1024 * 1024,
>                         PROT_READ | PROT_WRITE,
>                         MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
>     dma_map.size = 1024 * 1024;
>     dma_map.iova = 0; /* 1MB starting at 0x0 from device view */
>     dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
> 
>     ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map);
> 
>     printf("\n\nGETTING DEVICE FD\n");
>     fd = ioctl(group_fd,VFIO_GROUP_GET_DEVICE_FD,"0000:01:00.0");
> 
> 
>     ioctl(fd, VFIO_DEVICE_GET_INFO, &device_info);
>     for (i = 0; i < device_info.num_regions; i++) {
>         reg.index = i;
> 
>         ioctl(fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
> 
>         /* Setup mappings... read/write offsets, mmaps
>         * For PCI devices, config space is a region */
>     }
> 
>     for (i = 0; i < device_info.num_irqs; i++) {
>         struct vfio_irq_info irq = { .argsz = sizeof(irq) };
> 
>         irq.index = i;
> 
>         ioctl(fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
> 
>     }
> 
> 
>     reg.index = VFIO_PCI_CONFIG_REGION_INDEX;
> 
>     printf("VFIO_DEVICE_GET_REGION_INFO = %lu\n",(unsigned long)VFIO_DEVICE_GET_REGION_INFO);
>     ret = ioctl(fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
> 
>     offset = reg.offset;
>     printf("offset is %lx\n",offset);
>     /*ret = read(group_fd,buf,nbytes);
>     printf("Read from group fd, ret is %d: %m\n",ret);
>     printf("CONFIG SPACE: \n");
>     printf("%s\n",buf);*/
>     printf("Fd = %d\n",fd);
> 
>     //printf("VFIO_GROUP_GET_DEV_ID = %lu\n",VFIO_GROUP_GET_DEVICE_FD);
>     ret = read(fd,buf,nbytes);

This reads from offset 0, which is BAR0, which is possibly not enabled
since you haven't enabled I/O or MMIO access to the device in the PCI
COMMAND register in config space.  Results here are going to depend on
the state of the device as you receive it, and whether you can even
read 4K from BAR0 space.
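
To see what state the device is in, the COMMAND register can be
decoded from a copy of config space (for example the bytes read from
the sysfs config file).  A minimal sketch follows; the EX_ names are
invented for the example, while the register offset (0x04) and bit
positions follow the PCI spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PCI_COMMAND_OFFSET 0x04    /* 16-bit COMMAND register */
#define EX_PCI_CMD_IO         0x0001  /* I/O space decode enabled */
#define EX_PCI_CMD_MEMORY     0x0002  /* memory space decode enabled */
#define EX_PCI_CMD_MASTER     0x0004  /* bus mastering enabled */

/* Extract the little-endian COMMAND register from a config space copy. */
uint16_t ex_pci_command(const unsigned char *cfg, size_t len)
{
    assert(cfg != NULL && len >= EX_PCI_COMMAND_OFFSET + 2);
    return (uint16_t)(cfg[EX_PCI_COMMAND_OFFSET] |
                      (cfg[EX_PCI_COMMAND_OFFSET + 1] << 8));
}

/* True if the device currently decodes memory BAR accesses. */
bool ex_pci_memory_enabled(const unsigned char *cfg, size_t len)
{
    return (ex_pci_command(cfg, len) & EX_PCI_CMD_MEMORY) != 0;
}

/* True if the device currently decodes I/O port BAR accesses. */
bool ex_pci_io_enabled(const unsigned char *cfg, size_t len)
{
    return (ex_pci_command(cfg, len) & EX_PCI_CMD_IO) != 0;
}
```

If neither bit is set, reads from a BAR region aren't expected to
return anything useful.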

>     printf("Ret from read is = %d, buf = %s\n",ret,buf);
>     if(ret<1){
>         printf("ERROR: %m \n");
>     }
> 
>     ret = pread(fd,buf,nbytes,offset);

This one should actually read from config space.
 
>     printf("Ret from pread is = %d\n",ret);
>     if(ret<1){
>         printf("ERROR: %m \n");
>     }

So this is where you get an ESPIPE error?  Do different sizes work?
256 bytes?  64 bytes?
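
Something along these lines would answer the size question in one run.
It is a sketch only: probe_pread_sizes() is a hypothetical helper, and
it records -errno per size so the results survive the printf calls.

```c
#define _XOPEN_SOURCE 700   /* for pread() under strict compiler modes */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try pread() at one offset with several sizes.
 * results[i] receives the byte count, or -errno on failure.
 * Returns how many of the requested sizes were read in full. */
int probe_pread_sizes(int fd, off_t offset, const size_t *sizes,
                      long *results, size_t nsizes)
{
    static unsigned char buf[4096];
    int full = 0;

    assert(sizes != NULL && results != NULL);

    for (size_t i = 0; i < nsizes; i++) {
        /* Clamp the request to the scratch buffer size. */
        size_t want = sizes[i] <= sizeof(buf) ? sizes[i] : sizeof(buf);
        ssize_t got = pread(fd, buf, want, offset);

        if (got < 0) {
            results[i] = -(long)errno;
            printf("pread %zu bytes: failed: %s\n",
                   want, strerror((int)-results[i]));
        } else {
            results[i] = (long)got;
            printf("pread %zu bytes: got %ld\n", want, results[i]);
            if ((size_t)got == want)
                full++;
        }
    }
    return full;
}
```

In your test program this would be called with the device fd, the
config region offset from reg.offset, and sizes of 64, 256 and 4096.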

>     printf("TESTING PREAD ON A COMMON FILE\n");
>     fd2 = open("/sys/bus/pci/devices/0000:01:00.0/device",O_RDONLY);
>     printf("FD2 = %d\n",fd2);
>     ret = read(fd2,buf1,nbytes);
>     if(ret<0){
>         printf("ERROR: %m\n");
>     }
>     printf("Result from read: ret = %d, content = %s\n",ret,buf1);
>     ret = pread(fd2,buf2,nbytes,2);
>     if(ret<0){
>         printf("ERROR: %m\n");
>     }
>     printf("Result from pread: ret = %d, content = %s\n",ret,buf2);

Did these work?

>     close(fd2);
>     getchar();
>     close(fd);
>     close(container);
>     close(group_fd);
>     return 0;
> }
> 
> 
> Something weird I noticed that might be related to this is that on ubuntu
> the iommu groups for some devices are very different from the manually
> compiled kernel. There are a few devices that on ubuntu have a large
> iommu_group while in the generic kernel the iommu group is composed of only
> one device (and this is on the same machine, btw!). Is this normal?
> Another thing I tried was using 0 as the offset to pread, and this gives me
> the same error even though a normal read works fine....

The Ubuntu kernel is older; perhaps it doesn't include the quirks that
enable ACS-equivalent isolation on the PCH root ports.  That would
explain the group differences.  Thanks,

Alex



