[libvirt] [PATCH v7] add 802.1Qbh and 802.1Qbg handling

Laine Stump laine at laine.org
Tue May 25 17:20:36 UTC 2010


This patch was somehow broken - many of the lines have been 
word-wrapped, so patch bombs out. The v6 patch you sent didn't have this 
problem...

Can you resend it using the same method you used for v6? (assuming 
something changed ;-)

On 05/25/2010 12:54 PM, Stefan Berger wrote:
> This patch builds on the work recently posted by Stefan Berger.  It builds
> on top of Stefan's three posted patches:
>
>          [PATCH v10] vepa: parsing for 802.1Qb{g|h} XML
>          [RFC][PATCH 1/3] vepa+vsi: Introduce dependency on libnl
>          [PATCH v3] Add host UUID (to libvirt capabilities)
>
> Stefan's RFC patches 2/3 and 3/3 are incorporated into my patch.
>
> V7:
>    - Addressing Jim Meyering's comments; this also touches existing
>      code for example for correcting indentation of break statements or
>      simplification of switch statements.
>
> Changes from v5 to v6:
>    - Renamed occurrences of virVirtualPortProfileDef to
>      virVirtualPortProfileParams
>    - 802.1Qbg part prepared for sending a RTM_SETLINK and getting
>      processing status back plus a subsequent RTM_GETLINK to
>      get IFLA_PORT_RESPONSE.
>      Note: This interface for 802.1Qbg may still change
>
> Changes from v4 to v5:
>    - [David Allan] move getPhysfn inside IFLA_VF_PORT_MAX to avoid compiler
>      warning when latest if_link.h isn't available
>
> Changes from v3 to v4:
>    - move from Stefan's 802.1Qb{g|h} XML v8 to v9
>    - move hostuuid and vf index calcs to inside doPortProfileOp8021Qbh
>
> Changes from v2 to v3:
>    - remove debug fprintfs
>    - use virGetHostUUID (thanks Stefan!)
>    - fix compile issue when latest if_link.h isn't available
>    - change poll timeout to 10s, at 1/8 sec intervals
>       - if polling times out, log msg and return -ETIMEDOUT
>
> Changes from v1 to v2:
>    - Add Stefan's code for getPortProfileStatus
>    - Poll for up to 2 secs for port-profile status, at 1/8 sec intervals:
>       - if status indicates error, abort openMacvtapTap
>       - if status indicates success, exit polling
>       - if status is "in-progress" after 2 secs of polling, exit
>         polling loop silently, without error
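
For readers following the changelog, the polling behaviour described in the v2 and v3 entries can be sketched roughly as below. This is not the patch's code: `poll_port_status`, `enum poll_status` and `countdown_status` are hypothetical names chosen for illustration, and the real implementation reads the status via RTM_GETLINK and compares it against the PORT_PROFILE_RESPONSE_* values from if_link.h.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical status values standing in for PORT_PROFILE_RESPONSE_*. */
enum poll_status { POLL_SUCCESS, POLL_INPROGRESS, POLL_ERROR };

/* Hypothetical callback that reports the current port-profile status. */
typedef enum poll_status (*status_fn)(void *opaque);

/*
 * Call get_status() up to timeout_usec / interval_usec times (at least
 * once), sleeping interval_usec between attempts.
 * Returns 0 on success, -EINVAL if an error status is reported, and
 * -ETIMEDOUT if the status is still "in progress" after the last attempt.
 */
static int
poll_port_status(status_fn get_status, void *opaque,
                 unsigned long timeout_usec, unsigned long interval_usec)
{
    unsigned long attempts = interval_usec ? timeout_usec / interval_usec : 1;
    struct timespec delay = {
        .tv_sec  = interval_usec / 1000000UL,
        .tv_nsec = (long)(interval_usec % 1000000UL) * 1000L,
    };

    assert(get_status != NULL);
    if (attempts == 0)
        attempts = 1;

    while (attempts-- > 0) {
        switch (get_status(opaque)) {
        case POLL_SUCCESS:
            return 0;
        case POLL_ERROR:
            return -EINVAL;
        case POLL_INPROGRESS:
            break; /* keep trying */
        }
        nanosleep(&delay, NULL);
    }
    return -ETIMEDOUT;
}

/* Illustrative status source: "in progress" until the counter hits zero. */
static enum poll_status
countdown_status(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return POLL_INPROGRESS;
    }
    return POLL_SUCCESS;
}
```

With a 10 s timeout and a 1/8 s interval this gives 80 attempts, which matches the repeat count the patch derives from its STATUS_POLL_TIMEOUT_USEC and STATUS_POLL_INTERVL_USEC defines.
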
>
> My patch finishes out the 802.1Qbh parts, which Stefan had mostly complete.
> I've tested using the recent kernel updates for VF_PORT netlink msgs and
> enic for Cisco's 10G Ethernet NIC.  I tested many VMs, each with several
> direct interfaces, each configured with a port-profile per the XML.  VM-to-VM,
> and VM-to-external work as expected.  VM-to-VM on same host (using same NIC)
> works same as VM-to-VM where VMs are on diff hosts.  I'm able to change
> settings on the port-profile while the VM is running to change the virtual
> port behaviour, for example by adjusting a QoS setting like rate limit.  All
> VMs with interfaces using that port-profile immediately see the effect of the
> change to the port-profile.
>
> I don't have an SR-IOV device to test with, so the source dev is a non-SR-IOV
> device, but most of the code paths include support for specifying the source
> dev and VF index.  We'll need to complete this by discovering the PF given the
> VF linkdev.  Once we have the PF, we'll also have the VF index.  All this
> information is available from sysfs.
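
As a rough illustration of the sysfs lookup mentioned above (not part of the patch): on Linux, a VF's device directory has a `physfn` symlink pointing at its PF, and the PF's device directory has `virtfnN` symlinks, one per VF. Assuming that layout, the VF index could be recovered by scanning the PF directory as sketched here. `find_vf_index` is a hypothetical helper name, and the caller is assumed to have already resolved the PF directory (for example via /sys/class/net/<vf-ifname>/device/physfn) and the VF's PCI address.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Scan pf_dir for "virtfnN" symlinks and return N for the link whose
 * target's last path component equals vf_pci_addr (e.g. "0000:01:10.2").
 * Returns -1 if the directory cannot be read or no link matches.
 */
static int
find_vf_index(const char *pf_dir, const char *vf_pci_addr)
{
    DIR *dir;
    struct dirent *ent;
    int found = -1;

    assert(pf_dir != NULL && vf_pci_addr != NULL);

    dir = opendir(pf_dir);
    if (!dir)
        return -1;

    while (found < 0 && (ent = readdir(dir)) != NULL) {
        char path[PATH_MAX], target[PATH_MAX];
        const char *digits, *base;
        char *end;
        ssize_t len;
        long idx;

        if (strncmp(ent->d_name, "virtfn", 6) != 0)
            continue;

        /* The suffix after "virtfn" must be a plain decimal index. */
        digits = ent->d_name + 6;
        idx = strtol(digits, &end, 10);
        if (end == digits || *end != '\0' || idx < 0 || idx > INT_MAX)
            continue;

        snprintf(path, sizeof(path), "%s/%s", pf_dir, ent->d_name);
        len = readlink(path, target, sizeof(target) - 1);
        if (len < 0)
            continue;
        target[len] = '\0';

        /* Compare only the final component of the link target. */
        base = strrchr(target, '/');
        base = base ? base + 1 : target;
        if (strcmp(base, vf_pci_addr) == 0)
            found = (int)idx;
    }

    closedir(dir);
    return found;
}
```

The sketch only reads link targets with readlink(), so it does not care whether the targets exist, which also makes it easy to exercise against a scratch directory.
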
>
> Signed-off-by: Scott Feldman<scofeldm at cisco.com>
> Signed-off-by: Stefan Berger<stefanb at us.ibm.com>
>
> ---
>   configure.ac           |   16
>   src/qemu/qemu_conf.c   |    2
>   src/qemu/qemu_driver.c |    4
>   src/util/macvtap.c     |  837
> +++++++++++++++++++++++++++++++++++++++++++++----
>   src/util/macvtap.h     |    1
>   5 files changed, 788 insertions(+), 72 deletions(-)
>
>
> Index: libvirt-acl/configure.ac
> ===================================================================
> --- libvirt-acl.orig/configure.ac
> +++ libvirt-acl/configure.ac
> @@ -2005,13 +2005,26 @@ if test "$with_macvtap" != "no" ; then
>   fi
>   AM_CONDITIONAL([WITH_MACVTAP], [test "$with_macvtap" = "yes"])
>
> +AC_TRY_COMPILE([ #include<sys/socket.h>
> +                 #include<linux/rtnetlink.h>  ],
> +                 [ int x = IFLA_PORT_MAX; ],
> +                 [ with_virtualport=yes ],
> +                 [ with_virtualport=no ])
> +if test "$with_virtualport" = "yes"; then
> +    val=1
> +else
> +    val=0
> +fi
> +AC_DEFINE_UNQUOTED([WITH_VIRTUALPORT], $val, [whether vsi vepa support
> is enabled])
> +AM_CONDITIONAL([WITH_VIRTUALPORT], [test "$with_virtualport" = "yes"])
> +
>
>   dnl netlink library
>
>   LIBNL_CFLAGS=""
>   LIBNL_LIBS=""
>
> -if test "$with_macvtap" = "yes"; then
> +if test "$with_macvtap" = "yes" || test "$with_virtualport" = "yes";
> then
>       PKG_CHECK_MODULES([LIBNL], [libnl-1>= $LIBNL_REQUIRED], [
>       ], [
>           AC_MSG_ERROR([libnl>= $LIBNL_REQUIRED is required for macvtap
> support])
> @@ -2084,6 +2097,7 @@ AC_MSG_NOTICE([ Network: $with_network])
>   AC_MSG_NOTICE([Libvirtd: $with_libvirtd])
>   AC_MSG_NOTICE([   netcf: $with_netcf])
>   AC_MSG_NOTICE([ macvtap: $with_macvtap])
> +AC_MSG_NOTICE([virtport: $with_virtualport])
>   AC_MSG_NOTICE([])
>   AC_MSG_NOTICE([Storage Drivers])
>   AC_MSG_NOTICE([])
> Index: libvirt-acl/src/qemu/qemu_conf.c
> ===================================================================
> --- libvirt-acl.orig/src/qemu/qemu_conf.c
> +++ libvirt-acl/src/qemu/qemu_conf.c
> @@ -1509,7 +1509,7 @@ qemudPhysIfaceConnect(virConnectPtr conn
>               if (err) {
>                   close(rc);
>                   rc = -1;
> -                delMacvtap(net->ifname,
> +                delMacvtap(net->ifname, net->data.direct.linkdev,
>                              &net->data.direct.virtPortProfile);
>               }
>           }
> Index: libvirt-acl/src/qemu/qemu_driver.c
> ===================================================================
> --- libvirt-acl.orig/src/qemu/qemu_driver.c
> +++ libvirt-acl/src/qemu/qemu_driver.c
> @@ -3709,7 +3709,7 @@ static void qemudShutdownVMDaemon(struct
>       for (i = 0; i<  def->nnets; i++) {
>           virDomainNetDefPtr net = def->nets[i];
>           if (net->type == VIR_DOMAIN_NET_TYPE_DIRECT)
> -            delMacvtap(net->ifname,
> +            delMacvtap(net->ifname, net->data.direct.linkdev,
>                          &net->data.direct.virtPortProfile);
>       }
>   #endif
> @@ -8514,7 +8514,7 @@ qemudDomainDetachNetDevice(struct qemud_
>
>   #if WITH_MACVTAP
>       if (detach->type == VIR_DOMAIN_NET_TYPE_DIRECT)
> -        delMacvtap(detach->ifname,
> +        delMacvtap(detach->ifname, detach->data.direct.linkdev,
>                      &detach->data.direct.virtPortProfile);
>   #endif
>
> Index: libvirt-acl/src/util/macvtap.c
> ===================================================================
> --- libvirt-acl.orig/src/util/macvtap.c
> +++ libvirt-acl/src/util/macvtap.c
> @@ -27,7 +27,7 @@
>
>   #include<config.h>
>
> -#if WITH_MACVTAP
> +#if WITH_MACVTAP || WITH_VIRTUALPORT
>
>   # include<stdio.h>
>   # include<errno.h>
> @@ -41,6 +41,8 @@
>   # include<linux/rtnetlink.h>
>   # include<linux/if_tun.h>
>
> +# include<netlink/msg.h>
> +
>   # include "util.h"
>   # include "memory.h"
>   # include "logging.h"
> @@ -48,6 +50,7 @@
>   # include "interface.h"
>   # include "conf/domain_conf.h"
>   # include "virterror_internal.h"
> +# include "uuid.h"
>
>   # define VIR_FROM_THIS VIR_FROM_NET
>
> @@ -58,15 +61,30 @@
>   # define MACVTAP_NAME_PREFIX	"macvtap"
>   # define MACVTAP_NAME_PATTERN	"macvtap%d"
>
> +# define MICROSEC_PER_SEC       (1000 * 1000)
> +
> +# define NLMSGBUF_SIZE  256
> +# define RATTBUF_SIZE   64
> +
> +
> +# define STATUS_POLL_TIMEOUT_USEC (10 * MICROSEC_PER_SEC)
> +# define STATUS_POLL_INTERVL_USEC (MICROSEC_PER_SEC / 8)
> +
>
>   static int associatePortProfileId(const char *macvtap_ifname,
> +                                  const char *linkdev,
>                                     const virVirtualPortProfileParamsPtr
> virtPort,
> -                                  int vf,
>                                     const unsigned char *vmuuid);
>
>   static int disassociatePortProfileId(const char *macvtap_ifname,
> +                                     const char *linkdev,
>                                        const
> virVirtualPortProfileParamsPtr virtPort);
>
> +enum virVirtualPortOp {
> +    ASSOCIATE = 0x1,
> +    DISASSOCIATE = 0x2,
> +};
> +
>
>   static int nlOpen(void)
>   {
> @@ -97,7 +115,7 @@ static void nlClose(int fd)
>    */
>   static
>   int nlComm(struct nlmsghdr *nlmsg,
> -           char **respbuf, int *respbuflen)
> +           char **respbuf, unsigned int *respbuflen)
>   {
>       int rc = 0;
>       struct sockaddr_nl nladdr = {
> @@ -159,6 +177,162 @@ err_exit:
>   }
>
>
> +# ifdef IFLA_VF_PORT_MAX
> +
> +/**
> + * nlCommWaitSuccess:
> + *
> + * @nlmsg: pointer to netlink message
> + * @nl_groups: the netlink multicast groups to send to
> + * @respbuf: pointer to pointer where response buffer will be allocated
> + * @respbuflen: pointer to integer holding the size of the response
> buffer
> + *      on return of the function.
> + * @timeout_usecs: timeout in microseconds to wait for a success
> message
> + *                 to be returned
> + *
> + * Send the given message to the netlink multicast group and receive
> + * responses. Skip responses indicating an error and keep on receiving
> + * responses until a success response is returned.
> + * Returns 0 on success, -1 on error. In case of error, no response
> + * buffer will be returned.
> + */
> +static int
> +nlCommWaitSuccess(struct nlmsghdr *nlmsg, uint32_t nl_groups,
> +                  char **respbuf, unsigned int *respbuflen,
> +                  unsigned long long timeout_usecs)
> +{
> +    int rc = 0;
> +    struct sockaddr_nl nladdr = {
> +            .nl_family = AF_NETLINK,
> +            .nl_pid    = getpid(),
> +            .nl_groups = nl_groups,
> +    };
> +    int rcvChunkSize = 1024; // expecting less than that
> +    size_t rcv_offset = 0;
> +    ssize_t nbytes;
> +    struct timeval tv = {
> +        .tv_sec  = timeout_usecs / MICROSEC_PER_SEC,
> +        .tv_usec = timeout_usecs % MICROSEC_PER_SEC,
> +    };
> +    bool got_valid = false;
> +    int fd = nlOpen();
> +    static uint32_t seq = 0x1234;
> +    uint32_t myseq = seq++;
> +    uint32_t mypid = getpid();
> +
> +    if (fd<  0)
> +        return -1;
> +
> +    nlmsg->nlmsg_pid = mypid;
> +    nlmsg->nlmsg_seq = myseq;
> +    nlmsg->nlmsg_flags |= NLM_F_ACK;
> +
> +    nbytes = sendto(fd, (void *)nlmsg, nlmsg->nlmsg_len, 0,
> +                    (struct sockaddr *)&nladdr, sizeof(nladdr));
> +    if (nbytes<  0) {
> +        virReportSystemError(errno,
> +                             "%s", _("cannot send to netlink socket"));
> +        rc = -1;
> +        goto err_exit;
> +    }
> +
> +    while (!got_valid) {
> +
> +        rcv_offset = 0;
> +
> +        while (1) {
> +            int n;
> +            fd_set rfds;
> +            socklen_t addrlen = sizeof(nladdr);
> +
> +            if (VIR_REALLOC_N(*respbuf, rcv_offset + rcvChunkSize)<  0)
> {
> +                virReportOOMError();
> +                rc = -1;
> +                goto err_exit;
> +            }
> +
> +            FD_ZERO(&rfds);
> +            FD_SET(fd,&rfds);
> +
> +            n = select(fd + 1,&rfds, NULL, NULL,&tv);
> +            if (n<= 0) {
> +                if (n<  0)
> +                    virReportSystemError(errno, "%s",
> +                                         _("error in select call"));
> +                if (n == 0)
> +                    virReportSystemError(ETIMEDOUT, "%s",
> +                            _("no valid netlink response was
> received"));
> +                rc = -1;
> +                goto err_exit;
> +            }
> +
> +            nbytes = recvfrom(fd,&((*respbuf)[rcv_offset]),
> rcvChunkSize, 0,
> +                              (struct sockaddr *)&nladdr,&addrlen);
> +            if (nbytes<  0) {
> +                if (errno == EAGAIN || errno == EINTR)
> +                    continue;
> +                virReportSystemError(errno, "%s",
> +                                     _("error receiving from netlink
> socket"));
> +                rc = -1;
> +                goto err_exit;
> +            }
> +            rcv_offset += nbytes;
> +            break;
> +        }
> +        *respbuflen = rcv_offset;
> +
> +        /* check message for error */
> +        if (*respbuflen>  NLMSG_LENGTH(0)&&  *respbuf != NULL) {
> +            struct nlmsghdr *resp = (struct nlmsghdr *)*respbuf;
> +            struct nlmsgerr *err;
> +
> +            if (resp->nlmsg_pid != mypid ||
> +                resp->nlmsg_seq != myseq)
> +                continue;
> +
> +            /* skip reflected message */
> +            if (resp->nlmsg_type&  0x10)
> +                continue;
> +
> +            switch (resp->nlmsg_type) {
> +               case NLMSG_ERROR:
> +                  err = (struct nlmsgerr *)NLMSG_DATA(resp);
> +                  if (resp->nlmsg_len>= NLMSG_LENGTH(sizeof(*err))) {
> +                      if (err->error != -EOPNOTSUPP) {
> +                          /* assuming error msg from daemon */
> +                          got_valid = true;
> +                          break;
> +                      }
> +                  }
> +                  /* whatever this is, skip it */
> +                  VIR_FREE(*respbuf);
> +                  *respbuflen = 0;
> +                  break;
> +
> +               case NLMSG_DONE:
> +                  got_valid = true;
> +                  break;
> +
> +               default:
> +                  VIR_FREE(*respbuf);
> +                  *respbuflen = 0;
> +                  break;
> +            }
> +        }
> +    }
> +
> +err_exit:
> +    if (rc == -1) {
> +        VIR_FREE(*respbuf);
> +        *respbuflen = 0;
> +    }
> +
> +    nlClose(fd);
> +    return rc;
> +}
> +
> +# endif
> +
>   static struct rtattr *
>   rtattrCreate(char *buffer, int bufsize, int type,
>                const void *data, int datalen)
> @@ -204,6 +378,8 @@ nlAppend(struct nlmsghdr *nlm, int totle
>   }
>
>
> +# if WITH_MACVTAP
> +
>   static int
>   link_add(const char *type,
>            const unsigned char *macaddress, int macaddrsize,
> @@ -213,15 +389,15 @@ link_add(const char *type,
>            int *retry)
>   {
>       int rc = 0;
> -    char nlmsgbuf[256];
> +    char nlmsgbuf[NLMSGBUF_SIZE];
>       struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
>       struct nlmsgerr *err;
> -    char rtattbuf[64];
> +    char rtattbuf[RATTBUF_SIZE];
>       struct rtattr *rta, *rta1, *li;
> -    struct ifinfomsg i = { .ifi_family = AF_UNSPEC };
> +    struct ifinfomsg ifinfo = { .ifi_family = AF_UNSPEC };
>       int ifindex;
>       char *recvbuf = NULL;
> -    int recvbuflen;
> +    unsigned int recvbuflen;
>
>       if (ifaceGetIndex(true, srcdev,&ifindex) != 0)
>           return -1;
> @@ -232,65 +408,46 @@ link_add(const char *type,
>
>       nlInit(nlm, NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL,
> RTM_NEWLINK);
>
> -    if (!nlAppend(nlm, sizeof(nlmsgbuf),&i, sizeof(i)))
> +    if (!nlAppend(nlm, sizeof(nlmsgbuf),&ifinfo, sizeof(ifinfo)))
>           goto buffer_too_small;
>
>       rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_LINK,
>                          &ifindex, sizeof(ifindex));
> -    if (!rta)
> -        goto buffer_too_small;
> -
> -    if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
> +    if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
>           goto buffer_too_small;
>
>       rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_ADDRESS,
>                          macaddress, macaddrsize);
> -    if (!rta)
> -        goto buffer_too_small;
> -
> -    if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
> +    if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
>           goto buffer_too_small;
>
>       if (ifname) {
>           rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_IFNAME,
>                              ifname, strlen(ifname) + 1);
> -        if (!rta)
> -            goto buffer_too_small;
> -
> -        if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
> +        if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
>               goto buffer_too_small;
>       }
>
>       rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_LINKINFO, NULL,
> 0);
> -    if (!rta)
> -        goto buffer_too_small;
> -
> -    if (!(li = nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len)))
> +    if (!rta ||
> +        !(li = nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len)))
>           goto buffer_too_small;
>
>       rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_INFO_KIND,
>                          type, strlen(type));
> -    if (!rta)
> -        goto buffer_too_small;
> -
> -    if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
> +    if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
>           goto buffer_too_small;
>
>       if (macvlan_mode>  0) {
>           rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_INFO_DATA,
>                              NULL, 0);
> -        if (!rta)
> -            goto buffer_too_small;
> -
> -        if (!(rta1 = nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len)))
> +        if (!rta ||
> +            !(rta1 = nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len)))
>               goto buffer_too_small;
>
>           rta = rtattrCreate(rtattbuf, sizeof(rtattbuf),
> IFLA_MACVLAN_MODE,
>                              &macvlan_mode, sizeof(macvlan_mode));
> -        if (!rta)
> -            goto buffer_too_small;
> -
> -        if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
> +        if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
>               goto buffer_too_small;
>
>           rta1->rta_len = (char *)nlm + nlm->nlmsg_len - (char *)rta1;
> @@ -312,15 +469,15 @@ link_add(const char *type,
>           if (resp->nlmsg_len<  NLMSG_LENGTH(sizeof(*err)))
>               goto malformed_resp;
>
> -        switch (-err->error) {
> +        switch (err->error) {
>
>           case 0:
> -        break;
> +            break;
>
> -        case EEXIST:
> +        case -EEXIST:
>               *retry = 1;
>               rc = -1;
> -        break;
> +            break;
>
>           default:
>               virReportSystemError(-err->error,
> @@ -328,10 +485,10 @@ link_add(const char *type,
>                                    type);
>               rc = -1;
>           }
> -    break;
> +        break;
>
>       case NLMSG_DONE:
> -    break;
> +        break;
>
>       default:
>           goto malformed_resp;
> @@ -358,14 +515,14 @@ static int
>   link_del(const char *name)
>   {
>       int rc = 0;
> -    char nlmsgbuf[256];
> +    char nlmsgbuf[NLMSGBUF_SIZE];
>       struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
>       struct nlmsgerr *err;
> -    char rtattbuf[64];
> +    char rtattbuf[RATTBUF_SIZE];
>       struct rtattr *rta;
>       struct ifinfomsg ifinfo = { .ifi_family = AF_UNSPEC };
>       char *recvbuf = NULL;
> -    int recvbuflen;
> +    unsigned int recvbuflen;
>
>       memset(&nlmsgbuf, 0, sizeof(nlmsgbuf));
>
> @@ -376,10 +533,7 @@ link_del(const char *name)
>
>       rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_IFNAME,
>                          name, strlen(name)+1);
> -    if (!rta)
> -        goto buffer_too_small;
> -
> -    if (!nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf, rta->rta_len))
> +    if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
>           goto buffer_too_small;
>
>       if (nlComm(nlm,&recvbuf,&recvbuflen)<  0)
> @@ -396,20 +550,16 @@ link_del(const char *name)
>           if (resp->nlmsg_len<  NLMSG_LENGTH(sizeof(*err)))
>               goto malformed_resp;
>
> -        switch (-err->error) {
> -        case 0:
> -        break;
> -
> -        default:
> +        if (err->error) {
>               virReportSystemError(-err->error,
>                                    _("error destroying %s interface"),
>                                    name);
>               rc = -1;
>           }
> -    break;
> +        break;
>
>       case NLMSG_DONE:
> -    break;
> +        break;
>
>       default:
>           goto malformed_resp;
> @@ -509,11 +659,9 @@ macvtapModeFromInt(enum virDomainNetdevM
>       switch (mode) {
>       case VIR_DOMAIN_NETDEV_MACVTAP_MODE_PRIVATE:
>           return MACVLAN_MODE_PRIVATE;
> -    break;
>
>       case VIR_DOMAIN_NETDEV_MACVTAP_MODE_BRIDGE:
>           return MACVLAN_MODE_BRIDGE;
> -    break;
>
>       case VIR_DOMAIN_NETDEV_MACVTAP_MODE_VEPA:
>       default:
> @@ -655,8 +803,8 @@ create_name:
>       }
>
>       if (associatePortProfileId(cr_ifname,
> +                               linkdev,
>                                  virtPortProfile,
> -                               -1,
>                                  vmuuid) != 0) {
>           rc = -1;
>           goto link_del_exit;
> @@ -689,6 +837,7 @@ create_name:
>
>   disassociate_exit:
>       disassociatePortProfileId(cr_ifname,
> +                              linkdev,
>                                 virtPortProfile);
>
>   link_del_exit:
> @@ -701,6 +850,7 @@ link_del_exit:
>   /**
>    * delMacvtap:
>    * @ifname : The name of the macvtap interface
> + * @linkdev: The interface name of the NIC to connect to the external
> bridge
>    * @virtPortProfile: pointer to object holding the virtual port profile
> data
>    *
>    * Delete an interface given its name. Disassociate
> @@ -709,24 +859,565 @@ link_del_exit:
>    */
>   void
>   delMacvtap(const char *ifname,
> +           const char *linkdev,
>              virVirtualPortProfileParamsPtr virtPortProfile)
>   {
>       if (ifname) {
>           disassociatePortProfileId(ifname,
> +                                  linkdev,
>                                     virtPortProfile);
>           link_del(ifname);
>       }
>   }
>
> -#endif
> +# endif
> +
> +
> +# ifdef IFLA_PORT_MAX
> +
> +static struct nla_policy ifla_policy[IFLA_MAX + 1] =
> +{
> +  [IFLA_VF_PORTS] = { .type = NLA_NESTED },
> +};
> +
> +static struct nla_policy ifla_vf_ports_policy[IFLA_VF_PORT_MAX + 1] =
> +{
> +  [IFLA_VF_PORT] = { .type = NLA_NESTED },
> +};
> +
> +static struct nla_policy ifla_port_policy[IFLA_PORT_MAX + 1] =
> +{
> +  [IFLA_PORT_RESPONSE]      = { .type = NLA_U16 },
> +};
> +
> +
> +static int
> +link_dump(bool multicast, int ifindex, struct nlattr **tb, char
> **recvbuf)
> +{
> +    int rc = 0;
> +    char nlmsgbuf[NLMSGBUF_SIZE] = { 0, };
> +    struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
> +    struct nlmsgerr *err;
> +    struct ifinfomsg ifinfo = {
> +        .ifi_family = AF_UNSPEC,
> +        .ifi_index  = ifindex
> +    };
> +    unsigned int recvbuflen;
> +
> +    *recvbuf = NULL;
> +
> +    nlInit(nlm, NLM_F_REQUEST, RTM_GETLINK);
> +
> +    if (!nlAppend(nlm, sizeof(nlmsgbuf),&ifinfo, sizeof(ifinfo)))
> +        goto buffer_too_small;
> +
> +    if (!multicast) {
> +        if (nlComm(nlm, recvbuf,&recvbuflen)<  0)
> +            return -1;
> +    } else {
> +        if (nlCommWaitSuccess(nlm, RTMGRP_LINK, recvbuf,&recvbuflen,
> +                              5 * MICROSEC_PER_SEC)<  0)
> +            return -1;
> +    }
> +
> +    if (recvbuflen<  NLMSG_LENGTH(0) || *recvbuf == NULL)
> +        goto malformed_resp;
> +
> +    resp = (struct nlmsghdr *)*recvbuf;
> +
> +    switch (resp->nlmsg_type) {
> +    case NLMSG_ERROR:
> +        err = (struct nlmsgerr *)NLMSG_DATA(resp);
> +        if (resp->nlmsg_len<  NLMSG_LENGTH(sizeof(*err)))
> +            goto malformed_resp;
> +
> +        if (err->error) {
> +            virReportSystemError(-err->error,
> +                                 _("error dumping %d interface"),
> +                                 ifindex);
> +            rc = -1;
> +        }
> +        break;
> +
> +    case GENL_ID_CTRL:
> +    case NLMSG_DONE:
> +        if (nlmsg_parse(resp, sizeof(struct ifinfomsg),
> +                        tb, IFLA_MAX, ifla_policy)) {
> +            goto malformed_resp;
> +        }
> +        break;
> +
> +    default:
> +        goto malformed_resp;
> +    }
> +
> +    if (rc != 0)
> +        VIR_FREE(*recvbuf);
> +
> +    return rc;
> +
> +malformed_resp:
> +    macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                 _("malformed netlink response message"));
> +    VIR_FREE(*recvbuf);
> +    return -1;
> +
> +buffer_too_small:
> +    macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                 _("internal buffer is too small"));
> +    return -1;
> +}
> +
> +
> +static int
> +getPortProfileStatus(struct nlattr **tb, int32_t vf, uint16_t *status)
> +{
> +    int rc = 1;
> +    const char *msg = NULL;
> +    struct nlattr *tb2[IFLA_VF_PORT_MAX + 1],
> +                  *tb3[IFLA_PORT_MAX+1];
> +
> +    if (vf == PORT_SELF_VF) {
> +        if (tb[IFLA_PORT_SELF]) {
> +            if (nla_parse_nested(tb3, IFLA_PORT_MAX,
> tb[IFLA_PORT_SELF],
> +                                 ifla_port_policy)) {
> +                msg = _("error parsing nested IFLA_PORT_SELF part");
> +                goto err_exit;
> +            }
> +        }
> +    } else {
> +        if (tb[IFLA_VF_PORTS]) {
> +            if (nla_parse_nested(tb2, IFLA_VF_PORT_MAX,
> tb[IFLA_VF_PORTS],
> +                                 ifla_vf_ports_policy)) {
> +                msg = _("error parsing nested IFLA_VF_PORTS part");
> +                goto err_exit;
> +            }
> +            if (tb2[IFLA_VF_PORT]) {
> +                if (nla_parse_nested(tb3, IFLA_PORT_MAX,
> tb2[IFLA_VF_PORT],
> +                                     ifla_port_policy)) {
> +                    msg = _("error parsing nested IFLA_VF_PORT part");
> +                    goto err_exit;
> +                }
> +            }
> +        }
> +    }
> +
> +    if (tb3[IFLA_PORT_RESPONSE]) {
> +        *status = *(uint16_t *)RTA_DATA(tb3[IFLA_PORT_RESPONSE]);
> +         rc = 0;
> +    } else {
> +         msg = _("no IFLA_PORT_RESPONSE found in netlink message");
> +         goto err_exit;
> +    }
> +
> +err_exit:
> +    if (msg)
> +        macvtapError(VIR_ERR_INTERNAL_ERROR, "%s", msg);
> +
> +    return rc;
> +}
> +
> +
> +static int
> +doPortProfileOpSetLink(bool multicast,
> +                       int ifindex,
> +                       const char *profileId,
> +                       struct ifla_port_vsi *portVsi,
> +                       const unsigned char *instanceId,
> +                       const unsigned char *hostUUID,
> +                       int32_t vf,
> +                       uint8_t op)
> +{
> +    int rc = 0;
> +    char nlmsgbuf[NLMSGBUF_SIZE];
> +    struct nlmsghdr *nlm = (struct nlmsghdr *)nlmsgbuf, *resp;
> +    struct nlmsgerr *err;
> +    char rtattbuf[RATTBUF_SIZE];
> +    struct rtattr *rta, *vfports = NULL, *vfport;
> +    struct ifinfomsg ifinfo = {
> +        .ifi_family = AF_UNSPEC,
> +        .ifi_index  = ifindex,
> +    };
> +    char *recvbuf = NULL;
> +    unsigned int recvbuflen = 0;
> +
> +    memset(&nlmsgbuf, 0, sizeof(nlmsgbuf));
> +
> +    nlInit(nlm, NLM_F_REQUEST, RTM_SETLINK);
> +
> +    if (!nlAppend(nlm, sizeof(nlmsgbuf),&ifinfo, sizeof(ifinfo)))
> +        goto buffer_too_small;
> +
> +    if (vf == PORT_SELF_VF) {
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_SELF,
> NULL, 0);
> +    } else {
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_VF_PORTS,
> NULL, 0);
> +        if (!rta ||
> +            !(vfports = nlAppend(nlm, sizeof(nlmsgbuf),
> +                                 rtattbuf, rta->rta_len)))
> +            goto buffer_too_small;
> +
> +        /* begin nesting vfports */
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_VF_PORT,
> NULL, 0);
> +    }
> +
> +    if (!rta ||
> +        !(vfport = nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len)))
> +        goto buffer_too_small;
> +
> +    if (profileId) {
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf),
> IFLA_PORT_PROFILE,
> +                           profileId, strlen(profileId) + 1);
> +        if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
> +            goto buffer_too_small;
> +    }
> +
> +    if (portVsi) {
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf),
> IFLA_PORT_VSI_TYPE,
> +                           portVsi, sizeof(*portVsi));
> +        if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
> +            goto buffer_too_small;
> +    }
> +
> +    if (instanceId) {
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf),
> IFLA_PORT_INSTANCE_UUID,
> +                           instanceId, VIR_UUID_BUFLEN);
> +        if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
> +            goto buffer_too_small;
> +    }
> +
> +    if (hostUUID) {
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf),
> IFLA_PORT_HOST_UUID,
> +                           hostUUID, VIR_UUID_BUFLEN);
> +        if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
> +            goto buffer_too_small;
> +    }
> +
> +    if (vf != PORT_SELF_VF) {
> +        rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_VF,
> +&vf, sizeof(vf));
> +        if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
> +            goto buffer_too_small;
> +    }
> +
> +    rta = rtattrCreate(rtattbuf, sizeof(rtattbuf), IFLA_PORT_REQUEST,
> +&op, sizeof(op));
> +    if (!rta || !nlAppend(nlm, sizeof(nlmsgbuf), rtattbuf,
> rta->rta_len))
> +        goto buffer_too_small;
> +
> +    /* end nesting of vport */
> +    vfport->rta_len  = (char *)nlm + nlm->nlmsg_len - (char *)vfport;
> +
> +    if (vf != PORT_SELF_VF) {
> +        /* end nesting of vfports */
> +        vfports->rta_len = (char *)nlm + nlm->nlmsg_len - (char
> *)vfports;
> +    }
> +
> +    if (!multicast) {
> +        if (nlComm(nlm,&recvbuf,&recvbuflen)<  0)
> +            return -1;
> +    } else {
> +        if (nlCommWaitSuccess(nlm, RTMGRP_LINK,&recvbuf,&recvbuflen,
> +                              5 * MICROSEC_PER_SEC)<  0)
> +            return -1;
> +    }
> +
> +    if (recvbuflen<  NLMSG_LENGTH(0) || recvbuf == NULL)
> +        goto malformed_resp;
> +
> +    resp = (struct nlmsghdr *)recvbuf;
> +
> +    switch (resp->nlmsg_type) {
> +    case NLMSG_ERROR:
> +        err = (struct nlmsgerr *)NLMSG_DATA(resp);
> +        if (resp->nlmsg_len<  NLMSG_LENGTH(sizeof(*err)))
> +            goto malformed_resp;
> +
> +        if (err->error) {
> +            virReportSystemError(-err->error,
> +                _("error during virtual port configuration of ifindex %
> d"),
> +                ifindex);
> +            rc = -1;
> +        }
> +        break;
> +
> +    case NLMSG_DONE:
> +        break;
> +
> +    default:
> +        goto malformed_resp;
> +    }
> +
> +    VIR_FREE(recvbuf);
> +
> +    return rc;
> +
> +malformed_resp:
> +    macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                 _("malformed netlink response message"));
> +    VIR_FREE(recvbuf);
> +    return -1;
> +
> +buffer_too_small:
> +    macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                 _("internal buffer is too small"));
> +    return -1;
> +}
> +
> +
> +static int
> +doPortProfileOpCommon(bool multicast,
> +                      int ifindex,
> +                      const char *profileId,
> +                      struct ifla_port_vsi *portVsi,
> +                      const unsigned char *instanceId,
> +                      const unsigned char *hostUUID,
> +                      int32_t vf,
> +                      uint8_t op)
> +{
> +    int rc;
> +    char *recvbuf = NULL;
> +    struct nlattr *tb[IFLA_MAX + 1];
> +    int repeats = STATUS_POLL_TIMEOUT_USEC / STATUS_POLL_INTERVL_USEC;
> +    uint16_t status = 0;
> +
> +    rc = doPortProfileOpSetLink(multicast,
> +                                ifindex,
> +                                profileId,
> +                                portVsi,
> +                                instanceId,
> +                                hostUUID,
> +                                vf,
> +                                op);
> +
> +    if (rc != 0) {
> +        macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                     _("sending of PortProfileRequest failed."));
> +        return rc;
> +    }
> +
> +    while (--repeats >= 0) {
> +        rc = link_dump(multicast, ifindex, tb, &recvbuf);
> +        if (rc)
> +            goto err_exit;
> +        rc = getPortProfileStatus(tb, vf, &status);
> +        if (rc == 0) {
> +            if (status == PORT_PROFILE_RESPONSE_SUCCESS ||
> +                status == PORT_VDP_RESPONSE_SUCCESS) {
> +                break;
> +            } else if (status == PORT_PROFILE_RESPONSE_INPROGRESS) {
> +                // keep trying...
> +            } else {
> +                virReportSystemError(EINVAL,
> +                    _("error %d during port-profile setlink on ifindex %d"),
> +                    status, ifindex);
> +                rc = 1;
> +                break;
> +            }
> +        } else
> +            goto err_exit;
> +
> +        usleep(STATUS_POLL_INTERVL_USEC);
> +
> +        VIR_FREE(recvbuf);
> +    }
> +
> +    if (status == PORT_PROFILE_RESPONSE_INPROGRESS) {
> +        macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                     _("port-profile setlink timed out"));
> +        rc = -ETIMEDOUT;
> +    }
> +
> +err_exit:
> +    VIR_FREE(recvbuf);
> +
> +    return rc;
> +}
> +
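
For readers following the polling logic above: a minimal standalone sketch of the same bounded-retry pattern (timeout divided by interval gives the number of attempts, and -ETIMEDOUT is returned if the status is still in progress at the end). The names poll_until_done, poll_query_fn, countdown_query and the enum are hypothetical stand-ins, not libvirt or kernel identifiers.

```c
#define _DEFAULT_SOURCE     /* for usleep() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical status codes standing in for PORT_PROFILE_RESPONSE_*. */
enum poll_status { POLL_SUCCESS = 0, POLL_INPROGRESS = 1, POLL_ERROR = 2 };

/* Hypothetical callback: stores the current status in *status and
 * returns 0, or returns non-zero if the query itself failed. */
typedef int (*poll_query_fn)(void *opaque, enum poll_status *status);

/*
 * Call query() at most timeout_usec / interval_usec times.
 * Returns 0 on success, 1 on a reported error or a failed query, and
 * -ETIMEDOUT if the status is still in progress after the last attempt.
 */
static int
poll_until_done(poll_query_fn query, void *opaque,
                unsigned int timeout_usec, unsigned int interval_usec)
{
    enum poll_status status = POLL_INPROGRESS;
    int repeats;

    assert(query != NULL && interval_usec > 0);
    repeats = (int)(timeout_usec / interval_usec);

    while (--repeats >= 0) {
        if (query(opaque, &status) != 0)
            return 1;
        if (status == POLL_SUCCESS)
            return 0;
        if (status != POLL_INPROGRESS)
            return 1;
        usleep(interval_usec);      /* wait before asking again */
    }

    return -ETIMEDOUT;
}

/* Demo query: reports in-progress until the counter reaches zero. */
static int
countdown_query(void *opaque, enum poll_status *status)
{
    int *remaining = opaque;

    *status = (*remaining)-- > 0 ? POLL_INPROGRESS : POLL_SUCCESS;
    return 0;
}
```

Returning directly from the loop means the caller never has to inspect a stale status after the loop ends, at the cost of not sharing a single cleanup label the way the quoted function does.
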
> +# endif /* IFLA_PORT_MAX */
> +
> +static int
> +doPortProfileOp8021Qbg(const char *ifname,
> +                       const virVirtualPortProfileParamsPtr virtPort,
> +                       enum virVirtualPortOp virtPortOp)
> +{
> +    int rc;
> +
> +# ifndef IFLA_VF_PORT_MAX
> +
> +    (void)ifname;
> +    (void)virtPort;
> +    (void)virtPortOp;
> +    macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                 _("Kernel VF Port support was missing at compile time."));
> +    rc = 1;
> +
> +# else /* IFLA_VF_PORT_MAX */
> +
> +    int op = PORT_REQUEST_ASSOCIATE;
> +    struct ifla_port_vsi portVsi = {
> +        .vsi_mgr_id       = virtPort->u.virtPort8021Qbg.managerID,
> +        .vsi_type_version = virtPort->u.virtPort8021Qbg.typeIDVersion,
> +    };
> +    bool multicast = true;
> +    int ifindex;
> +
> +    if (ifaceGetIndex(true, ifname, &ifindex) != 0) {
> +        rc = 1;
> +        goto err_exit;
> +    }
> +
> +    portVsi.vsi_type_id[2] = virtPort->u.virtPort8021Qbg.typeID >> 16;
> +    portVsi.vsi_type_id[1] = virtPort->u.virtPort8021Qbg.typeID >> 8;
> +    portVsi.vsi_type_id[0] = virtPort->u.virtPort8021Qbg.typeID;
> +
> +    switch (virtPortOp) {
> +    case ASSOCIATE:
> +        op = PORT_REQUEST_ASSOCIATE;
> +        break;
> +    case DISASSOCIATE:
> +        op = PORT_REQUEST_DISASSOCIATE;
> +        break;
> +    default:
> +        macvtapError(VIR_ERR_INTERNAL_ERROR,
> +                     _("operation type %d not supported"), virtPortOp);
> +        rc = 1;
> +        goto err_exit;
> +    }
> +
> +    rc = doPortProfileOpCommon(multicast, ifindex,
> +                               NULL,
> +                               &portVsi,
> +                               virtPort->u.virtPort8021Qbg.instanceID,
> +                               NULL,
> +                               PORT_SELF_VF,
> +                               op);
> +
> +err_exit:
> +
> +# endif /* IFLA_VF_PORT_MAX */
> +
> +    return rc;
> +}
> +
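
On the vsi_type_id assignments above: a minimal sketch, assuming the type ID is a 24-bit value stored least-significant byte first in a three-byte array, as the indices in the patch suggest. The helper names pack_type_id and unpack_type_id are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low 24 bits of type_id in out[0..2], least-significant
 * byte first, mirroring the byte order used in the quoted code. */
static void
pack_type_id(uint32_t type_id, uint8_t out[3])
{
    assert(type_id <= 0xFFFFFFu);   /* must fit in three bytes */

    out[2] = (uint8_t)(type_id >> 16);
    out[1] = (uint8_t)(type_id >> 8);
    out[0] = (uint8_t)type_id;
}

/* Inverse of pack_type_id(), useful for checking the round trip. */
static uint32_t
unpack_type_id(const uint8_t in[3])
{
    return ((uint32_t)in[2] << 16) | ((uint32_t)in[1] << 8) | in[0];
}
```

The explicit casts make the truncation to one byte visible; the quoted assignments rely on the implicit narrowing conversion to do the same thing.
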
> +
> +# ifdef IFLA_VF_PORT_MAX
> +static int
> +getPhysfn(const char *linkdev,
> +          int32_t *vf,
> +          char **physfndev)
> +{
> +    int rc = 0;
> +    bool virtfn = false;
> +
> +    if (virtfn) {
> +
> +        // XXX: if linkdev is SR-IOV VF, then set vf = VF index
> +        // XXX: and set linkdev = PF device
> +        // XXX: need to use get_physical_function_linux() or
> +        // XXX: something like that to get PF
> +        // XXX: device and figure out VF index
> +
> +        rc = 1;
> +
> +    } else {
> +
> +        /* Not SR-IOV VF: physfndev is linkdev and VF index
> +         * refers to linkdev self
> +         */
> +
> +        *vf = PORT_SELF_VF;
> +        *physfndev = (char *)linkdev;
> +    }
> +
> +    return rc;
> +}
> +# endif /* IFLA_VF_PORT_MAX */
> +
> +static int
> +doPortProfileOp8021Qbh(const char *ifname,
> +                       const virVirtualPortProfileParamsPtr virtPort,
> +                       const unsigned char *vm_uuid,
> +                       enum virVirtualPortOp virtPortOp)
> +{
> +    int rc;
> +
> +# ifndef IFLA_VF_PORT_MAX
> +
> +    (void)ifname;
> +    (void)virtPort;
> +    (void)vm_uuid;
> +    (void)virtPortOp;
> +    macvtapError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                 _("Kernel VF Port support was missing at compile time."));
> +    rc = 1;
> +
> +# else /* IFLA_VF_PORT_MAX */
> +
> +    char *physfndev;
> +    unsigned char hostuuid[VIR_UUID_BUFLEN];
> +    int32_t vf;
> +    int op = PORT_REQUEST_ASSOCIATE;
> +    bool multicast = false;
> +    int ifindex;
> +
> +    rc = virGetHostUUID(hostuuid);
> +    if (rc)
> +        goto err_exit;
> +
> +    rc = getPhysfn(ifname, &vf, &physfndev);
> +    if (rc)
> +        goto err_exit;
> +
> +    if (ifaceGetIndex(true, physfndev, &ifindex) != 0) {
> +        rc = 1;
> +        goto err_exit;
> +    }
> +
> +    switch (virtPortOp) {
> +    case ASSOCIATE:
> +        op = PORT_REQUEST_ASSOCIATE;
> +        break;
> +    case DISASSOCIATE:
> +        op = PORT_REQUEST_DISASSOCIATE;
> +        break;
> +    default:
> +        macvtapError(VIR_ERR_INTERNAL_ERROR,
> +                     _("operation type %d not supported"), virtPortOp);
> +        rc = 1;
> +        goto err_exit;
> +    }
> +
> +    rc = doPortProfileOpCommon(multicast, ifindex,
> +                               virtPort->u.virtPort8021Qbh.profileID,
> +                               NULL,
> +                               vm_uuid,
> +                               hostuuid,
> +                               vf,
> +                               op);
> +
> +    switch (virtPortOp) {
> +    case ASSOCIATE:
> +        ifaceUp(ifname);
> +        break;
> +    case DISASSOCIATE:
> +        ifaceDown(ifname);
> +        break;
> +    }
> +
> +err_exit:
> +
> +# endif /* IFLA_VF_PORT_MAX */
> +
> +    return rc;
> +}
>
>   /**
>    * associatePortProfile
>    *
>    * @macvtap_ifname: The name of the macvtap device
> + * @linkdev: The link device in case of macvtap
>    * @virtPort: pointer to the object holding port profile parameters
> - * @vf: virtual function number, -1 if to be ignored
>    * @vmuuid : the UUID of the virtual machine
>    *
>    * Associate a port on a switch with a profile. This function
> @@ -740,15 +1431,14 @@ delMacvtap(const char *ifname,
>    */
>   static int
>   associatePortProfileId(const char *macvtap_ifname,
> +                       const char *linkdev,
>                          const virVirtualPortProfileParamsPtr virtPort,
> -                       int vf,
>                          const unsigned char *vmuuid)
>   {
>       int rc = 0;
> +
>       VIR_DEBUG("Associating port profile '%p' on link device '%s'",
>                 virtPort, macvtap_ifname);
> -    (void)vf;
> -    (void)vmuuid;
>
>       switch (virtPort->virtPortType) {
>       case VIR_VIRTUALPORT_NONE:
> @@ -756,11 +1446,14 @@ associatePortProfileId(const char *macvt
>           break;
>
>       case VIR_VIRTUALPORT_8021QBG:
> -
> +        rc = doPortProfileOp8021Qbg(macvtap_ifname, virtPort,
> +                                    ASSOCIATE);
>           break;
>
>       case VIR_VIRTUALPORT_8021QBH:
> -
> +        rc = doPortProfileOp8021Qbh(linkdev, virtPort,
> +                                    vmuuid,
> +                                    ASSOCIATE);
>           break;
>       }
>
> @@ -772,6 +1465,7 @@ associatePortProfileId(const char *macvt
>    * disassociatePortProfile
>    *
>    * @macvtap_ifname: The name of the macvtap device
> + * @linkdev: The link device in case of macvtap
>    * @virtPort: point to object holding port profile parameters
>    *
>    * Returns 0 in case of success, != 0 otherwise with error
> @@ -779,9 +1473,11 @@ associatePortProfileId(const char *macvt
>    */
>   static int
>   disassociatePortProfileId(const char *macvtap_ifname,
> +                          const char *linkdev,
>                             const virVirtualPortProfileParamsPtr virtPort)
>   {
>       int rc = 0;
> +
>       VIR_DEBUG("Disassociating port profile id '%p' on link device '%s'",
>                 virtPort, macvtap_ifname);
>
> @@ -791,13 +1487,18 @@ disassociatePortProfileId(const char *ma
>           break;
>
>       case VIR_VIRTUALPORT_8021QBG:
> -
> +        rc = doPortProfileOp8021Qbg(macvtap_ifname, virtPort,
> +                                    DISASSOCIATE);
>           break;
>
>       case VIR_VIRTUALPORT_8021QBH:
> -
> +        rc = doPortProfileOp8021Qbh(linkdev, virtPort,
> +                                    NULL,
> +                                    DISASSOCIATE);
>           break;
>       }
>
>       return rc;
>   }
> +
> +#endif
> Index: libvirt-acl/src/util/macvtap.h
> ===================================================================
> --- libvirt-acl.orig/src/util/macvtap.h
> +++ libvirt-acl/src/util/macvtap.h
> @@ -72,6 +72,7 @@ int openMacvtapTap(const char *ifname,
>                      char **res_ifname);
>
>   void delMacvtap(const char *ifname,
> +                const char *linkdev,
>                   virVirtualPortProfileParamsPtr virtPortProfile);
>
>   # endif /* WITH_MACVTAP */
>
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
>
>    



