[libvirt] [Fwd: [PATCH v9] add 802.1Qbh and 802.1Qbg handling]

Stefan Berger stefanb at linux.vnet.ibm.com
Thu May 27 21:10:05 UTC 2010


On Thu, 2010-05-27 at 17:43 +0200, Arnd Bergmann wrote:
> On Thursday 27 May 2010, Stefan Berger wrote:
> > On Thu, 2010-05-27 at 15:37 +0200, Arnd Bergmann wrote:
> 
> > > I'm even more confused now. Why should the response be different
> > > from the response we get from the kernel? What's different on the
> > > sending side other than the PID?
> > > Also, what is the RTMGRP_LINK argument used for? I thought we don't
> > > need multicast any more because we don't target kernel and lldpad
> > > in the same message but only one of them.
> > 
> > Fact is that if I set RTMGRP_LINK to 0 here on libvirt only side, the
> > dummy server doesn't get a message. If I set it to 0 on both side, the
> > dummy server also doesn't get a message. I think it's necessary for
> > user-space-to-user-space communication, but I am also only learning
> > about these types of sockets while I 'go'.
> 
> AFAICT, RTMGRP_LINK makes it a multicast message, which means that
> anyone can receive it, which a unicast message will only be sent
> to the task that has a matching pid.

Right. So, yes, if I read the pid of lldpad from /var/run/lldpad.pid
then I can direct the netlink message and nlComm() becomes common code
for talking to the kernel or another userspace process.

> 
> > > > You mean that's not defined in the (pre-)standard?
> > > 
> > > Yes, the Qbg wire protocol has no need for this, because the
> > > messages are only sent after the state has changed, we never
> > > see a VDP message with an incomplete status in there, so there
> > > is no need to specify it in Qbg, but we need something in the netlink
> > > protocol.
> > 
> > Yes, I would suggest to mimic 802.1Qbh in this case.
> 
> Like what? The first 256 result numbers in the netlink protocol
> are meant to be the same as the ones in the wire protocol, the
> next 256 are for the Qbh protocol.

I did not look carefully and differentiate between error codes of
802.1Qbg or .1Qbh but  took them as shared between both technologies.

> We could of course define yet another range for the inprogress
> result and possibly future extensions, starting at 512 if you
> insist on requiring the IFLA_PORT_RESPONSE.
> 

No, I'll fake it for now with the INPROGRESS value of 802.Qbh.

> I originally wanted to defer the response until we hear back
> from the switch, but that may take a really long time
> (over a minute with all the VDP timeouts). That would be
> cleaner IMHO, but you may not want to wait that long in libvirt.
> 

802.1Qbh is polling for max. 10 seconds. So we need to tune this
to a even higher value for 802.1Qbg ?

> > > > > what we should return there? Should we possibly just leave out
> > > > > IFLA_PORT_RESPONSE in order to signal INPROGRESS, as in not clear yet?
> > > > 
> > > > We want to be able to signal failure in case the switch setup failed so
> > > > that we don't have a malfunctioning network interface where the user
> > > > then doesn't know what to debug. In case of failure we simply wouldn't
> > > > start the VM or hotplug the interface and return an indication that
> > > > something went wrong during the negotiation with the switch.
> > > 
> > > So leaving out IFLA_PORT_RESPONSE would work for that, right?
> > 
> > We need to poll for the current status. At least in 802.1Qbh case using
> > an RTM_GETLINK type of message. Don't know how we could leave out
> > IFLA_PORT_RESPONSE...
> 
> The response can simply contain all attributes of the request
> but not contain an IFLA_PORT_RESPONSE attribute inside of the IFLA_VF_PORT.
> 

Yes, understood now.


> 
> > > attribute for Qbg, which is probably the best approximation we can get.
> > > I think we should just mandate this for the request, and let lldpad
> > > decide on the fake vf number, which it returns to getPortProfileStatus.
> > > 
> > 
> > Just be a bit more clear about the exact parameter to pass.
> 
> I'd suggest doing a request with
> 
> vfinfo_attr[IFLA_VF_MAC] = { .vf = -1u, .mac = OUR_MAC };
> vfinfo_attr[IFLA_VF_VLAN] = { .vf = -1u, vlan = OUT_VLAN };
> vf_port_attr[IFLA_PORT_VF] = NULL; /* don't send this at all */
> vf_port_attr[IFLA_PORT_RESPONSE] = NULL; /* also don't send this */
> vf_port_attr[IFLA_PORT_*] = WHATEVER_WE_NEED;
> 
> and waiting for the receiver to choose a free VF and ack this message with
> 

I think the typical ACK in a netlink message on a RTM_SETLINK is just
an error code indication of whether the message was successfully process
or now. The subsequent RTM_GETLINK would then return what you show here.

> vfinfo_attr[IFLA_VF_MAC] = { .vf = CHOSEN_VF, .mac = OUR_MAC };
> vfinfo_attr[IFLA_VF_VLAN] = { .vf = CHOSEN_VF, vlan = OUT_VLAN };
> vf_port_attr[IFLA_PORT_VF] = { CHOSEN_VF }; /* to identify the vf_port_attr */
> vf_port_attr[IFLA_PORT_RESPONSE] = NULL; /* no response yet */
> vf_port_attr[IFLA_PORT_RESPONSE] = PORT_RESPONSE_INPROGRESS; /* alternative,
> 							 if you prefer */
> vf_port_attr[IFLA_PORT_*] = WHATEVER_WE_REQUESTED;
> 
> I don't care which of the two responses we put in there, we just need to
> make a decision here since we forgot to define one before.
> 

I don't do much with the returned vf so for now I don't read it.

> When waiting for the request to complete, we can then check for the
> specific vf returned from lldpad (or the kernel, if we ever need to go there).
> 

I have a v11 that does hopefully most of the above now correctly. I will
send it to you and Scott privately first before generating more noise on
the list.

    Stefan

> 	Arnd





More information about the libvir-list mailing list