[Libvir] RFC [0/3]: Re-factor QEMU daemon protocol to use XDR

Fri Mar 23 11:29:34 UTC 2007

On Fri, Mar 23, 2007 at 10:02:33AM +0000, Richard W.M. Jones wrote:
> As you say, no appreciable difference in code side before and after the 
> change.
> 
> It's interesting to note what SunRPC gives you over XDR (which you have 
> also noted in your emails):
> 
>  - serial numbers to correlate client requests and server responses (in 
> SunRPC these are called 'XID's)

Unfortunately they crippled their use by only providing a single 'rm_xid'
field. So the only way to do tracking is to mandate that the rm_xid of the
method reply matches the rm_xid of the method call. This is why it is
impossible to do asynchronous notifications from the server - ie let the
server send you a message without any corresponding message having been
received from the client. 

To do this correctly requires 2 independantly incremented sequences, one
owned by the server & the other by the client. For method replies you need
to have a completely separate field to indicate the serial number of the
message you are replying to. I don't see any easy way to work around this
limitation in SunRPC wire format :-(

>  - program numbers and versioning

Not very different to replicate though - and in a less onerous way than
SunRPC approach. It would seem that for every new version you need to provide
a new set of all the marshalling methods defined by your program, even if 
you're only adding one new one. 

>  - procedure numbers (same as your qemud_packet_{client,server} _data_type)
> 
> In fact the header of the SunRPC request and reply is just an 
> XDR-encoded data structure looking like this:
> 
> struct rpc_msg {
>         u_long                  rm_xid;	// XID
> 					// 0 = CALL, 1 = REPLY
>         enum msg_type           rm_direction;
>         union {
>                 struct call_body RM_cmb;
>                 struct reply_body RM_rmb;
>         } ru;
> };
> 
> struct call_body {
>         u_long cb_rpcvers;      	// always 2
>         u_long cb_prog;			// program number
>         u_long cb_vers;			// program version
>         u_long cb_proc;			// procedure number
>         struct opaque_auth cb_cred;	// credentials - ignore
>         struct opaque_auth cb_verf;	// verifier - ignore
> };
> 
> struct reply_body is a bit more complex - see <rpc/rpc_msg.h>.
> 
> Actually there are functions provided to generate the call and reply 
> headers (xdr_callhdr and xdr_replymsg) so we could use those directly to 
> create SunRPC-compatible headers.  Advantages:  Wire sniffing tools 
> which grok SunRPC will still work (Wireshark does NFS, so presumably it 
> does general Sun/ONC-RPC; Systemtap apparently supports it too).  There 
> are various other efforts to run SunRPC over TLS (including an OCaml 
> one!) which we may still be able to interoperate with.

Hmm, interesting point

> In addition we have provided encryption and authentication.  Your patch 
> punts on that - I understand why.

Yep, that's a side issue for now - the existing code you've written could
quite easily be ported over, since pretty much just have to replace the
socket connection setup code & the read/write calls. So there's not really
all that much to compare between the two impls.

> About strings: char foo[MAX] (as you pointed out) could be replaced by 
> variable-length strings.  There are at least three catches: (1) Who 
> owns/frees the strings? - I was never able to work that out.  (2) NULL 
> is not supported directly, you have to use the option-looks-like-
> a-pointer type (string *).  (3) If the client and server don't trust 
> each other then you need to check the length of incoming strings 
> carefully to make sure that they aren't huge and potentially could cause 
> a DoS.

In the XDR definition you can give a maximum size eg

    string  foo<>;

Allows an unlimited size string (MAX_INT32), while

   string foo<500>;

Would ensure that the generated XDR routines will refuse a string larger
than 500 bytes.

The most compelling reason for  string foo<500> instead of char foo[500]
is that every array element is encoded on a 4 byte boundary, so your 1 byte
chars end up taking 4 bytes each. The string datatype has a 4 byte length
header, following by single bytes per character. A minor detail, but worth
bearing in mind.

> 
> >  1. Add an extra initial message type to do a major/minor version number
> >     exchange. Based on the negotiated versions, the server can activate
> >     or deactivate particular requests it allows from the client.
> >
> >  2. Add in the URI parsing routines to the qemud_interal.c driver, but
> >     only allow use of QEMU initially.
> >
> >  3. Add in the TLS + SSH  client & server socket setup code to give an
> >     authenticated & secured remote channel
> >
> >  4. Re-factor qemud/dispatch.c so that it calls into the regular libvirt
> >     API for the Xen case, but keeps calling qemud/driver.c for the QEMU
> >     case.
> 
> You'd definitely want to call it something other than 'qemu' ...

Yes, the binary would obviously be renamed to just 'libvirtd', not that the
user would really see the difference, becuase we already called the init
script libvirtd to hide the details.

Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|