[libvirt] [PATCH 00/11] Generic data stream handling

Daniel Veillard veillard at redhat.com
Fri Sep 25 07:58:51 UTC 2009


On Mon, Aug 24, 2009 at 09:51:03PM +0100, Daniel P. Berrange wrote:
> The following series of patches introduce support for generic data
> streams in the libvirt API, the remote protocol, client & daemon.
> 
> The public API takes the form of a new object virStreamPtr and
> methods to read/write/close it
> 
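
  For reference, a minimal sketch of how a client might drive such a
stream object, assuming the function names proposed in the series
(virStreamNew, virStreamSend, virStreamFinish, virStreamAbort,
virStreamFree); virSomeUploadAPI() is purely hypothetical, standing in
for any API that takes a stream argument:

  #include <stddef.h>
  #include <libvirt/libvirt.h>

  static int
  upload_buffer(virConnectPtr conn, const char *data, size_t len)
  {
      virStreamPtr st = virStreamNew(conn, 0);    /* blocking stream */
      size_t done = 0;

      if (!st)
          return -1;
      if (virSomeUploadAPI(conn, st) < 0)         /* hypothetical */
          goto error;

      while (done < len) {
          int nsent = virStreamSend(st, data + done, len - done);
          if (nsent < 0)
              goto error;
          done += nsent;
      }

      if (virStreamFinish(st) < 0)   /* final OK/OK handshake */
          goto error;
      virStreamFree(st);
      return 0;

  error:
      virStreamAbort(st);            /* triggers the ERROR path */
      virStreamFree(st);
      return -1;
  }
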
> The remote protocol was the main hard bit. Since the protocol
> allows multiple concurrent API calls on a single connection,
> it also needed to allow concurrent data streams. It is also
> not acceptable for a large data stream to block other traffic
> while it is transferring.
> 
> Thus, we introduce a new protocol message type 'REMOTE_STREAM'
> to handle transfer of the stream data. A method involving a
> data stream starts off in the normal way with a REMOTE_CALL
> to the server, and a REMOTE_REPLY response message. If this
> was successful, the data stream traffic then follows.
> 
> For outgoing streams (data from client to server), the client
> will send zero or more REMOTE_STREAM packets containing the
> data with status == REMOTE_CONTINUE. These are asynchronous
> and not acknowledged by the server. At any time the server
> may send an async message with a type of REMOTE_STREAM and
> status of REMOTE_ERROR. This indicates to the client that the
> transfer is aborting at server request. If the client wishes
> to abort, it can send the server a REMOTE_STREAM+REMOTE_ERROR
> message. If the client finishes its data transfer, it will
> send a final REMOTE_STREAM+REMOTE_OK message, and the server
> will respond with the same. This full roundtrip handshake
> ensures that any async error messages are flushed.
> 
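
  On the wire, each data packet would presumably reuse the prog, vers,
proc and serial of the original call and only vary the type and status
fields. Roughly (the header field names follow the existing
remote_message_header; struct private_data and send_stream_packet() are
stand-ins, not names from the patches):

  /* Illustrative framing of an outgoing stream packet. */
  static int
  send_stream_data(struct private_data *priv, int proc, unsigned serial,
                   const char *bytes, unsigned len)
  {
      struct remote_message_header hdr = {
          .prog   = REMOTE_PROGRAM,
          .vers   = REMOTE_PROTOCOL_VERSION,
          .proc   = proc,            /* same proc as the original call   */
          .type   = REMOTE_STREAM,
          .serial = serial,          /* same serial as the original call */
          .status = len ? REMOTE_CONTINUE : REMOTE_OK,
      };

      /* REMOTE_CONTINUE packets carry data and are not acknowledged; the
       * final REMOTE_OK packet carries none and expects REMOTE_OK back. */
      return send_stream_packet(priv, &hdr, bytes, len);
  }
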
> For incoming data streams (data from server to client), the
> server sends zero or more REMOTE_STREAM packets containing the
> data with status == REMOTE_CONTINUE. These are asynchronous
> and not acknowledged by the client. At any time the client
> may send an async message with a type of REMOTE_STREAM and
> status of REMOTE_ERROR. This indicates to the server that the
> transfer is aborting at client request. If the server wishes
> to abort, it can send the client a REMOTE_STREAM+REMOTE_ERROR
> message. When the server finishes its data transfer, it will
> send a final REMOTE_STREAM+REMOTE_CONTINUE message with a
> data length of zero (i.e. EOF). The client will then send a
> REMOTE_STREAM+REMOTE_OK packet and the server will respond
> with the same. This full roundtrip handshake ensures that any
> async error messages are flushed.
> 
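
  From the client's point of view, that zero-length REMOTE_CONTINUE
packet would presumably surface as an end-of-stream return from the
read call. A sketch, again assuming the virStream* names from the
series and a hypothetical virSomeDownloadAPI():

  #include <stdio.h>
  #include <libvirt/libvirt.h>

  static int
  download_to_file(virConnectPtr conn, FILE *fp)
  {
      char buf[64 * 1024];
      virStreamPtr st = virStreamNew(conn, 0);

      if (!st)
          return -1;
      if (virSomeDownloadAPI(conn, st) < 0)       /* hypothetical */
          goto error;

      for (;;) {
          int got = virStreamRecv(st, buf, sizeof(buf));
          if (got < 0)
              goto error;
          if (got == 0)              /* EOF: the zero-length CONTINUE */
              break;
          if (fwrite(buf, 1, got, fp) != (size_t)got)
              goto error;
      }

      if (virStreamFinish(st) < 0)   /* OK/OK handshake */
          goto error;
      virStreamFree(st);
      return 0;

  error:
      virStreamAbort(st);
      virStreamFree(st);
      return -1;
  }
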
> This all ensures that multiple data streams can be active in
> parallel, and with a maximum data packet size of 256 KB, no
> single stream can cause too much latency on the connection for
> other API calls/streams.

  Okay, this is very similar in principle to HTTP pipelining,
with IMHO the same benefits and the same potential drawbacks.
A couple of things to check might be:
   - the maximum number of concurrent active streams allowed.
     For example, suppose you need to migrate all the domains
     off a failing machine in an emergency; some level of
     serialization may be better than attempting to migrate
     all 100 domains at the same time. Ten parallel streams
     might be better, but we need to make sure the API allows
     such a condition to be reported (see the batching sketch
     after this list).
   - the maximum chunking size, but with 256 KB I think this is
     covered.
   - internal synchronization between threads to avoid deadlocks
     or poor performance, which can be very hard to debug, so I
     think an effort should be made to explain how things are
     designed internally.
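
  To illustrate the serialization point, something as simple as a
batching loop on the management side would do; the cap of 10 is
arbitrary and migrate_one_start()/migrate_one_wait() are hypothetical
helpers:

  #define MAX_PARALLEL 10            /* illustrative cap, not a real limit */

  static void
  migrate_all(virDomainPtr *doms, size_t ndoms)
  {
      size_t i, j;

      for (i = 0; i < ndoms; i += MAX_PARALLEL) {
          size_t batch = ndoms - i < MAX_PARALLEL ? ndoms - i : MAX_PARALLEL;

          for (j = 0; j < batch; j++)      /* start up to MAX_PARALLEL streams */
              migrate_one_start(doms[i + j]);
          for (j = 0; j < batch; j++)      /* drain before the next batch */
              migrate_one_wait(doms[i + j]);
      }
  }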

  But this sounds fine in general.

> The only thing it does not allow for is one API method to use
> two or more streams. These may be famous last words, but I
> don't think that use case will be necessary for any of our
> APIs...

  As long as the limitation is documented, especially in the parts
of the code where the assumption is made, this sounds fine.

> The last 5 patches with a subject of [DEMO] are *NOT* intended
> to be committed to the repository. They merely demonstrate the
> use of data streams for a couple of hypothetical file upload
> and download APIs. Actually, they were mostly to allow me to
> test the streams code without messing around with the QEMU
> migration code.
> 
> The immediate use case for this data stream code is Chris' QEMU
> migration patchset.
> 
> The next use case is to allow serial console access to be tunnelled
> over libvirtd, e.g. to make 'virsh console GUEST' work remotely.
> This use case is why I included the support for non-blocking data
> streams and event loop integration (not required for Chris'
> migration use case).
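
  For the console case, the non-blocking mode would presumably look
something like the sketch below, assuming the event-callback
registration from the series (virStreamEventAddCallback,
VIR_STREAM_EVENT_READABLE, VIR_STREAM_NONBLOCK) and a -2 "would block"
return from virStreamRecv on a non-blocking stream:

  #include <stdio.h>
  #include <libvirt/libvirt.h>

  static void
  console_readable(virStreamPtr st, int events, void *opaque)
  {
      char buf[1024];
      (void)opaque;

      if (!(events & VIR_STREAM_EVENT_READABLE))
          return;

      for (;;) {
          int got = virStreamRecv(st, buf, sizeof(buf));
          if (got == -2)             /* would block: wait for next event */
              return;
          if (got == 0) {            /* EOF: complete the handshake */
              virStreamEventRemoveCallback(st);
              virStreamFinish(st);
              return;
          }
          if (got < 0) {             /* error: abort the stream */
              virStreamEventRemoveCallback(st);
              virStreamAbort(st);
              return;
          }
          fwrite(buf, 1, got, stdout);
      }
  }

  /* Registered against a stream opened with VIR_STREAM_NONBLOCK:
   *   virStreamEventAddCallback(st, VIR_STREAM_EVENT_READABLE,
   *                             console_readable, NULL, NULL);
   */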

  Okay, on to the individual patch reviews,

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel at veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
