[libvirt] [PATCH 00/11] Generic data stream handling

Tue Sep 29 14:46:13 UTC 2009

FYI,

the data streams patches are now committed.

Daniel

On Fri, Sep 25, 2009 at 09:58:51AM +0200, Daniel Veillard wrote:
> On Mon, Aug 24, 2009 at 09:51:03PM +0100, Daniel P. Berrange wrote:
> > The following series of patches introduces support for generic data
> > streams in the libvirt API, the remote protocol, client & daemon.
> > 
> > The public API takes the form of a new object virStreamPtr and
> > methods to read/write/close it.
> > 
> > The remote protocol was the main hard bit. Since the protocol
> > allows multiple concurrent API calls on a single connection,
> > this needed to also allow concurrent data streams. It is also
> > not acceptable for a large data stream to block other traffic
> > while it is transferring.
> > 
> > Thus, we introduce a new protocol message type 'REMOTE_STREAM'
> > to handle transfer of the stream data. A method involving a
> > data stream starts off in the normal way with a REMOTE_CALL
> > to the server, and a REMOTE_REPLY response message. If this
> > was successful, there now follows the data stream traffic.
> > 
> > For outgoing streams (data from client to server), the client
> > will send zero or more REMOTE_STREAM packets containing the
> > data with status == REMOTE_CONTINUE. These are asynchronous
> > and not acknowledged by the server. At any time the server
> > may send an async message with a type of REMOTE_STREAM and
> > status of REMOTE_ERROR. This indicates to the client that the
> > transfer is aborting at server request. If the client wishes
> > to abort, it can send the server a REMOTE_STREAM+REMOTE_ERROR
> > message. If the client finishes its data transfer, it will
> > send a final REMOTE_STREAM+REMOTE_OK message, and the server
> > will respond with the same. This full roundtrip handshake
> > ensures any async error messages are guaranteed to be flushed.
> > 
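
The outgoing case described above can be traced with a small simulation.
This is only a sketch: the message and status names come from the
description, but the string values, the Message class, the
outgoing_stream() helper and the status carried by REMOTE_CALL are
illustrative assumptions, not the real wire protocol.

```python
from dataclasses import dataclass

# Names taken from the description; the values are placeholders (assumption),
# not the real wire encoding.
REMOTE_CALL, REMOTE_REPLY, REMOTE_STREAM = "CALL", "REPLY", "STREAM"
REMOTE_OK, REMOTE_ERROR, REMOTE_CONTINUE = "OK", "ERROR", "CONTINUE"


@dataclass(frozen=True)
class Message:
    """Hypothetical in-memory stand-in for one protocol message."""
    sender: str          # "client" or "server"
    type: str
    status: str
    payload: bytes = b""


def outgoing_stream(chunks, server_fails_after=None):
    """Return the message trace for a client-to-server stream.

    server_fails_after: index of the chunk before which the server
    aborts, or None for a clean transfer.
    """
    # Normal call/reply exchange that opens the stream
    # (REMOTE_OK on the call is an assumption).
    trace = [Message("client", REMOTE_CALL, REMOTE_OK),
             Message("server", REMOTE_REPLY, REMOTE_OK)]
    for i, chunk in enumerate(chunks):
        if server_fails_after is not None and i == server_fails_after:
            # The server may abort at any time with an async error.
            trace.append(Message("server", REMOTE_STREAM, REMOTE_ERROR))
            return trace
        # Data packets are asynchronous and never acknowledged.
        trace.append(Message("client", REMOTE_STREAM, REMOTE_CONTINUE, chunk))
    # Final roundtrip: client says OK, server answers with the same.
    trace.append(Message("client", REMOTE_STREAM, REMOTE_OK))
    trace.append(Message("server", REMOTE_STREAM, REMOTE_OK))
    return trace
```

Calling outgoing_stream([b"a", b"b"]) gives the call/reply pair, two
REMOTE_CONTINUE packets and the closing OK/OK exchange; passing
server_fails_after=1 ends the trace with the server's REMOTE_ERROR.
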
> > For incoming data streams (data from server to client), the
> > server sends zero or more REMOTE_STREAM packets containing the
> > data with status == REMOTE_CONTINUE. These are asynchronous
> > and not acknowledged by the client. At any time the client
> > may send an async message with a type of REMOTE_STREAM and
> > status of REMOTE_ERROR. This indicates to the server that the
> > transfer is aborting at client request. If the server wishes
> > to abort, it can send the client a REMOTE_STREAM+REMOTE_ERROR
> > message. When the server finishes its data transfer, it will
> > send a final REMOTE_STREAM+REMOTE_CONTINUE message with a
> > data length of zero (i.e. EOF). The client will then send a
> > REMOTE_STREAM+REMOTE_OK packet and the server will respond
> > with the same. This full roundtrip handshake ensures any async
> > error messages are guaranteed to be flushed.
> > 
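
For the incoming case, the client-side logic might look roughly like
the following. Again a sketch under assumptions: receive_stream() and
StreamAborted are hypothetical names, the status values are
placeholders, and server packets are modelled as plain (status, payload)
pairs rather than real protocol messages.

```python
# Status names from the description; values are placeholders (assumption).
REMOTE_OK, REMOTE_ERROR, REMOTE_CONTINUE = "OK", "ERROR", "CONTINUE"


class StreamAborted(Exception):
    """Hypothetical error raised when the transfer does not end cleanly."""


def receive_stream(server_packets):
    """Consume (status, payload) pairs sent by the server.

    Returns (data, client_replies), where client_replies lists the
    statuses the client would send back on the stream.
    """
    received = bytearray()
    for status, payload in server_packets:
        if status == REMOTE_ERROR:
            # The server aborted the transfer.
            raise StreamAborted("server aborted the transfer")
        if status != REMOTE_CONTINUE:
            raise StreamAborted("unexpected status %r" % (status,))
        if len(payload) == 0:
            # Zero-length CONTINUE packet marks EOF; the client now
            # sends OK and waits for the server to answer in kind.
            return bytes(received), [REMOTE_OK]
        # Ordinary data packet: no acknowledgement is sent.
        received += payload
    raise StreamAborted("packets ended before the EOF marker")
```

Feeding it two data packets followed by an empty REMOTE_CONTINUE
returns the concatenated data plus the single REMOTE_OK the client
owes the server.
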
> > This all ensures that multiple data streams can be active in
> > parallel, and with a maximum data packet size of 256 KB, no
> > single stream can cause too much latency on the connection for
> > other API calls/streams.
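
To illustrate why a bounded packet size keeps latency down, here is a
sketch that splits each stream into packets of at most 256 KB and
interleaves them. The round-robin ordering and the chunk() and
interleave() helpers are assumptions made for the example; the
description above does not say how packets from different streams are
scheduled.

```python
from itertools import zip_longest

MAX_PACKET = 256 * 1024  # maximum data packet size quoted above


def chunk(data, size=MAX_PACKET):
    """Split data into packets of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def interleave(streams):
    """Yield (stream_id, packet) pairs, one packet per stream per round.

    streams: dict mapping a stream id to the bytes it wants to send.
    Round-robin is an illustrative policy, not the documented one.
    """
    queues = [[(sid, pkt) for pkt in chunk(data)]
              for sid, data in streams.items()]
    for round_ in zip_longest(*queues):
        for item in round_:
            if item is not None:  # this stream has already finished
                yield item
```

With one 600 KB stream and one 10-byte stream, the small stream's only
packet goes out right after the first 256 KB packet of the large one,
instead of waiting for the whole 600 KB.
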
> 
>   Okay, this is very similar in principle to HTTP pipelining,
> with IMHO the same benefits and the same potential drawbacks.
> A couple of things to check might be:
>    - the maximum number of concurrent active streams allowed,
>      for example suppose you want to migrate in an emergency
>      all the domains out of a failing machine, some level of
>      serialization may be better than say attempting to migrate
>      all 100 domains at the same time. 10 parallel streams might
>      be better, but we need to make sure the API allows reporting
>      such a condition.
>    - the maximum chunking size, but with 256k I think this is
>      covered.
>    - synchronization internally between threads to avoid deadlocks
>      or poor performance, which can be very hard to debug, so I
>      guess an effort should be made to explain how things are
>      designed internally.
> 
>   But this sounds fine in general.
> 
> > The only thing it does not allow for is one API method to use
> > two or more streams. These may be famous last words, but I
> > don't think that use case will be necessary for any of our
> > APIs...
> 
>   As long as the limitation is documented, especially in the parts
> of the code where the assumption is made, sounds fine.
> 
> > The last 5 patches with a subject of [DEMO] are *NOT* intended
> > to be committed to the repository. They merely demonstrate the
> > use of data streams for a couple of hypothetical file upload
> > and download APIs. Actually they were mostly to allow me to
> > test the streams code without messing around with the QEMU
> > migration code.
> > 
> > The immediate use case for this data stream code is Chris' QEMU
> > migration patchset.
> > 
> > The next use case is to allow serial console access to be tunnelled
> > over libvirtd, e.g. to make 'virsh console GUEST' work remotely.
> > This use case is why I included the support for non-blocking data
> > streams and event loop integration (not required for Chris'
> > migration use case).
> 
>   Okay, next on to the individual patch reviews,
> 
> Daniel
> 
> -- 
> Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
> daniel at veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
> http://veillard.com/ | virtualization library  http://libvirt.org/

-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|