On Wed, Aug 07, 2019 at 12:39:09 +0200, Tomáš Golembiovský wrote: > On Thu, Aug 01, 2019 at 08:37:03AM -0500, Jonathon Jongsma wrote: > > This series adds several bits of guest information provided by a new API > > function virDomainGetGuestInfo(). There is an implementation for qemu using the > > guest agent. In particular, it adds information about logged-in users, guest > > OS, and timezone. > > > > I had previously submitted a patch series with a dedicated API function for > > querying guest users that returned an array of structures describing the users. > > It was suggested that I convert his to a more futureproof design using typed > > parameters and also combine it with other bits of guest information similar to > > virDomainListGetStats(). In fact, there were several bugs that requested this > > information be added to the 'bulk stats' API, but Peter Krempa suggested adding > > a new API for all data queried from the guest agent (see > > https://bugzilla.redhat.com/show_bug.cgi?id=1705514). This is that API > > proposal. > > The reason we suggested 'bulk stats' approach is so that we can retrieve > information for all VMs in single call. This is not just about laziness > on side of management app, it is much easier for libvirt. We can either I don't want to crush your dreams, but for the VM bulk stats API theres nothing fancy that libvirt does here. We just fetch a list of all VMs and then sequentially in a loop fill in all the stats. > fetch the info for all VMs one-by-one (which can take too long), or > resort to massive threading. On the other hand it seems that libvirt Actually in some cases this might bring better results in cases when one VM is not responding. > with its async jobs can handle this in single thread. I am not libvirt Yes it's a single thread and a loop. > expert so please correct me if I am making wrong assumptions here. Also, > libvirt has pretty good knowledge whether the guest agent is running or > not. From application side we cannot get this information reliably and > we need to resort to trial-error approach. Well, we know whether the guest agent channel socket is connected from guest's side, but we also expose this information in the XML file and there's also an event which is fired always when the state of the guest agent changes. This is possible via notifications from qemu via the virtio channel. > > It follows the stats API quite closely, and tries to return data in > > similar ways (for example, the "users.N.*" field name scheme was inspired by > > various stats fields). > > > > I plan to follow this series up with a patch that adds fsinfo to the returned > > information, but wanted to get comments on this approach now. > > Apart from the above I have two other concerns. > > With how the API call is designed you cannot pick which commands to run > (you always get them all). With the number of included commands growing > in the future the time to complete the call will grow as well and could > just take too long. Also, if you run the call periodically you always > don't want to run all the commands. For example, you don't need to check > the os-info as often as list of logged in users. This was clarified in a different reply. > You cannot set the timeout on the guest agent commands. Instead you > block on them. As described in bug , rogue guest can potentially > block your call indefinitely. If you plan to address this problem in a > more general manner later that's probably fine too. Libvirt's guest-agent APIs actually have some built-in timeout so that a rouge agent can't block forever, but this timeout is not really configurable. A good point would be probably to have an argument for the API to be able to use an arbitrary timeout for the calls. In general I'm not really sure that there's benefit in having the API return the stats for all guest OSes rather than just one. In my design for the VM stats API I think I went a bit too far in this case and the benefits of returning stats for all VMs compared to returning all data for a single VM are worth that. What was worth in that implementation was to be able to query multiple types of stats at once. I'm willing to discuss this further though.
Description: PGP signature