[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH v1 1/3] adding unplug_timeout QEMU conf





On 8/30/19 8:15 AM, Christophe de Dinechin wrote:
Daniel P. Berrangé writes:

The reason for this timeout is that we originally promised something
that we cannot deliver - a synchronous device detach API, while the
operation itself is asynchronous. I'm not a fan of exposing it and
making it configurable.
I'm especially *not* a fan because the commit messages says this is
a problem on certain architectures. Since we know what those arches
are, we should use a larger timeout for those arches out of the box.
Requiring admin to set a config param to fix the architectures is
super unpleasant out of the box experiance.
True, but also notice that 5 seconds is also already close to the
attention span time limit for users [1]. So increasing it to 10s might
bring people to believe things are stuck, unless you provide some sort
of feedback that this is normal.

https://www.nngroup.com/articles/response-times-3-important-limits/

Interesting link, thanks.

About the user feedback due to long response delay: we're already
breaking this with the setvcpus command, at least with PowerPC
guests and a lot of vcpus being unplugged. Here's an example in
which I am able to complete the command without kicking the
timeout error (guest is idle, vcpu unplug is fast in this case):

--- guest booted with 1 vcpu - added 39 extra vcpus. Operation takes
a second, it's fast  ---

[danielhb kop5 libvirt]$ sudo ./run tools/virsh setvcpus vcpus-test 40 --live

[danielhb kop5 libvirt]$
[danielhb kop5 libvirt]$

--- removing them back ----

[danielhb kop5 libvirt]$ time sudo ./run tools/virsh setvcpus vcpus-test 1 --live


real    0m21.695s
user    0m0.119s
sys    0m0.000s
[danielhb kop5 libvirt]$

This happens in PowerPC because the timeout is being considered not for the
whole operation, but per device. Since I'm unplugging 39 devices and the
5 seconds timeout is refreshed for every operation, in theory the user
can wait close to 39*5  seconds with the terminal frozen.

Now, if we are to adhere to such UX standards (IMO, we should), I propose
the following:

- short term: increase PowerPC timeout to 10 seconds per device. Following
the UX guideline above, this is the limit we can go without warning the
user about the delay;

- short term: for PowerPC guests, tune 'setvcpus' message to warn the user
that the operation can take some time to complete;

These 2 are simples changes and I can get it done for the next release
without too much trouble.

- mid/long term: I can look into the PowerPC guest implementation, see if there
are device_del events being fired up in QMP and implement a better UX with
more information about how the process is going. Something like "vcpu 1 out
of 30 unplugged", "vcpu 2 out of 30 unplugged", or a progress bar, or
whatever makes more sense to give the user a feeling of operation ongoing.


Note that I'm suggesting PowerPC only changes due to what Daniel said
earlier - we can't impact other users due to something that, at first glance,
only PowerPC does different. I have a hunch that we should do for all
archs, but I can't defend this claim without testing this in x86 at least.
These short-term changes are easy to make it across the board though,so it's
just a matter of removing "if PowerPC" in these changes.


What do you think?



Thanks,



DHB




Regards,
Daniel
--
|:https://berrange.com       -o-https://www.flickr.com/photos/dberrange  :|
|:https://libvirt.org          -o-https://fstop138.berrange.com  :|
|:https://entangle-photo.org     -o-https://www.instagram.com/dberrange  :|
--
Cheers,
Christophe de Dinechin (IRC c3d)


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]