
[libvirt] Re: [Qemu-devel] [PATCH 1/6] Allow multiple monitor devices (v2)



Anthony Liguori wrote:
Avi Kivity wrote:
Suppose you have a command which changes the meaning of a notification. If a notification arrives before the command completion, then it happened before the command was executed.

If you want to make that reliable, you cannot have multiple monitors.

Right.

Since you can mask notifications, there can be an arbitrarily long time between notification and the event happening. Socket buffering presents the same problem. Imagine:

Monitor 1:
time 0: (qemu) hotadd_cpu 2
time 1: (qemu) hello world <no new line>
time 5: <new line>
time 6: notification: cpu 2 added
time 6: (qemu)

Monitor 2:
time 3: (qemu) hotremove_cpu 2
time 4: (qemu)
time 5: notification: cpu 2 removed
time 6: (qemu)

So to eliminate this, you have to ban multiple monitors.
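
To make the timeline above concrete, here is a minimal sketch of a management tool that merges notifications from both monitors in the order they are delivered. The delivery times come from the timeline; the tuple layout and the helper names (`believed_present`, `actual_present`) are assumptions made for illustration and are not part of any QEMU interface.

```python
# Notifications as (delivery_time, monitor, text), taken from the timeline above.
# The add notification is held back until time 6 because monitor 1 is in the
# middle of a line of input; the remove notification is delivered at time 5.
deliveries = [
    (6, "monitor1", "cpu 2 added"),
    (5, "monitor2", "cpu 2 removed"),
]

# Order in which the operations were actually carried out
# (hotadd issued at time 0, hotremove issued at time 3).
executed = ["cpu 2 added", "cpu 2 removed"]


def believed_present(deliveries):
    """State a tool infers if it applies notifications in delivery order."""
    present = None
    for _, _, text in sorted(deliveries):
        present = text.endswith("added")
    return present


def actual_present(executed):
    """State after applying the operations in the order they really ran."""
    present = None
    for text in executed:
        present = text.endswith("added")
    return present


print("tool believes cpu 2 present:", believed_present(deliveries))
print("cpu 2 actually present:", actual_present(executed))
```

The tool ends up believing cpu 2 is present, because the stale "added" notification is the last one it sees.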

Well, not ban multiple monitors, but require that for non-racy operation commands and notifications be on the same session.

We can still debug on our dev-only monitor.

Fine, let's say we did that, it's *still* racy because at time 3, the guest may hot remove cpu 2 on its own since the guest's VCPUs get to run in parallel to the monitor.

A guest can't hotremove a vcpu. It may offline a vcpu, but that's not the same.

Obviously, if both the guest and the management application can initiate the same action, then there will be races. But I don't think that's how things should be -- the guest should request a vcpu to be removed (or added), management thinks and files forms in triplicate, then hotadds or hotremoves the vcpu (most likely after it is no longer needed).

With the proper bureaucracy, there is no race.


And even if you somehow eliminate the issue around masking notifications, you still have socket buffering that introduces the same problem.

If you have one monitor, the problem is much simpler, since events travelling in the same direction (command acknowledge and a notification) cannot be reordered. With a command+wait, the problem is inherent.
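
A rough sketch of the single-session rule: because acknowledgements and notifications travel over the same ordered stream, a client can classify each notification as having happened before or after a command simply by its position relative to that command's acknowledgement. The message strings (`"ack: ..."`, `"notification: ..."`) and the function name `split_around_ack` are hypothetical and are not the monitor's actual wire format.

```python
def split_around_ack(stream, ack):
    """Split notifications on one ordered session around a command ack.

    stream: messages in arrival order on a single session.
    ack:    the acknowledgement message for the command of interest.
    Returns (before, after): notifications that arrived before the ack
    (so the events happened before the command executed) and those that
    arrived after it.
    """
    idx = stream.index(ack)
    before = [m for m in stream[:idx] if m.startswith("notification:")]
    after = [m for m in stream[idx + 1:] if m.startswith("notification:")]
    return before, after


# Hypothetical session contents, in arrival order.
session = [
    "notification: cpu 2 added",
    "ack: hotremove_cpu 2",
    "notification: cpu 2 removed",
]

before, after = split_around_ack(session, "ack: hotremove_cpu 2")
print("happened before the command:", before)
print("happened after the command:", after)
```

This only works when commands and notifications share one session; with a second monitor there is no common stream to take positions from.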


The best you can do is stick a time stamp on a notification and make sure the management tool understands that the notification is reflective of the state when the event happened, not of the current state.

Timestamps are really bad. They don't work at all if the management application is not on the same host. They work badly if it is on the same host, since commands and events will be timestamped by different processes.
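
As an illustration of the cross-host case, the sketch below stamps a notification and a command with two clocks that disagree. All numbers, including the 5-second offset, are invented for the example, and `stamp` is a hypothetical helper.

```python
def stamp(true_time, clock_offset):
    """Timestamp as reported by a clock that is off by clock_offset seconds."""
    return true_time + clock_offset


# The notification is generated first (true time 10) on the qemu host,
# whose clock we take as the reference.
notification_ts = stamp(10.0, 0.0)

# The command is issued later (true time 12) from a management host
# whose clock runs 5 seconds behind.
command_ts = stamp(12.0, -5.0)

# Comparing timestamps suggests the command came first, which is wrong.
command_looks_earlier = command_ts < notification_ts
print("command timestamp:", command_ts)
print("notification timestamp:", notification_ts)
print("command appears to precede notification:", command_looks_earlier)
```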

FWIW, this problem is not at all unique to QEMU and is generally true of most protocols that support an out-of-band notification mechanism.


command+wait makes it worse.  Let's stick with established practice.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

