[libvirt] [PATCH] blockjob: fix block-stream bandwidth race

Eric Blake eblake at redhat.com
Thu Apr 26 15:29:14 UTC 2012


On 04/26/2012 09:00 AM, Jiri Denemark wrote:
> On Wed, Apr 25, 2012 at 17:04:25 -0600, Eric Blake wrote:
>> With RHEL 6.2, virDomainBlockPull(dom, dev, bandwidth, 0) has a race
>> with non-zero bandwidth: there is a window between the block_stream
>> and block_job_set_speed monitor commands where an unlimited amount
>> of data was let through, defeating the point of a throttle.
>>
>> This race was first identified in commit a9d3495e, and libvirt was
>> able to reduce the size of the window for that race.  In the meantime,
>> the qemu developers decided to fix things properly; per this message:
>> https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03793.html
>> the fix will be in qemu 1.1, and changes block-job-set-speed to use
>> a different parameter name, as well as adding a new optional parameter
>> to block-stream, which eliminates the race altogether.
>>
>> Since our documentation already mentioned that we can refuse a non-zero
>> bandwidth for some hypervisors, I think the best solution is to do
>> just that for RHEL 6.2 qemu, so that the race is obvious to the user.
>> Meanwhile the code must be fixed to honor actual qemu 1.1 naming.
> 
> OK, this seems reasonable, but I'm concerned about backward
> compatibility. What if someone is happily using this now with a non-zero
> bandwidth even though it's racy? With this change, such usage will fail.

Since there is no upstream qemu release with the race present, the
_only_ time you can get a user affected by this change in semantics is
if you are using qemu on RHEL 6.2; furthermore, that version of qemu
only supported block pull for the QED file format, which is not very
popular yet.  And given that only RHEL 6.2 qemu is affected, anyone that
uses the RHEL 6.2 libvirt will not have any regression so long as this
patch is not backported to that version of libvirt.  I think it is a
small enough set of people to be worth the tightened semantics.

> 
> ...
>> diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
>> index f3cdcdd..dc02964 100644
>> --- a/src/qemu/qemu_monitor.h
>> +++ b/src/qemu/qemu_monitor.h
>> @@ -530,8 +530,7 @@ typedef enum {
>>      BLOCK_JOB_ABORT = 0,
>>      BLOCK_JOB_INFO = 1,
>>      BLOCK_JOB_SPEED = 2,
>> -    BLOCK_JOB_SPEED_INTERNAL = 3,
>> -    BLOCK_JOB_PULL = 4,
>> +    BLOCK_JOB_PULL = 3,
>>  } BLOCK_JOB_CMD;
> ...
> 
> Why do we use explicit numbering when this is only used internally?

No particular reason; I can trim that when pushing, if you'd like.

-- 
Eric Blake   eblake at redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 620 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20120426/6072c746/attachment-0001.sig>


More information about the libvir-list mailing list