[libvirt] [PATCH] qemu: fix msg could be a wild pointer in qemuMonitorIOProcess()

zhuweilun zhuweilun at huawei.com
Wed Jun 6 03:46:07 UTC 2018



在 2018/6/5 15:10, Peter Krempa 写道:
> On Tue, Jun 05, 2018 at 10:14:39 +0800, Shannon Zhao wrote:
>> From: Weilun Zhu <zhuweilun at huawei.com>
>>
>> As qemuMonitorJSONIOProcess() will unlock the qemu monitor, there is
>> some extreme situation, eg qemu send message to monitor twice in a short
>> time, where the local viriable 'msg' of qemuMonitorIOProcess() could be
>> a wild point:
>>
>> 1. qemuMonitorSend() assign mon->msg to parameter 'msg', which is alse a
>> local variable of its caller qemuMonitorJSONCommandWithFd(), cause
>> eventloop to send message to monitor, then wait condition.
>> 2. qemu send message to monitor for the first time immediately.
>> 3. qemuMonitorIOProcess() is called, then wake up the qemuMonitorSend()
>> thread, but the qemuMonitorSend() thread stuck for a while, which means
> 
> If you wake up a tread via signalling a condition it _must_ have the
> mutex locked prior to executing ....
> 
> Note that after waking up from virCondWait you have the mutex which was
> passed to it locked.
> 

Yes, it _must_ have the mutex locked prior to executing, but there is still
a chance that the qemuMonitorIO() hold the mutex faster.

I mean virCondWait wants to wake up after qemuMonitorIOProcess() broadcast,
but it stuck for a while as cpu pressure or scheduleed out or some other
reasons. And in such short time, before virCondWait try to hold the mutex,
qemu has sent message again, qemuMonitorIO() has been called again,
and hold the mutex successfully. So virCondWait will still wait the mutex
even qemuMonitorIOProcess() has broadcast.

>> the qemu monitor is still unlocked.
>> 4. qemu send message to monitor for the second time, such as RTC_CHANGE
>> event
>> 5. qemuMonitorIOProcess() is called, the local viriable 'msg' is
>> assigned to mon->msg.
>> 6. qemuMonitorIOProcess() call qemuMonitorJSONIOProcess() to deal with
>> the message
>> 7. qemuMonitorJSONIOProcess() unlock the qemu monitor, qemuMonitorSend()
>> thread get the lock and free the mon->msg, assign mon->msg to NULL.
> 
> The monitor is unlocked in qemuMonitorIO() after finishing processing
> from the event loop. There is no point where qemuMonitorIOProcess()
> would not hold the mutex locked.
> 

qemuMonitorJSONIOProcess() is called by qemuMonitorIOProcess() to deal with
the message, qemuMonitorJSONIOProcess()->qemuMonitorJSONIOProcessLine()->
qemuMonitorJSONIOProcessEvent->qemuMonitorEmitEvent()->QEMU_MONITOR_CALLBACK

#define QEMU_MONITOR_CALLBACK(mon, ret, callback, ...)          \
    do {                                                        \
        virObjectRef(mon);                                      \
        virObjectUnlock(mon);                                   \
        if ((mon)->cb && (mon)->cb->callback)                   \
            (ret) = (mon)->cb->callback(mon, __VA_ARGS__,       \
                                        (mon)->callbackOpaque); \
        virObjectLock(mon);                                     \
        virObjectUnref(mon);                                    \
    } while (0)

the macro QEMU_MONITOR_CALLBACK will do virObjectUnlock(mon). So virCondWait
can hold the mutex finially, then mon->msg assgined to NULL and be freed,
while qemuMonitorJSONIOProcess() is wating the mutex after its callback.

> The event loop or the thread waiting for the message to be processed can
> start executing only after unlocking the mutex.
> 

Sure

>> so the local viriable 'msg' of qemuMonitorIOProcess() is a wild pointer
>> now.
>>
>> AFAIK, it is not harmful to call again virCondBroadcast() while msg is a
>> wild pointer, but just in case,  we fix it in this patch.
>> ---
>>  src/qemu/qemu_monitor.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
>> index 43f1d2f..464f200 100644
>> --- a/src/qemu/qemu_monitor.c
>> +++ b/src/qemu/qemu_monitor.c
>> @@ -454,7 +454,7 @@ qemuMonitorIOProcess(qemuMonitorPtr mon)
>>  #if DEBUG_IO
>>      VIR_DEBUG("Process done %d used %d", (int)mon->bufferOffset, len);
>>  #endif
>> -    if (msg && msg->finished)
>> +    if (msg && msg == mon->msg && msg->finished)
> 
> qemuMonitorIOProcess is executed when the monitor is locked so this is
> impossible to happen. If the mon->msg object would be overwritten at
> this point in any way, this certainly is not the correct fix as
> something overwrote the pointer while the lock was held.
> 

I struct like following:

@@ -1036,7 +1036,7 @@ qemuMonitorNextCommandID(qemuMonitorPtr mon)
     return id;
 }

-
+static int debugon = 0; // set to 1 in gdb
 int
 qemuMonitorSend(qemuMonitorPtr mon,
                 qemuMonitorMessagePtr msg)
@@ -1064,6 +1064,11 @@ qemuMonitorSend(qemuMonitorPtr mon,
                            _("Unable to wait on monitor condition"));
             goto cleanup;
         }
+        if (debugon) {
+            virObjectUnlock(mon);
+            sleep(5); // give a chance to qemu send msssage again
+            virObjectLock(mon);
+        }
     }

     if (mon->lastError.code != VIR_ERR_OK) {

then gdb attach to libvirtd, break qemuMonitorSend and qemuMonitorIOProcess,
virsh shutdown $VM, set debugon = 1, continue to qemuMonitorIOProcess for the
first time, then continue to wait qemu to send SHUTDOWN event in the 5 second,
after msg assgined to mon->msg in qemuMonitorIOProcess, wait the 5 second
finished. Then continue to the "if (msg && msg->finished)", the msg is a wild
pointer now while mon->msg is NULL

>>          virCondBroadcast(&mon->notify);
>>      return len;
>>  }
>> -- 
>> 1.8.3.1
>>
>>
>> --
>> libvir-list mailing list
>> libvir-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list




More information about the libvir-list mailing list