
Re: [libvirt] PATCH: Fix multiple bugs in RPC handling



"Daniel P. Berrange" <berrange redhat com> wrote:

> A number of bugs conspired together to cause some nasty problems when
> a QEMU vm failed to start
>
>  - vm->monitor was not initialized to -1, so when a VM failed to start
>    the vm->monitor was just '0', and thus we closed FD 0 (libvirtd's stdin)
>
>  - The next client to connect got FD 0 as its socket
>
>  - The first bug struck again, causing the client to be closed even
>    though libvirt thought it was still open
>
>  - libvirtd now polled on FD=0, which gave back POLLNVAL because it was
>    closed
>
>  - event.c was not looking for POLLNVAL so it spun at 100% cpu when this
>    happened, instead of invoking the callback with an error code
>
>  - virsh was not cleaning up the priv->waitDispatch call upon I/O errors,
>    so virsh then hung when doing virConnectClose
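
To illustrate the first item in that list, here is a minimal sketch of the
"-1 means no descriptor" convention. The struct and function names are
hypothetical stand-ins, not the actual libvirt code; the point is only that
0 is a valid descriptor and so cannot serve as the "unset" value.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the VM object; only the monitor fd matters. */
struct demo_vm {
    int monitor;
};

/* Mark the monitor as "not open". Zero-filled memory would leave it at 0,
 * which is a real descriptor (stdin). */
static void demo_vm_init(struct demo_vm *vm)
{
    assert(vm != NULL);
    vm->monitor = -1;
}

/* Close the monitor only if one is open, then reset the sentinel so that a
 * second cleanup pass cannot close an unrelated descriptor.
 * Returns 1 if a descriptor was closed, 0 if there was nothing to close. */
static int demo_vm_close_monitor(struct demo_vm *vm)
{
    assert(vm != NULL);
    if (vm->monitor < 0)
        return 0;
    close(vm->monitor);
    vm->monitor = -1;
    return 1;
}
```

With this shape, a failed startup path that calls the close helper on a
freshly initialized object is a no-op instead of closing FD 0.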

It could also segfault, and that was easy for me to trigger:
it happened on every third client call.  For reference, here's what I did:

LIBVIRT_DEBUG=1 qemud/libvirtd > log 2>&1 &
cat <<\EOF > e.xml
<domain type='qemu'>
  <name>E</name>
  <uuid>d7a5fdbd-cdaf-9455-926a-d65c16db1809</uuid>
  <memory>219200</memory>
  <currentMemory>219200</currentMemory>
  <vcpu>2</vcpu>
  <os>
    <type arch='i686' machine='pc'>hvm</type>
    <boot dev='cdrom'/>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='cdrom'>
      <source file='NO_SUCH_FILE'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
    </disk>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
  </devices>
</domain>
EOF

  $ src/virsh create e.xml
  libvir: Remote error : no call waiting for reply with serial 3
  error: failed to connect to the hypervisor
  [Exit 1]
  $ src/virsh create e.xml
  libvir: Remote error : no call waiting for reply with serial 0
  error: failed to connect to the hypervisor
  [Exit 1]
  $ src/virsh create e.xml
  libvir: Remote error : server closed connection
  error: Failed to create domain from e.xml

  zsh: segmentation fault  src/virsh create e.xml

FYI, that was due to this code

    remote_internal.c:6319, while (tmp && tmp->next)

where "tmp" is bogus because priv->waitDispatch was freed.
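
A sketch of the general remedy for this kind of crash, assuming a singly
linked queue of pending calls: unlink a call from the queue before freeing
it, so the head pointer never refers to freed memory. The demo_* names are
hypothetical and this is not the code from remote_internal.c; only the
"while (tmp && tmp->next)" walk mirrors the line quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pending-call record, queued while waiting for a reply. */
struct demo_call {
    struct demo_call *next;
    int serial;
};

/* Append a call at the tail of the queue rooted at *head. */
static void demo_queue_append(struct demo_call **head, struct demo_call *call)
{
    struct demo_call *tmp = *head;

    assert(head != NULL && call != NULL);
    call->next = NULL;
    if (!tmp) {
        *head = call;
        return;
    }
    while (tmp && tmp->next)   /* safe only if every node is still live */
        tmp = tmp->next;
    tmp->next = call;
}

/* Unlink a call from the queue. Do this on every exit path, including I/O
 * errors, before freeing the call. Returns 1 if the call was queued. */
static int demo_queue_remove(struct demo_call **head, struct demo_call *call)
{
    struct demo_call **link = head;

    assert(head != NULL && call != NULL);
    while (*link) {
        if (*link == call) {
            *link = call->next;
            call->next = NULL;
            return 1;
        }
        link = &(*link)->next;
    }
    return 0;
}
```

If the error path frees the call without the remove step, the next append
walks a dangling head pointer, which matches the crash seen here.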

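Note that this was probably easier for me to trigger than for most,
since I have this in my environment (it makes glibc overwrite freed
memory with a nonzero byte, so stale pointers fail quickly):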

  export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))

> This patch does 3 things
>
>  - Treat POLLNVAL as VIR_EVENT_HANDLE_ERROR, so the callback gets
>    to see the error & de-registers the client from the event loop
>  - Add the missing initialization of vm->monitor
>  - Fix remote_internal.c handling of I/O errors
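
For the first item, the idea can be sketched as a translation from poll()
revents bits to event-loop flags in which POLLNVAL is reported as an error
rather than dropped. The DEMO_EVENT_* flags, their values, and the demo_*
functions are assumptions made for illustration; they are not the libvirt
definitions.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical event flags; the values are arbitrary. */
enum {
    DEMO_EVENT_READABLE = 1 << 0,
    DEMO_EVENT_WRITABLE = 1 << 1,
    DEMO_EVENT_ERROR    = 1 << 2,
    DEMO_EVENT_HANGUP   = 1 << 3
};

typedef void (*demo_handle_cb)(int fd, int events, void *opaque);

/* Translate poll() result bits. POLLNVAL (descriptor not open) maps to the
 * error flag, so a closed fd produces a callback instead of a busy loop. */
static int demo_poll_to_events(short revents)
{
    int events = 0;

    if (revents & POLLIN)
        events |= DEMO_EVENT_READABLE;
    if (revents & POLLOUT)
        events |= DEMO_EVENT_WRITABLE;
    if (revents & (POLLERR | POLLNVAL))
        events |= DEMO_EVENT_ERROR;
    if (revents & POLLHUP)
        events |= DEMO_EVENT_HANGUP;
    return events;
}

/* Invoke the callback whenever poll() reported anything for this fd. */
static void demo_dispatch(const struct pollfd *pfd, demo_handle_cb cb,
                          void *opaque)
{
    assert(pfd != NULL && cb != NULL);
    if (pfd->revents)
        cb(pfd->fd, demo_poll_to_events(pfd->revents), opaque);
}

/* Sample callback: records the events it was given in *opaque. */
static void demo_record_cb(int fd, int events, void *opaque)
{
    (void)fd;
    *(int *)opaque = events;
}
```

A callback that sees the error flag can then de-register the descriptor,
which is what stops the loop from polling the dead fd again.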

ACK.

