[libvirt] [PATCH] util: do a better job of matching up pids with their binaries

Kyle Mestery (kmestery) kmestery at cisco.com
Tue Oct 30 13:13:21 UTC 2012


On Oct 29, 2012, at 5:50 PM, Laine Stump <laine at laine.org> wrote:
> This patch resolves: https://bugzilla.redhat.com/show_bug.cgi?id=871201
> 
> If libvirt is restarted after updating the dnsmasq or radvd packages,
> a subsequent "virsh net-destroy" will fail to kill the dnsmasq/radvd
> process.
> 
> The problem is that when libvirtd restarts, it re-reads the dnsmasq
> and radvd pidfiles, then does a sanity check on each pid it finds,
> including checking that the symbolic link in /proc/$pid/exe actually
> points to the same file as the path used by libvirt to execute the
> binary in the first place. If this fails, libvirt assumes that the
> process is no longer alive.
> 
> But if the original binary has been replaced, the link in /proc is set
> to "$binarypath (deleted)" (it literally has the string " (deleted)"
> appended to the link text stored in the filesystem), so even if a new
> binary exists in the same location, attempts to resolve the link will
> fail.
> 
> In the end, not only is the old dnsmasq/radvd not terminated when the
> network is stopped, but a new dnsmasq can't be started when the
> network is later restarted (because the original process is still
> listening on the ports that the new process wants).
> 
> The solution is, when the initial "use stat to check for identical
> inodes" check for identity between /proc/$pid/exe and $binpath fails,
> to check /proc/$pid/exe for a link ending with " (deleted)" and if so,
> truncate that part of the link and compare what's left with the
> original binarypath.
> 
> A twist to this problem is that on systems with "merged" /sbin and
> /usr/sbin (i.e. /sbin is really just a symlink to /usr/sbin; Fedora
> 17+ is an example of this), libvirt may have started the process using
> one path, but /proc/$pid/exe lists a different path (indeed, on F17
> this is the case - libvirtd uses /sbin/dnsmasq, but /proc/$pid/exe
> shows "/usr/sbin/dnsmasq"). The further bit of code to resolve this is
> to call virFileResolveAllLinks() on both the original binarypath and
> on the truncated link we read from /proc/$pid/exe, and compare the
> results.
> 
> The resulting code still succeeds in all the same cases it did before,
> but also succeeds if the binary was deleted or replaced after it was
> started.


ACK.

Thanks,
Kyle




More information about the libvir-list mailing list