[libvirt] [PATCH 01/11] Block SIGPIPE around virExec hook functions
Daniel P. Berrange
berrange at redhat.com
Fri Jan 28 16:45:14 UTC 2011
On Fri, Jan 28, 2011 at 09:32:42AM -0700, Eric Blake wrote:
> On 01/24/2011 08:13 AM, Daniel P. Berrange wrote:
> > Some functionality run in virExec hooks may do I/O which
> > can trigger SIGPIPE. Renable SIGPIPE blocking around the
> > hook function
>
> > * src/util/util.c: Block SIGPIPE around hooks
> > - if (hook)
> > + if (hook) {
> > + /* virFork reset all signal handlers to the defaults.
> > + * This is good for the child process, but our hook
> > + * risks running something that generates SIGPIPE,
> > + * so we need to temporarily block that again
> > + */
> > + struct sigaction waxon, waxoff;
>
> Cute.
>
> > + waxoff.sa_handler = SIG_IGN;
> > + waxoff.sa_flags = 0;
> > + memset(&waxon, 0, sizeof(waxon));
> > + if (sigaction(SIGPIPE, &waxoff, &waxon) < 0) {
> > + virReportSystemError(errno, "%s",
> > + _("Could not disable SIGPIPE"));
>
> Yikes. We have a potential deadlock problem. See this bug report
> against GNU sort:
>
> http://lists.gnu.org/archive/html/coreutils/2011-01/msg00085.html
>
> In sort, any program that mixes pthread_create with fork (and libvirt
> falls into that category) must obey this section of POSIX:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html
>
> "If a multi-threaded process calls fork(), the new process shall contain
> a replica of the calling thread and its entire address space, possibly
> including the states of mutexes and other resources. Consequently, to
> avoid errors, the child process may only execute async-signal-safe
> operations until such time as one of the exec functions is called."
>
> malloc() (which is called by virReportSystemError(), as well as by _())
> is NOT async-signal-safe; therefore, it is quite possible that we fork()
> in one thread while another thread is in the middle of holding the
> malloc() mutex, and the child process will deadlock because it no longer
> has a secondary thread available to release the malloc() mutex.
>
> Ultimately, we need to refactor and audit the code so that only
> async-signal-safe functions are allowed between fork() and exec(); which
> means that virExec needs to be taught how to hand all errors back to the
> parent over a secondary pipe for the parent to issue (rather than the
> child attempting to issue any errors on its own).
>
> However, that problem is pre-existing; so your patch, while adding
> another instance of a violation, is not adding a regression, so:
Hmm, we have had this style of problem before, but with libvirt
internal APIs. eg, this is why we do virLogLock() and Unlock
across the fork() call.
I didn't occur to me that we could get hit at the POSIX level
with this. This is a collosal PITA because we do quite alot
of work inbetween fork+exec() in QEMU, including calling out
to library APIs we don't control to the extent that I doubt
we can practically audit it, or easily address it :-(
Daniel
More information about the libvir-list
mailing list