[libvirt] [PATCH v5 08/11] qemu: Start PR daemon on domain startup

Michal Privoznik mprivozn at redhat.com
Fri May 11 07:55:46 UTC 2018


On 04/29/2018 07:01 PM, John Ferlan wrote:
> 
> 
> On 04/23/2018 09:14 AM, Michal Privoznik wrote:
>> Before we exec() qemu we have to spawn pr-helper processes for
>> all managed reservations (well, technically there can only one).
>> The only caveat there is that we should place the process into
>> the same namespace and cgroup as qemu (so that it shares the same
>> view of the system). But we can do that only after we've forked.
>> That means calling the setup function between fork() and exec().
>>
>> Again, despite demand to not store anything in status XML I have
>> to take my chances and store 'priv->prDaemonRunning' (at least)
>> in status XML. Otherwise we might forget whether we started
>> pr-helper after libvirtd restart.
> 
> And again airing review grievances in commit messages is really not
> something that should be done...
> 
> BTW: This "logic" perhaps answers concerns I had in patches 6 and 7
> regarding how to determine if something is running already and whether
> it's necessary to add to NS and cgroup or in the case of stopping, then
> removing from the cgroup... whether removing from NS works or not, I
> have no idea.
> 
>>
>> What happens if we change default managed alias or default socket
>> is undefined for now, because we don't store them in status XML.
> 
> Do you honestly believe this will happen in the future? The alias is a
> somewhat dynamic relationship between libvirt and qemu for the purpose
> of describing an object so that some future libvirt can find that object
> using the same alias. If something in the future changes such that there
> could be multiple objects, then we know we can query a running qemu for
> the current alias in order to retrieve the object.

Can we? I'm not sure we can do that, IOW I don't know how to do that.

> The future would then
> need to generate aliases based upon some new naming scheme and we can
> handle it at that time. For now, there is a one-to-one relationship to
> worry about. Likewise, for the socket path, if something changes in the
> future which then has multiple socket paths, then those paths would need
> to have something unique included.
> 
> If we ourselves decide to change the name of the alias, then we have to
> handle fallout. If the pr-helper socket path changes either by name or
> path to it, then we still know the old path. Whomever makes a change to
> the path would be required to handle all those current consumers for a
> possibly running domain.

Which is going to be ugly. We would have to try and guess aliases
whereas if we save it in the status XML we know exactly the alias we
need. Look at it from different angle - step away from pr-manager object
and think of generic device. We generate alias for it. And save it in
the status XML. Why? We can just generate the alias whenever we need it.
Also, we know all the changes made to alias generating code so we can
try all older versions until we find the right one. Or we can query for
it (assuming answer to my question above is yes).

> 
> If you really believe this can/will happen, then lets go back to patch 4
> and revisit the statement :
> 
> "For managed PR helper the alias is always "pr-helper0" and socket path
> "${vm->priv->libDir}/pr-helper0.sock"."
> 
> 
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>>  src/qemu/qemu_domain.c  |  22 +++++
>>  src/qemu/qemu_domain.h  |   2 +
>>  src/qemu/qemu_process.c | 232 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 256 insertions(+)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 5ccdeb8e3a..fbe6a0f332 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -2040,6 +2040,15 @@ qemuDomainObjPrivateXMLFormatAllowReboot(virBufferPtr buf,
>>  }
>>  
>>  
>> +static void
>> +qemuDomainObjPrivateXMLFormatPR(virBufferPtr buf,
>> +                                qemuDomainObjPrivatePtr priv)
>> +{
>> +    if (priv->prDaemonRunning)
>> +        virBufferAddLit(buf, "<prDaemon running='1'/>\n");
>> +}
> 
> Why not follow the existing @fakeReboot? or chardevStdioLogd?
> 
> Is there belief that at some point in the future 'running' could be 0 or
> 1?  Or could this be a tristate?  If there is, then it shouldn't be a
> boolean in the .h file.

Okay.

> 
>> +
>> +
>>  static int
>>  qemuDomainObjPrivateXMLFormatJob(virBufferPtr buf,
>>                                   virDomainObjPtr vm,
>> @@ -2180,6 +2189,8 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf,
>>  
>>      qemuDomainObjPrivateXMLFormatAllowReboot(buf, priv->allowReboot);
>>  
>> +    qemuDomainObjPrivateXMLFormatPR(buf, priv);
>> +
>>      if (qemuDomainObjPrivateXMLFormatBlockjobs(buf, vm) < 0)
>>          return -1;
>>  
>> @@ -2323,6 +2334,15 @@ qemuDomainObjPrivateXMLParseAllowReboot(xmlXPathContextPtr ctxt,
>>  }
>>  
>>  
>> +static void
>> +qemuDomainObjPrivateXMLParsePR(xmlXPathContextPtr ctxt,
>> +                               bool *prDaemonRunning)
>> +{
>> +    *prDaemonRunning = virXPathBoolean("count(./prDaemon[@running = '1']) > 0",
>> +                                       ctxt) > 0;
>> +}
>> +
>> +
>>  static int
>>  qemuDomainObjPrivateXMLParseJob(virDomainObjPtr vm,
>>                                  qemuDomainObjPrivatePtr priv,
>> @@ -2572,6 +2592,8 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt,
>>  
>>      qemuDomainObjPrivateXMLParseAllowReboot(ctxt, &priv->allowReboot);
>>  
>> +    qemuDomainObjPrivateXMLParsePR(ctxt, &priv->prDaemonRunning);
>> +
>>      if (qemuDomainObjPrivateXMLParseBlockjobs(priv, ctxt) < 0)
>>          goto error;
>>  
>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>> index 612d234113..b837902e8d 100644
>> --- a/src/qemu/qemu_domain.h
>> +++ b/src/qemu/qemu_domain.h
>> @@ -342,6 +342,8 @@ struct _qemuDomainObjPrivate {
>>      /* Migration capabilities. Rechecked on reconnect, not to be saved in
>>       * private XML. */
>>      virBitmapPtr migrationCaps;
>> +
> 
> Similar to the other recent additions, please add a comment to describe
> what the purpose is.
> 
>> +    bool prDaemonRunning;
>>  };
>>  
>>  # define QEMU_DOMAIN_PRIVATE(vm) \
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index 6a5262ae46..36cfa438ed 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -2555,6 +2555,231 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver,
>>  }
>>  
>>  
>> +static char *
>> +qemuProcessBuildPRHelperPidfilePath(virDomainObjPtr vm,
>> +                                    const char *prdAlias)
>> +{
>> +    qemuDomainObjPrivatePtr priv = vm->privateData;
>> +
>> +    return virPidFileBuildPath(priv->libDir, prdAlias);
>> +}
>> +
>> +
>> +static void
>> +qemuProcessKillPRDaemon(virDomainObjPtr vm)
>> +{
>> +    qemuDomainObjPrivatePtr priv = vm->privateData;
>> +    virErrorPtr orig_err;
>> +    const char *prdAlias;
>> +    char *pidfile;
>> +
>> +    if (!priv->prDaemonRunning)
>> +        return;
>> +
>> +    if (!(prdAlias = qemuDomainGetManagedPRAlias())) {
>> +        VIR_WARN("Unable to get default PR manager alias");
>> +        return;
>> +    }
> 
> This makes zero sense as it cannot be NULL... Much easier to replace
> prdAlias with QEMU_DOMAIN_MANAGED_PR_ALIAS
> 
>> +
>> +    if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias))) {
>> +        VIR_WARN("Unable to construct pr-helper pidfile path");
>> +        return;
>> +    }
>> +
>> +    virErrorPreserveLast(&orig_err);
>> +    if (virPidFileForceCleanupPath(pidfile) < 0) {
>> +        VIR_WARN("Unable to kill pr-helper process");
>> +    } else {
>> +        if (unlink(pidfile) < 0 &&
>> +            errno != ENOENT) {
>> +            virReportSystemError(errno,
>> +                                 _("Unable to remove stale pidfile %s"),
>> +                                 pidfile);
>> +        } else {
>> +            priv->prDaemonRunning = false;
>> +        }
>> +    }
>> +    virErrorRestore(&orig_err);
>> +
>> +    VIR_FREE(pidfile);
>> +}
>> +
>> +
>> +static int
>> +qemuProcessStartPRDaemonHook(void *opaque)
>> +{
>> +    virDomainObjPtr vm = opaque;
>> +    size_t i, nfds = 0;
>> +    int *fds = NULL;
>> +    int ret = -1;
>> +
>> +    if (virProcessGetNamespaces(vm->pid, &nfds, &fds) < 0)
>> +        return ret;
>> +
>> +    if (nfds > 0 &&
>> +        virProcessSetNamespaces(nfds, fds) < 0)
>> +        goto cleanup;
>> +
> 
> 
> hmmmm... so when we hot plug if not already running, then this would be
> run, right?  So this sets up NS appropriately and thus the cgroup magic
> as part of that...  So not "obvious" from prior to patches...  Maybe
> it's just an ordering and thought thing, but I do feel better now about
> the previous 2 patches and whether they can be run multiple times or not
> as the case seems to be.
> 
>> +    ret = 0;
>> + cleanup:
>> +    for (i = 0; i < nfds; i++)
>> +        VIR_FORCE_CLOSE(fds[i]);
>> +    VIR_FREE(fds);
>> +    return ret;
>> +}
>> +
>> +
>> +static int
>> +qemuProcessStartPRDaemon(virDomainObjPtr vm,
>> +                         const virDomainDiskDef *disk)
>> +{
>> +    qemuDomainObjPrivatePtr priv = vm->privateData;
>> +    virQEMUDriverPtr driver = priv->driver;
>> +    virQEMUDriverConfigPtr cfg;
>> +    const char *prdAlias;
>> +    int errfd = -1;
>> +    char *pidfile = NULL;
>> +    int pidfd = -1;
>> +    char *socketPath = NULL;
>> +    pid_t cpid = -1;
>> +    virCommandPtr cmd = NULL;
>> +    virTimeBackOffVar timebackoff;
>> +    const unsigned long long timeout = 500000; /* ms */
>> +    int ret = -1;
>> +
>> +    if (!virStoragePRDefIsManaged(disk->src->pr) ||
>> +        priv->prDaemonRunning)
>> +        return 0;
>> +
>> +    cfg = virQEMUDriverGetConfig(driver);
>> +
>> +    if (!virFileIsExecutable(cfg->prHelperName)) {
>> +        virReportSystemError(errno, _("'%s' is not a suitable pr helper"),
>> +                             cfg->prHelperName);
>> +        goto cleanup;
>> +    }
>> +
>> +    prdAlias = qemuDomainGetManagedPRAlias();
>> +
>> +    if (!(pidfile = qemuProcessBuildPRHelperPidfilePath(vm, prdAlias)))
>> +        goto cleanup;
>> +
>> +    /* Just try to acquire. Dummy pid will be replaced later */
>> +    if ((pidfd = virPidFileAcquirePath(pidfile, false, -1)) < 0)
>> +        goto cleanup;
>> +
>> +    if (!(socketPath = qemuDomainGetPRSocketPath(vm, disk->src->pr)))
>> +        goto cleanup;
>> +
>> +    /* Remove stale socket */
>> +    if (unlink(socketPath) < 0 &&
>> +        errno != ENOENT) {
>> +        virReportSystemError(errno,
>> +                             _("Unable to remove stale socket path: %s"),
>> +                             socketPath);
>> +        goto cleanup;
>> +    }
>> +
>> +    if (!(cmd = virCommandNewArgList(cfg->prHelperName,
>> +                                     "-k", socketPath,
>> +                                     "-f", pidfile,
>> +                                     NULL)))
>> +        goto cleanup;
>> +
>> +    virCommandDaemonize(cmd);
>> +    /* We want our virCommand to write child PID into the pidfile
>> +     * so that we can read it even before exec(). */
>> +    virCommandSetPidFile(cmd, pidfile);
>> +    virCommandSetErrorFD(cmd, &errfd);
>> +
>> +    /* Place the process into the same namespace and cgroup as
>> +     * qemu (so that it shares the same view of the system). */
>> +    virCommandSetPreExecHook(cmd, qemuProcessStartPRDaemonHook, vm);
>> +
>> +    if (virCommandRun(cmd, NULL) < 0)
>> +        goto cleanup;
>> +
>> +    if (virPidFileReadPath(pidfile, &cpid) < 0) {
>> +        virReportError(VIR_ERR_INTERNAL_ERROR,
>> +                       _("pr helper %s didn't show up"), prdAlias);
> 
> Honestly the alias output has nothing to do with the message - seems
> like it's here to prove a point.
> 
> If anything what's more important here is 'cfg->prHelperName' because
> that's what died not the alias.
> 
>> +        goto cleanup;
>> +    }
>> +
>> +    if (virTimeBackOffStart(&timebackoff, 1, timeout) < 0)
>> +        goto cleanup;
>> +    while (virTimeBackOffWait(&timebackoff)) {
>> +        char errbuf[1024] = { 0 };
>> +
>> +        if (virFileExists(socketPath))
>> +            break;
>> +
>> +        if (virProcessKill(cpid, 0) == 0)
>> +            continue;
>> +
>> +        if (saferead(errfd, errbuf, sizeof(errbuf) - 1) < 0) {
>> +            virReportSystemError(errno,
>> +                                 _("pr helper %s died unexpectedly"), prdAlias);
> 
> ditto
> 
>> +        } else {
>> +            virReportError(VIR_ERR_OPERATION_FAILED,
>> +                           _("pr helper died and reported: %s"), errbuf);
>> +        }
>> +        goto cleanup;
>> +    }
>> +
>> +    if (!virFileExists(socketPath)) {
>> +        virReportError(VIR_ERR_OPERATION_TIMEOUT, "%s",
>> +                       _("pr helper socked did not show up"));
>> +        goto cleanup;
>> +    }
>> +
>> +    if (priv->cgroup &&
>> +        virCgroupAddMachineTask(priv->cgroup, cpid) < 0)
>> +        goto cleanup;
>> +
>> +    if (qemuSecurityDomainSetPathLabel(driver->securityManager,
>> +                                       vm->def, socketPath, true) < 0)
>> +        goto cleanup;
>> +
>> +    priv->prDaemonRunning = true;
>> +    ret = 1;
>> + cleanup:
>> +    if (ret < 0) {
>> +        virCommandAbort(cmd);
>> +        if (cpid >= 0)
>> +            virProcessKillPainfully(cpid, true);
>> +        if (pidfile)
>> +            unlink(pidfile);
>> +    }
>> +    virCommandFree(cmd);
>> +    VIR_FREE(socketPath);
>> +    VIR_FORCE_CLOSE(pidfd);
>> +    VIR_FREE(pidfile);
>> +    VIR_FORCE_CLOSE(errfd);
>> +    virObjectUnref(cfg);
>> +    return ret;
>> +}
>> +
>> +
>> +static int
>> +qemuProcessMaybeStartPRDaemon(virDomainObjPtr vm)
>> +{
>> +    size_t i;
>> +    int rv;
>> +
>> +    for (i = 0; i < vm->def->ndisks; i++) {
>> +        const virDomainDiskDef *disk = vm->def->disks[i];
>> +
>> +        if ((rv = qemuProcessStartPRDaemon(vm, disk)) < 0)
>> +            return -1;
>> +
>> +        if (rv > 0)
>> +            return 1;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +
>>  static int
>>  qemuProcessInitPasswords(virQEMUDriverPtr driver,
>>                           virDomainObjPtr vm,
>> @@ -6071,6 +6296,10 @@ qemuProcessLaunch(virConnectPtr conn,
>>      if (qemuProcessResctrlCreate(driver, vm) < 0)
>>          goto cleanup;
>>  
>> +    VIR_DEBUG("Setting up PR daemon");
>> +    if (qemuProcessMaybeStartPRDaemon(vm) < 0)
>> +        goto cleanup;
>> +
>>      VIR_DEBUG("Setting domain security labels");
>>      if (qemuSecuritySetAllLabel(driver,
>>                                  vm,
>> @@ -6598,6 +6827,9 @@ void qemuProcessStop(virQEMUDriverPtr driver,
>>      /* Remove the master key */
>>      qemuDomainMasterKeyRemove(priv);
>>  
>> +    /* Do this before we delete the tree and remove pidfile. */
>> +    qemuProcessKillPRDaemon(vm);
>> +
> 
> I find it ironic that this "ProcessMaybeStart", but "ProcessKill" here.
> IDC because the logic is fine and it's just a naming thing
> 
> If you change to using boolean or at least successfully argue for
> keeping the "running = '1'" in the XML and with the assumption that you
> have already gone with the suggested alias alterations and thus apply
> the attached patch,
> 
> Reviewed-by: John Ferlan <jferlan at redhat.com>

I've done all the changes.

Thanks,
Michal




More information about the libvir-list mailing list