[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH v2 16/17] vircgroupv2: use dummy process to workaround kernel bug with systemd



On Thu, Jan 24, 2019 at 11:06:51AM +0000, Daniel P. Berrangé wrote:
> On Mon, Jan 14, 2019 at 04:47:50PM +0100, Pavel Hrdina wrote:
> > If some program is attached to cgroup and that cgroup is removed before
> > detaching the program it might not be freed and will remain in the
> > system until it's rebooted.
> > 
> > This would not be that big deal to workaround if we wouldn't use
> > machined to track our guests.  Systemd tries to be the nice guy and if
> > there is no process attached to TransientUnit it will destroy the unit
> > and remove the cgroup from system which will hit the kernel bug.
> > 
> > In order to prevent this behavior of systemd we can move another process
> > into the guest cgroup and the cgroup will remain active even if we kill
> > qemu process while destroying guest.  We can then detach our BPF program
> > from that cgroup and call machined terminate API to cleanup the cgroup.
> 
> The kernel bugzilla says that "bpftool" still reports existance of
> the BPF program when this situation hits. So can't we just use
> bpftool to explicitly delete this orphaned program

Unfortunately no, bpftool doesn't have any way how to delete any
programs.

> > +static pid_t
> > +virCgroupV2DevicesStartDummyProc(virCgroupPtr group)
> > +{
> > +    pid_t ret = -1;
> > +    pid_t pid = -1;
> > +    virCommandPtr cmd = NULL;
> > +    VIR_AUTOFREE(char *) pidfile = NULL;
> > +    VIR_AUTOFREE(char *) fileName = NULL;
> > +
> > +    if (virAsprintf(&fileName, "cgroup-%s", group->vmName) < 0)
> > +        return -1;
> > +
> > +    pidfile = virPidFileBuildPath(group->stateDir, fileName);
> > +    if (!pidfile)
> > +        return -1;
> > +
> > +    cmd = virCommandNewArgList("/usr/bin/tail", "-f", "/dev/null",
> > +                               "-s", "3600", NULL);
> > +    if (!cmd)
> > +        return -1;
> > +
> > +    virCommandDaemonize(cmd);
> > +    virCommandSetPidFile(cmd, pidfile);
> > +
> > +    if (virCommandRun(cmd, NULL) < 0)
> > +        goto cleanup;
> > +
> > +    if (virPidFileReadPath(pidfile, &pid) < 0) {
> > +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> > +                       _("failed to start dummy cgroup helper"));
> > +        goto cleanup;
> > +    }
> > +
> > +    if (virCgroupAddMachineProcess(group, pid) < 0)
> > +        goto cleanup;
> > +
> > +    ret = pid;
> > + cleanup:
> > +    if (ret < 0) {
> > +        virCommandAbort(cmd);
> > +        if (pid > 0)
> > +            virProcessKillPainfully(pid, true);
> > +    }
> > +    if (pidfile)
> > +        unlink(pidfile);
> > +    virCommandFree(cmd);
> > +    return ret;
> > +}
> 
> I think this is really horrible to be honest.
> 
> If there's really no other option, then I'd prefer that we do not use
> BPF at all on known broken kernel versions until it is fixed.

I agree with you and Michal, it's ugly and most likely the only
solution with broken kernel and systemd.  I wrote the patch to show
how we could workaround the kernel bug and I was hoping that we will
agree on not using it.

Another possible workaround would be to not use machined and create the
systemd unit by other APIs, that way systemd would not remove the cgroup
and we would be able to detach programs before removing cgroups.

Or just simply wait until kernel fixes the issue.

Pavel

Attachment: signature.asc
Description: PGP signature


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]