[libvirt] [PATCH] LXC: create a bind mount for sysfs when enable userns but disable netns

Chen, Hanxiao chenhanxiao at cn.fujitsu.com
Wed Mar 11 02:30:46 UTC 2015



> -----Original Message-----
> From: Richard Weinberger [mailto:richard.weinberger at gmail.com]
> Sent: Wednesday, March 11, 2015 4:56 AM
> To: Chen, Hanxiao/陈 晗霄
> Cc: libvir-list at redhat.com; Daniel P. Berrange; Gao feng
> Subject: Re: [libvirt] [PATCH] LXC: create a bind mount for sysfs when enable userns
> but disable netns
> 
> On Mon, Jul 14, 2014 at 12:01 PM, Chen Hanxiao
> <chenhanxiao at cn.fujitsu.com> wrote:
> > Kernel commit 7dc5dbc879bd0779924b5132a48b731a0bc04a1e
> > forbids a fresh mount of sysfs
> > when userns is enabled but netns is disabled.
> > This patch creates a bind mount in this scenario.
> 
> Sorry for exhuming an already merged patch but today I ran into a
> nasty issue caused by it.
> 
> > Signed-off-by: Chen Hanxiao <chenhanxiao at cn.fujitsu.com>
> > ---
> >  src/lxc/lxc_container.c | 44 +++++++++++++++++++++++++++++++++-----------
> >  1 file changed, 33 insertions(+), 11 deletions(-)
> >
> > diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
> > index 4d89677..8a27215 100644
> > --- a/src/lxc/lxc_container.c
> > +++ b/src/lxc/lxc_container.c
> > @@ -815,10 +815,13 @@ static int lxcContainerSetReadOnly(void)
> >  }
> >
> >
> > -static int lxcContainerMountBasicFS(bool userns_enabled)
> > +static int lxcContainerMountBasicFS(bool userns_enabled,
> > +                                    bool netns_disabled)
> >  {
> >      size_t i;
> >      int rc = -1;
> > +    char* mnt_src = NULL;
> > +    int mnt_mflags;
> >
> >      VIR_DEBUG("Mounting basic filesystems");
> >
> > @@ -826,8 +829,25 @@ static int lxcContainerMountBasicFS(bool userns_enabled)
> >          bool bindOverReadonly;
> >          virLXCBasicMountInfo const *mnt = &lxcBasicMounts[i];
> >
> > +        /* When enable userns but disable netns, kernel will
> > +         * forbid us doing a new fresh mount for sysfs.
> > +         * So we had to do a bind mount for sysfs instead.
> > +         */
> > +        if (userns_enabled && netns_disabled &&
> > +            STREQ(mnt->src, "sysfs")) {
> > +            if (VIR_STRDUP(mnt_src, "/sys") < 0) {
> > +                goto cleanup;
> > +            }
> 
> This is clearly broken and looks very untested to me.
> 
It is broken now,
but it was not when I submitted this patch last year.

> It will issue this mount call:
> mount("/sys", "/sys", "sysfs", MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_BIND, NULL)
> because the code runs after pivot_root(2).
> i.e., /sys will still be empty after that, with no sysfs there at all.
> As libvirt will later remount /sys read-only, creating a container will
> fail with the most useless error message:
> Error: internal error: guest failed to start: Unable to create
> directory /sys/fs/: Read-only file system
> or
> Error: internal error: guest failed to start: Unable to create
> directory /sys/fs/cgroup: Read-only file system
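
For illustration, the choice the patch makes, and the mount() call
quoted above that results from it, can be sketched as below. This is a
minimal sketch and not libvirt's code: the struct and function names are
hypothetical, and the base flags are assumed from the quoted call. It
only builds the arguments and never calls mount(2) itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical helper type, not part of libvirt: the arguments that
 * would be handed to mount(2) for the container's /sys. */
struct sysfs_mount_plan {
    const char *src;      /* source argument of mount(2) */
    const char *fstype;   /* filesystem type; ignored by the kernel for MS_BIND */
    unsigned long flags;  /* mount flags */
};

/* Mirrors the condition in the quoted hunk: with a user namespace but
 * no private network namespace, fall back to a bind mount of /sys.
 * The base flags are an assumption taken from the quoted mount() call. */
static struct sysfs_mount_plan
plan_sysfs_mount(bool userns_enabled, bool netns_disabled)
{
    struct sysfs_mount_plan plan = {
        .src = "sysfs",
        .fstype = "sysfs",
        .flags = MS_NOSUID | MS_NODEV | MS_NOEXEC,
    };

    if (userns_enabled && netns_disabled) {
        plan.src = "/sys";       /* resolved in the current root, i.e. after pivot_root(2) */
        plan.flags |= MS_BIND;
    }
    return plan;
}

/* True when the plan is a bind mount rather than a fresh sysfs instance. */
static bool
plan_is_bind(const struct sysfs_mount_plan *plan)
{
    assert(plan != NULL);
    return (plan->flags & MS_BIND) != 0;
}
```

With both conditions true, the plan is a bind mount of "/sys" onto
"/sys". Per the report above, at that point the path refers to the new
root's empty directory, so no sysfs content appears in the container.
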
> 
> Please note that changing "/sys" to "/.oldroot/sys" will not solve the
> issue, as this code already runs in the new
> namespace and therefore the old mount tree is locked; thus MS_BIND is
> not allowed.
> 
> This brings me to the question, why do you handle the netns_disabled
> case anyway?

Please check the discussion at:
http://lists.linux-foundation.org/pipermail/containers/2014-July/034721.html

> If in the XML file no network is specified just create a new and empty
> network namespace.
> Bindmounting /sys into the container is a security issue. This is why
> mounting sysfs without a netns
> was disabled to begin with.

Yes, I proposed enabling netns by default,
but Dan thought that we should allow containers to share the host's network:
http://www.redhat.com/archives/libvir-list/2013-August/msg01025.html

So we should allow users to create containers without netns;
they should know what they are doing if they have read libvirt's docs.
See the docs patch describing the security considerations:
http://www.redhat.com/archives/libvir-list/2013-September/msg00562.html

Regards,
- Chen
> 
> P.S: Sorry for the grumpy mail, I've wasted almost the whole day with
> debugging that issue.
> 
> --
> Thanks,
> //richard



