[PATCH 4.10 070/111] audit: fix auditd/kernel connection state tracking

Ingo Molnar mingo at kernel.org
Wed Feb 21 08:46:02 UTC 2018


* Paul Moore <paul at paul-moore.com> wrote:

> On Tue, Feb 20, 2018 at 10:18 AM, Peter Zijlstra <peterz at infradead.org> wrote:
> > On Tue, Feb 20, 2018 at 09:51:08AM -0500, Paul Moore wrote:
> >> On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra <peterz at infradead.org> wrote:
> >
> >> > It's not at all clear to me what that code does, I just stumbled upon
> >> > __mutex_owner() outside of the mutex code itself and went WTF.
> >>
> >> If you don't want people to use __mutex_owner() outside of the mutex
> >> code I might suggest adding a rather serious comment at the top of the
> >> function, because right now I don't see anything suggesting that
> >> function shouldn't be used.  Yes, there is the double underscore
> >> prefix, but that can mean a few different things these days.
> >
> > Find below.
> >
> >> > The comment (aside from having the most horribly style) ...
> >>
> >> Yeah, your dog is ugly too.  Notice how neither comment is constructive?
> >
> > I'm sure you've seen this one:
> >
> >   https://lkml.org/lkml/2016/7/8/625
> 
> Yep.  I stand behind my earlier comment in this thread.
> 
> >> > Maybe if you could explain how that code is supposed to work and why it
> >> > doesn't know if it holds a lock I could make a suggestion...
> >>
> >> I just spent a few minutes looking back over the bits available in
> >> include/linux/mutex.h and I'm not seeing anything beyond
> >> __mutex_owner() which would allow us to determine the mutex owning
> >> task.  It's probably easiest for us to just track ownership ourselves.
> >> I'll put together a patch later today.
> >
> > Note that up until recently the mutex implementation didn't even have a
> > consistent owner field. And the thing is, it's very easy to use wrong,
> > only today I've seen a patch do: "__mutex_owner() == task", where task
> > was allowed to be !current, which is just wrong.
> 
> Arguably all the more reason why a strongly worded warning is
> important (which I see you've included below, feel free to include my
> Reviewed-by).
> 
> > Looking through kernel/audit.c I'm not even sure I see how you would end
> > up in audit_log_start() with audit_cmd_mutex held.
> >
> > Can you give me a few code paths that trigger this? Simple git-grep is
> > failing me.
> 
> Basically look at the code in audit_receive_msg(), but I wasn't asking
> your opinion on how we should rewrite the audit subsystem, I was just
> asking how one could determine if the current task was holding a given
> mutex in a way that was acceptable to you.  Based on your comments,
> and some further inspection of the mutex code, it appears that is/was
> not something that the core mutex code wants to support/make-visible.
> Which is perfectly fine, I just wanted to make sure I wasn't missing
> something before I went ahead and wrote a wrapper around the mutex
> code for use by audit.
> 
> FWIW, I just put together the following patch which removes the
> __mutex_owner() call from audit and doesn't appear to break anything
> on the audit side (you're CC'd on the patch).  It has only been
> lightly tested, but I'm going to bang on it for a day or so and if I
> hear no objections I'll merge it into audit/next.
>
> * https://www.redhat.com/archives/linux-audit/2018-February/msg00066.html

Could you please explain the audit_ctl_lock()/unlock() primitive you are 
introducing there? You seem to be implementing some sort of recursive locking 
primitive, but in a strange way.

AFAICS the primary problem appears to be this code path:

  audit_receive() -> audit_receive_msg() -> AUDIT_TTY_SET -> audit_log_common_recv_msg() -> audit_log_start()

where we can arrive already holding the lock.

I.e. recursive mutex, kinda.

What's the thinking there? Neither the changelog nor the code explains this.

Thanks,

	Ingo




More information about the Linux-audit mailing list