Preferred subj= with multiple LSMs

Fri Jul 19 21:21:58 UTC 2019

On Wed, Jul 17, 2019 at 7:02 PM Casey Schaufler <casey at schaufler-ca.com> wrote:
> On 7/17/2019 9:23 AM, Paul Moore wrote:
> > On Wed, Jul 17, 2019 at 11:49 AM Casey Schaufler <casey at schaufler-ca.com> wrote:
> >> On 7/17/2019 5:14 AM, Paul Moore wrote:
> >>> On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey at schaufler-ca.com> wrote:
> >>>> On 7/16/2019 4:13 PM, Paul Moore wrote:
> >>>>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey at schaufler-ca.com> wrote:
> >>>>>> It sounds as if some variant of the Hideous format:
> >>>>>>
> >>>>>>         subj=selinux='a:b:c:d',apparmor='z'
> >>>>>>         subj=selinux/a:b:c:d/apparmor/z
> >>>>>>         subj=(selinux)a:b:c:d/(apparmor)z
> >>>>>>
> >>>>>> would meet Steve's searchability requirements, but with significant
> >>>>>> parsing performance penalties.
> >>>>> I think "hideous format" sums it up nicely.  Whatever we choose here
> >>>>> we are likely going to be stuck with for some time and I'm near to
> >>>>> 100% that multiplexing the labels onto a single field is going to be a
> >>>>> disaster.
> >>>> If the requirement is that subj= be searchable I don't see much of
> >>>> an alternative to a Hideous format. If we can get past that, and say
> >>>> that all subj_* have to be searchable we can avoid that set of issues.
> >>>> Instead of:
> >>>>
> >>>>         s = strstr(source, "subj=")
> >>>>         search_after_subj(s, ...);
> >>> This example does a lot of hand waving in search_after_subj(...)
> >>> regarding parsing the multiplexed LSM label.  Unless we restrict the
> >>> LSM label formats (which seems both wrong, and too late IMHO)
> >> I don't think it's too late, and I think it would be healthy
> >> to restrict LSM "contexts" to character sets that make command
> >> line specification possible. Embedded newlines? Ewwww.
> > That would imply that the delimiter you would choose for the
> > multiplexed approach would be something odd (I think you suggested
> > 0x02, or similar, earlier) which would likely require the multiplexed
> > subj field to become a hex encoded field which would be very
> > unfortunate in my opinion and would technically break with the current
> > subj/obj field format spec.  Picking a normal-ish delimiter, and
> > restricting its use by LSMs seems wrong to me.
>
> Just say "no" to hex encoding!

Yes, it's best avoided.

> BTW, keys are not hex encoded.

The kernel keyring keys?  Not really relevant here I don't think.

> We've never had to think about having general rules on
> what security modules do before, because with only one
> active each could do whatever it wanted without fear of
> conflict. If there is already a character that none of
> the existing modules use, how would it be wrong to
> reserve it?

"We've never had to think about having general rules on what security
modules do before..."

We famously haven't imposed restrictions on the label format before
now, and this seems like a pretty poor reason to start.

> > It's important to remember that Steve's strstr() comment only reflects
> > his set of userspace tools.  When you start talking about log
> > aggregation and analytics, it seems very likely that there are other
> > tools in use, likely with their own parsers that do much more
> > complicated searches than a simple strstr() call.
>
> Point. But long term, they'll have to be updated to accommodate
> whatever we decide on. Which makes the "simple" case, where one
> security module is in use all the more important.

Both the multiplexed and subj_X proposals handle the single major LSM
case the same: identical to what we have now.  Regardless of how
important the single major LSM case may be, it isn't a distinguishing
factor in this discussion.

> >>>> we have
> >>>>
> >>>>         s = source
> >>>>         for (i = 0; i < lsm_slots ; i++) {
> >>>>                 s = strstr(s, "subj_")
> >>>>                 if (!s)
> >>>>                         break;
> >>>>                 s = search_after_subj_(s, lsm_slot_name[i], ...)
> >>> The hand waving here in search_after_subj_(...) is much less;
> >>> essentially you just match "subj_X" and then you can take the field
> >>> value as the LSM's label without having to know the format, the policy
> >>> loaded, etc.  It is both safer and doesn't require knowledge of the
> >>> LSMs (the LSM "name" can be specified as a parameter to the search
> >>> tool).
> >> You can do that with the Hideous format as well. I wouldn't
> >> say which would be easier without delving into the audit user
> >> space.
>
> > No, you can't.  You still need to parse the multiplexed mess, that's
> > the problem.
>
> You move the parsing problem to the record, where you have to
> look for subj_selinux= instead of having the parsing problem in
> the subj= field, where you look for something like selinux=
> within the field. Neither looks like the work of an afternoon to
> get right.

Finding subj_X in an audit record is no different than finding any
other field in a record.  Parsing the multiplexed label mess is a
whole different problem and prone to lots of mistakes.

> It probably looks like I'm arguing for the Hideous format option.
> That would require less work and code disruption, so it is tempting
> to push for it. But I would have to know the user space side a
> whole lot better than I do to feel good about pushing anything that
> isn't obviously a good choice. I kind of prefer Paul's "subj=?"
> approach, but as it's harder, I don't want to spend too much time
> on it if it gets me a big, juicy, well deserved NAK.

I didn't want to have to NAK this, but if that is what it is going to
take, so be it ... as it currently stands I'm NAK'ing the the
multiplexed approach.  You don't have to go with the subj_X approach,
but the multiplexed approach is a terrible idea and I can almost
guarantee that we would be regretting that choice in a few years time.

-- 
paul moore
www.paul-moore.com