Preferred subj= with multiple LSMs

Wed Jul 17 15:49:15 UTC 2019

On 7/17/2019 5:14 AM, Paul Moore wrote:
> On Tue, Jul 16, 2019 at 7:47 PM Casey Schaufler <casey at schaufler-ca.com> wrote:
>> On 7/16/2019 4:13 PM, Paul Moore wrote:
>>> On Tue, Jul 16, 2019 at 6:18 PM Casey Schaufler <casey at schaufler-ca.com> wrote:
>>>> It sounds as if some variant of the Hideous format:
>>>>
>>>>         subj=selinux='a:b:c:d',apparmor='z'
>>>>         subj=selinux/a:b:c:d/apparmor/z
>>>>         subj=(selinux)a:b:c:d/(apparmor)z
>>>>
>>>> would meet Steve's searchability requirements, but with significant
>>>> parsing performance penalties.
>>> I think "hideous format" sums it up nicely.  Whatever we choose here
>>> we are likely going to be stuck with for some time and I'm nearly
>>> 100% sure that multiplexing the labels onto a single field is going to be a
>>> disaster.
>> If the requirement is that subj= be searchable I don't see much of
>> an alternative to a Hideous format. If we can get past that, and say
>> that all subj_* have to be searchable we can avoid that set of issues.
>> Instead of:
>>
>>         s = strstr(source, "subj=");
>>         search_after_subj(s, ...);
> This example does a lot of hand waving in search_after_subj(...)
> regarding parsing the multiplexed LSM label.  Unless we restrict the
> LSM label formats (which seems both wrong, and too late IMHO)

I don't think it's too late, and I think it would be healthy
to restrict LSM "contexts" to character sets that make command
line specification possible. Embedded newlines? Ewwww.

>  we have
> a parsing nightmare; can you write a safe multiplexed LSM label parser
> without knowledge of each LSM label format?  Can you do that for each
> LSM without knowing their loaded policy?  What happens when the policy
> and/or label format changes?  What happens in a few years when another
> LSM is added to the kernel?

I was intentionally hand-wavy because of those very issues.
Steve says that parsing is limited to "strstr()", so looking for
":s7:" in the subject should work just as well with a Hideous
format as it does today, with the exception of false positives
where LSMs have label string overlaps.
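
A minimal sketch of that strstr()-only search against a multiplexed
subj= value, to make the false-positive case concrete. The helper name
subj_contains(), the space-terminated field, and the sample records are
all assumptions for illustration, not anything the audit tools do today:

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if needle occurs inside the subj=
 * value of an audit record, 0 otherwise.  Assumes the value ends at
 * the first space, i.e. that labels contain no spaces.
 */
int subj_contains(const char *record, const char *needle)
{
	const char *s = strstr(record, "subj=");
	const char *end, *hit;

	if (!s)
		return 0;
	s += strlen("subj=");

	/* The value runs to the next space or to the end of the record. */
	end = strchr(s, ' ');
	if (!end)
		end = s + strlen(s);

	/* No knowledge of any LSM's label format is used here. */
	hit = strstr(s, needle);
	return hit && hit + strlen(needle) <= end;
}
```

With subj=selinux/a:b:c:s7:d/apparmor/z a search for ":s7:" matches as
intended. With a made-up record such as
subj=selinux/u:r:t:s0/apparmor/x:s7:y it also matches, even though the
SELinux part does not contain ":s7:", which is the overlap false
positive.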

Where is the need to use a module specific label parser coming
from? Does the audit code parse SELinux contexts now? 

>> we have
>>
>>         s = source;
>>         for (i = 0; i < lsm_slots; i++) {
>>                 s = strstr(s, "subj_");
>>                 if (!s)
>>                         break;
>>                 s = search_after_subj_(s, lsm_slot_name[i], ...);
> The hand waving here in search_after_subj_(...) is much less;
> essentially you just match "subj_X" and then you can take the field
> value as the LSM's label without having to know the format, the policy
> loaded, etc.  It is both safer and doesn't require knowledge of the
> LSMs (the LSM "name" can be specified as a parameter to the search
> tool).

You can do that with the Hideous format as well. I wouldn't
say which would be easier without delving into the audit user
space.
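
For comparison, a rough sketch of the per-LSM field lookup Paul
describes: match "subj_X=" and take the field value as that LSM's label
without interpreting it. The helper name subj_field(), the
"subj_<lsm>=" spelling, and the space-separated record layout are
assumptions for illustration only:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of subj_<lsm>= from record into
 * out.  Returns 0 on success, -1 if the field is absent or does not
 * fit.  Assumes fields are separated by single spaces and that label
 * values contain no spaces.
 */
int subj_field(const char *record, const char *lsm,
	       char *out, size_t outlen)
{
	char key[64];
	const char *s, *end;
	size_t vlen;
	int n;

	n = snprintf(key, sizeof(key), "subj_%s=", lsm);
	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;

	/* Accept only a match that starts a field, not one inside a value. */
	for (s = strstr(record, key); s; s = strstr(s + 1, key)) {
		if (s == record || s[-1] == ' ')
			break;
	}
	if (!s)
		return -1;
	s += (size_t)n;

	/* The label is everything up to the next space, taken verbatim. */
	end = strchr(s, ' ');
	vlen = end ? (size_t)(end - s) : strlen(s);
	if (vlen >= outlen)
		return -1;
	memcpy(out, s, vlen);
	out[vlen] = '\0';
	return 0;
}
```

The LSM name is just a parameter, so a new LSM needs no change to the
search code. Whether this is actually easier than the Hideous format in
the real audit user space is a separate question.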

>> There's enough ugly to go around either way.
>> And I'm not partial to either approach, but would very
>> much like to get the code done so I can get on to the next
>> set of amazing challenges.
>>
>> Oh, and I don't want to pick on subj= as obj= has the exact same issues.
> Yes, I stopped talking about both subj and obj some time ago in this
> thread because I figure we can use the same approach for both.
>