Re: Performance of libauparse

John Dennis wrote:
I also agree the data stream which emerges from audit is rather difficult to work with. Eric likes to point out we can't change the kernel, so maybe what we really need (and has been proposed) is for auditd to reformat the data before emitting it or writing it to disk (e.g. assemble records into events, decode strings which have been hexified, etc.). Currently auparse is responsible for much of this as part of a post-processing step which has to be repeated every time audit data is read, instead of just once as it emerges from the kernel. If instead the auparse user-level code was folded into auditd, which then became responsible for formatting the ad hoc data received from the kernel, the final output from audit could be much more friendly and much of the rationale for auparse would evaporate.

I was going to request going the other way with libauparse, i.e. to entirely separate it from auditd. As I mentioned, I'm not using auditd because it wasn't really written with my customer's requirements in mind (high volume, no local storage). My audit daemon needs to run on RHEL 3 (it has a LAuS backend too) and RHEL 4. I don't see anything architecturally which ties libauparse to auditd, so if it was a separate library I could recompile it for RHEL 4 without replacing the RHEL 4 audit-libs, etc. I can certainly see the efficiency in auditd parsing data before handing it off to dispatchers, but it's not hard to construct non-auditd uses for it either. Of course, it would need some performance work first for my use case, but I wouldn't want to duplicate the effort unnecessarily.

On the more general topic of the format of data emitted by the kernel, I see two serious problems with the proposal above and with the current solution (even though both are currently the most pragmatic options):

1. libauparse only exists to reverse engineer a really bad protocol.
2. The existing protocol has already broken userspace many times.
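
To illustrate the first point, here is a minimal Python sketch of the kind of guesswork a parser has to do. It is not libauparse's implementation. It assumes records are whitespace-separated key=value tokens, that string values are either double-quoted or hex-encoded, and that the parser knows out of band which fields may be hexified (the HEX_CAPABLE_FIELDS set is a hypothetical list chosen for the example). Without that knowledge, a value such as pid=1234 cannot be told apart from a hexified string.

```python
# Hypothetical list of fields whose values may arrive hex-encoded.
# The real set depends on the kernel version; this is an assumption.
HEX_CAPABLE_FIELDS = {"comm", "exe", "name", "cwd"}


def decode_value(key, raw):
    """Decode one field value from a text audit record."""
    # Quoted values are taken literally, minus the surrounding quotes.
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        return raw[1:-1]
    # Only fields known to carry strings are candidates for hex decoding;
    # anything else (pids, uids, ...) is returned unchanged.
    if key in HEX_CAPABLE_FIELDS:
        try:
            return bytes.fromhex(raw).decode("utf-8", errors="replace")
        except ValueError:
            # Not valid hex after all: fall through and keep the raw text.
            pass
    return raw


def parse_record(line):
    """Split a record into a dict of decoded fields.

    Tokens without '=' are ignored. Values containing spaces (for
    example single-quoted msg='...' payloads) are not handled here.
    """
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = decode_value(key, value)
    return fields


if __name__ == "__main__":
    sample = 'type=SYSCALL pid=1234 comm="bash" exe=2F62696E2F6D79207368'
    print(parse_record(sample))
```

Note that the decision to decode rests entirely on the field name, which is exactly the sort of knowledge that has to track kernel changes.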

On that second point, the changes since the protocol was introduced (pre-git history, so I can't work out when) have been such that any tool written at the time of 2.6.12 couldn't possibly expect to continue to function correctly if you updated the kernel underneath it. Some examples:

"audit_rate_limit=%d old=%d by auid %u" -> "audit_rate_limit=%d old=%d by auid=%u"

Add escaping to comm field

Add tty field without quotes or escaping of value

Remove qbytes field from IPC record
Change iuid, igid field names

Convert some hex IPC records to octal

Change to format of EXECVE messages
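
As a sketch of what each such change costs a consumer, the following Python handles only the first example, the audit_rate_limit message. The pattern is built from the two format strings quoted above; the function name and the returned dict layout are illustrative assumptions, not part of any existing tool.

```python
import re

# Accepts both the old "by auid %u" and the new "by auid=%u" spellings.
RATE_LIMIT_RE = re.compile(
    r"audit_rate_limit=(?P<new>-?\d+) old=(?P<old>-?\d+) by auid[= ](?P<auid>\d+)"
)


def parse_rate_limit(message):
    """Return the new limit, old limit and auid as ints, or None if no match."""
    match = RATE_LIMIT_RE.search(message)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    print(parse_rate_limit("audit_rate_limit=100 old=0 by auid 500"))
    print(parse_rate_limit("audit_rate_limit=100 old=0 by auid=500"))
```

A parser written against only the older string would silently fail to match the newer one, and every other change in the list needs its own special case of this kind.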

Auditd only continues to function because it has been updated in step with the kernel: it is 'special'. Upstream's opinion on this is fairly clear. Note this isn't an argument in favour of a binary format specifically (although I favour that for efficiency), but it does highlight the requirement for a new, well-designed format.

Matthew Booth, RHCA, RHCSS
Red Hat, Global Professional Services

M:       +44 (0)7977 267231
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
