Audit for Filesystem monitoring tool

Tue Oct 5 15:27:37 UTC 2010

Hello Everyone -

(Steve, if you're around I'd love to get your opinion on this.)

I'm working on a file system monitoring product for a client and am trying to evaluate whether using the Linux audit subsystem as a
source for file system change information is a viable option.

I have a working implementation that uses an LKM to hook the syscall table and intercept calls such as creat(), open(),
unlink(), rename(), mkdir(), rmdir(), and worst of all write().  This implementation actually works pretty well, but there are
some political issues with hooking the system call table (as I'm sure I'll get yelled at for doing so here) as well as some
technical ones.  In more recent versions of the Linux kernel (>2.6.24, x86_64 specifically) the system call table appears to be
locked in memory (as far as I can tell), and any attempt to write to it after the kernel boots is thwarted.  It's a good security
measure against rootkits for sure, but it renders some legitimate applications, like on-access virus scanners and my customer's
product, worthless unless another implementation approach can be found.

I've looked fairly extensively at Linux Audit, played with it a bit, and have some questions.  If anyone has information or opinions
on these, I'd be most appreciative if you'd share them.

1) Data Format - I wrote a simple audit dispatcher plugin to get a look at the data stream.  With "string" format set in the
configuration, the data appears to arrive as a formatted line of string data, just like you see in the normal audit log.  With
"binary" format set, does it use the audit_dispatcher_header structure followed by the event data?  If so, what format is the
event data in, and how do I extract values from it?  (That could be faster than parsing the strings from the string version.)
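
To make the string-format question concrete, here is a minimal sketch of splitting one audit log line into fields.  It is
written in Python for brevity, and it assumes the record is a run of key=value pairs with a msg=audit(seconds.millis:serial)
stamp, as in the normal audit log.  The function name parse_audit_line and the sample record are illustrative only, not part of
any audit library.

```python
import re

# key=value pairs; a value is either double-quoted or runs to the next space.
_FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S*)')
# The msg field carries the event timestamp and serial number.
_MSG_RE = re.compile(r'audit\((\d+\.\d+):(\d+)\)')


def parse_audit_line(line):
    """Split one string-format audit record into a dict of fields.

    Hypothetical helper: quoted values have their quotes stripped, and
    'timestamp' / 'serial' are added when the msg stamp can be parsed.
    """
    fields = {}
    for key, value in _FIELD_RE.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value

    stamp = _MSG_RE.search(fields.get("msg", ""))
    if stamp:
        fields["timestamp"] = float(stamp.group(1))
        fields["serial"] = int(stamp.group(2))
    return fields


# Made-up sample record in the usual audit log layout.
sample = ('type=SYSCALL msg=audit(1286292457.123:456): arch=c000003e '
          'syscall=2 success=yes exit=3 comm="cat" exe="/bin/cat" '
          'key="fswatch"')
record = parse_audit_line(sample)
```

Values that the audit system hex-encodes (for example, file names containing spaces) are left encoded by this sketch.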

2) Watches - At first glance watches look like a good way to do this, except for one big problem.  I need to monitor arbitrarily
large sections of the file system without necessarily knowing the paths to specific objects up front.  I suppose I could recursively
enumerate a parent directory, insert watches on every child directory and all files within those directories, but it seems to me
that could take a really long time to do on huge filesystems, and potentially consume huge amounts of kernel memory (not sure how
the watches are implemented).
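
As a rough way to gauge the scale of that enumeration, the sketch below (Python, standard library only) walks a tree and counts
how many directories and files would each need a watch under the one-watch-per-object scheme described above.  The function name
count_watch_targets is hypothetical, and the sketch only counts objects; it does not install any watches or measure kernel
memory.

```python
import os


def count_watch_targets(root):
    """Return (directories, files) found under root, including root itself.

    Symlinked directories are counted as entries but not descended into,
    so a link cycle cannot make the walk loop forever.
    """
    dirs = 1  # the root directory itself
    files = 0
    for _path, dirnames, filenames in os.walk(root, followlinks=False):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files
```

Running this against a representative customer tree and timing it would give a first estimate of how long the initial
enumeration takes before any watch is inserted.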

3) Filtering - In order to accomplish what I am trying to do, I need to audit a large number of syscalls.  Obviously that would
have a significant performance impact on the system.  Even if the customer was willing to live with that impact, there is an
additional problem: adding all of those syscall audit rules means that the resulting events will end up being logged to the log
file by auditd (if that logging is turned on) and dispatched to any other audispd plugins that might be in operation.  I've
thought of some workarounds for this, like (a) have the users turn off the auditd logging and perform filtered logging from my
plugin instead, (b) replace the audispd dispatcher completely so I can filter out extraneous records from other downstream
plugins, (c) some combination of a and b, or various other schemes.  Does anybody have any better ideas on this?
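
To illustrate workaround (a), here is a minimal sketch of filtering inside the plugin.  It assumes the monitoring rules are
tagged with a rule key (the key name "fswatch" is made up for this example) and that string-format records carry a key="..."
field; filter_by_key is a hypothetical helper, not an audit library call.

```python
import re

# Matches key="value" or an unquoted key=value such as key=(null).
_KEY_RE = re.compile(r'\bkey=(?:"([^"]*)"|(\S+))')


def filter_by_key(lines, wanted_keys):
    """Yield only the records whose rule key is in wanted_keys.

    Records without a key field are dropped.
    """
    for line in lines:
        match = _KEY_RE.search(line)
        if match is None:
            continue
        key = match.group(1) if match.group(1) is not None else match.group(2)
        if key in wanted_keys:
            yield line


# Made-up records: only the first carries the key we care about.
records = [
    'type=SYSCALL msg=audit(1286292457.123:456): syscall=2 key="fswatch"',
    'type=SYSCALL msg=audit(1286292457.200:457): syscall=59 key="other"',
    'type=SYSCALL msg=audit(1286292457.300:458): syscall=2 key=(null)',
    'type=CWD msg=audit(1286292457.123:456): cwd="/tmp"',
]
kept = list(filter_by_key(records, {"fswatch"}))
```

In a real plugin the lines would come from the plugin's input stream rather than a list.  Note that this keeps only the record
carrying the key; companion records of the same event (matched by serial number) would need to be grouped separately, and this
does nothing about what other plugins or the auditd log receive.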

(AIX has a cool feature that allows audit listeners to specify a "class" (subset) of audit event types that they want to receive.
That way any application interested in receiving audit events can define which events it wants without affecting which events other
listeners receive.  Sort of a built-in filtering mechanism.  It would be cool to see that in the Linux audit subsystem someday.)

4) Am I crazy for even thinking this might work okay?

Thanks to all for any feedback you're willing to provide.

- Andy