Decoding arguments passed to system calls

Wed Jul 4 14:23:51 UTC 2007

On Monday 02 July 2007 05:46:26 pm Darryl Dixon - Winterhouse Consulting 
wrote:
> Forgive me if this isn't the correct forum for this,

This is the correct forum for this.  :)

> Scenario:
> A very large filesystem with potentially millions of files in an ad-hoc,
> unordered directory structure. The requirement is to be able to audit any
> action on any file in this filesystem (moves, adds, changes, deletes,
> etc).

The new directory auditing capabilities should solve this problem. They will 
be in the 2.6.23 and RHEL5.1 kernels.

> Hypothetical solution:
> Clearly, scanning the filesystem with `find` and calling auditctl
> with the appropriate arguments to generate a watch on every single file is
> totally infeasible

That's where the directory auditing directives will help. You should only have 
to place an audit watch on the directory while using audit package 1.5.4 or 
higher. The kernel will do the rest.

> (find takes almost an hour to run, and in the meantime 
> stuff is potentially changing...). Instead, I envision it would make
> better sense to simply audit every call to write(), open(), rename(), etc,

You could audit open, rename, symlink, unlink, etc. But don't audit write or 
read: it will fill your disk and won't cover the cases where the file is 
memory mapped. Assume that opening with write flags means the file will be 
written to. You could do this and limit the auditing by device major and 
minor number so that you don't get too many records.
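
As a rough illustration of the "opening with write flags" test, here is a 
minimal Python sketch. It assumes you already have the a1 field of an open() 
SYSCALL record as a hex string, and it hard-codes the Linux access-mode bit 
values. The function name is made up for this example.

```python
# Linux values for the open(2) access-mode bits (hard-coded assumption).
O_WRONLY = 0o1
O_RDWR = 0o2
O_ACCMODE = 0o3


def opened_for_write(a1_hex):
    """Return True if the open() flags allow writing.

    a1_hex is the a1 field of an open() SYSCALL audit record, which is
    printed in hex without a 0x prefix. For openat() the flags would be
    in a2 instead; this sketch only handles plain open().
    """
    flags = int(a1_hex, 16)
    return (flags & O_ACCMODE) in (O_WRONLY, O_RDWR)


if __name__ == "__main__":
    # 0x241 is O_WRONLY|O_CREAT|O_TRUNC on x86 Linux.
    for sample in ("0", "2", "241"):
        print(sample, opened_for_write(sample))
```

This only looks at the access mode. It says nothing about whether a write 
actually happened, which is the assumption described above.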

> My problem is that this doesn't seem possible with the Linux Audit
> subsystem, as the arguments to the system calls are not decoded (eg, the
> audit records for write() include only an opaque filehandle and pointer to
> the written data, etc). 

They are decoded if you pass the -i option, but if the information is not 
collected, there is nothing to decode. In the case of write, the path is not 
collected. The path is collected at the open, and the return code is the 
descriptor number if it's non-negative.
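
To make that concrete, here is a small Python sketch of the bookkeeping a 
user space tool would need: remember the path from each successful open, 
keyed by pid and descriptor, and look it up when a later record mentions 
that descriptor. The sample records are simplified and invented for 
illustration, the syscall numbers are the x86_64 ones, and all function 
names are hypothetical. Real code would more likely sit on top of auparse 
and would also have to handle dup, fork, relative paths and so on.

```python
import re

# x86_64 syscall numbers (assumption; other architectures differ).
SYS_WRITE, SYS_OPEN, SYS_CLOSE = 1, 2, 3

FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')
EVENT_ID = re.compile(r'msg=audit\(([^)]*)\)')


def parse(line):
    """Return (event_id, fields) for one raw audit record line."""
    event = EVENT_ID.search(line).group(1)
    fields = {k: v.strip('"') for k, v in FIELD.findall(line)}
    return event, fields


def resolve_writes(lines):
    """Return a list of (pid, fd, path) for every write() record seen."""
    fd_table = {}  # (pid, fd) -> path learned at open()
    pending = {}   # event id -> (pid, fd) of a successful open()
    writes = []
    for line in lines:
        event, f = parse(line)
        if f.get("type") == "SYSCALL":
            nr, pid = int(f["syscall"]), int(f["pid"])
            if nr == SYS_OPEN and f.get("success") == "yes":
                # exit is the new descriptor, printed in decimal.
                pending[event] = (pid, int(f["exit"]))
            elif nr == SYS_WRITE:
                fd = int(f["a0"], 16)  # a0..a3 are printed in hex
                writes.append((pid, fd, fd_table.get((pid, fd), "?")))
            elif nr == SYS_CLOSE:
                fd_table.pop((pid, int(f["a0"], 16)), None)
        elif f.get("type") == "PATH" and event in pending:
            # PATH records share the event id of their SYSCALL record.
            # The last PATH record of the event wins in this sketch.
            fd_table[pending[event]] = f["name"]
    return writes


# Invented, simplified sample records.
SAMPLE = [
    'type=SYSCALL msg=audit(1183558231.100:7): arch=c000003e syscall=2 '
    'success=yes exit=3 a0=7ffd1 a1=241 a2=1b6 a3=0 pid=4242 comm="demo"',
    'type=PATH msg=audit(1183558231.100:7): item=0 '
    'name="/data/report.txt" inode=99 dev=08:01',
    'type=SYSCALL msg=audit(1183558231.200:8): arch=c000003e syscall=1 '
    'success=yes exit=5 a0=3 a1=7ffd2 a2=5 a3=0 pid=4242 comm="demo"',
    'type=SYSCALL msg=audit(1183558231.300:9): arch=c000003e syscall=3 '
    'success=yes exit=0 a0=3 a1=0 a2=0 a3=0 pid=4242 comm="demo"',
    'type=SYSCALL msg=audit(1183558231.400:10): arch=c000003e syscall=1 '
    'success=no exit=-9 a0=3 a1=7ffd2 a2=5 a3=0 pid=4242 comm="demo"',
]

if __name__ == "__main__":
    for pid, fd, path in resolve_writes(SAMPLE):
        print(pid, fd, path)
```

The write records are in the sample only to show the lookup. The mapping is 
the same if you audit only open and close and stop there.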

> 1) Am I totally wrong and there's a method of getting this information
> already that I have overlooked?

Which kernel are you using? I think the directory auditing will solve your 
problem. But note what I was saying about reads & writes and mmap.

> 2) Knowing very little about the auditing subsystem, and the kernel
> internals in general I envision that decoding the filehandle into a path
> is something that would need to be done in the kernel, and is impossible
> from userland. Is this the case?

It's possible, but you would have to do it yourself. I'd suggest using the 
auparse library.

> 3) How much work do you all estimate that it would actually take to be
> able to generate this information? 

Probably a lot. The audit system's view of the world is similar to strace 
except from the kernel's PoV. No data about syscalls is retained between 
syscalls.

> Is it even possible without a major architectural overhaul of the audit
> subsystem? 

You can certainly do it from user space.

-Steve