auditd performance

Steve Grubb sgrubb at redhat.com
Sun Dec 27 15:30:56 UTC 2015


Hello,

I've been looking into auditd's performance. The first thing I did was to 
measure the rate at which it could log events with various settings. To do 
this test, I had two windows open: one to start auditd from the command line, 
without systemd interference, and one to run a script as follows:

auditctl -D                                    # delete any existing rules
auditctl -b 16440                              # enlarge the kernel backlog
auditctl -f 0                                  # failure mode 0 = silent
auditctl --backlog_wait_time 100               # shorten the wait on a full backlog
auditctl -a always,exit -F arch=x86_64 -S all  # audit every syscall to flood events
sleep 3
service auditd stop
auditctl -D                                    # clean up the rules afterwards
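
For the first window, one way to keep systemd out of the picture is to run 
the daemon in the foreground (the -f flag from auditd(8)); for example:

/sbin/auditd -f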

The results of various settings are as follows:

FLUSH          FREQ (records)     Events/sec
------------------------------------------------------
SYNC                 -                   45
DATA                 -                  105
INCREMENTAL         20                  400
                    50                 1000
                   100                 1815
                   200                 3080
                   400                 5800
                  1000                10100
                  2000                15275
                  4000                18650
                  8000                24075
NONE                 -                38300
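
For reference, FLUSH and FREQ here correspond to the flush and freq settings 
in /etc/audit/auditd.conf; a minimal excerpt for one of the middle rows of 
the table (value spellings as documented in auditd.conf(5)):

# /etc/audit/auditd.conf (excerpt)
flush = INCREMENTAL
freq = 100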


In looking further, I found a lot of lock contention and scheduling issues 
caused by pthreads. I mapped out the paths in the code to get a picture of 
where events come from and where they go:

http://people.redhat.com/sgrubb/audit/auditd-data-flow.pdf

The blue boxes are where events come from, and the red boxes are where we 
have contention. The gray boxes are the path taken by the logging thread, 
and the white boxes are the main thread.

What I found is that if I make enqueue_event call write_to_log directly, it 
doubles the throughput of the audit daemon. In other words, going from 
multi-threaded to single-threaded makes a huge difference. The audit daemon 
has been multi-threaded since the very first public release back in 2004, 
before I started working on it.
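
To make that concrete, here is a rough sketch of the single-threaded idea. 
The names enqueue_event and write_to_log are the ones mentioned above, but 
the types, signatures, and the queueing details described in the comment are 
simplified stand-ins, not auditd's actual code:

#include <stdio.h>

/* Simplified stand-in for an audit record. */
struct auditd_event { const char *text; };

/* Stand-in for write_to_log(): format one record and write it out. */
static void write_to_log(const struct auditd_event *e)
{
        printf("%s\n", e->text);
}

/* Previously enqueue_event() put the event on a pthread-protected queue
 * that a separate logging thread drained; calling write_to_log() directly
 * removes the lock traffic and the thread hand-off on every event. */
static void enqueue_event(const struct auditd_event *e)
{
        write_to_log(e);
}

int main(void)
{
        struct auditd_event e = { "type=SYSCALL msg=audit(...): ..." };
        enqueue_event(&e);
        return 0;
}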

So, what I think I am going to do is make it single-threaded: change the 
signal handlers to just set a variable so that the main thread picks it up 
and serializes the handling with other events, move the size check and log 
rotation code, and remove the pthreads code.
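
The signal handler part of that is the usual pattern of only setting a flag 
in the handler and doing the real work from the main loop; a minimal sketch 
(the flag name and the placeholder loop are illustrative, not auditd's code):

#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t hup_pending;

/* Async-signal-safe handler: record that the signal happened and return. */
static void hup_handler(int sig)
{
        (void)sig;
        hup_pending = 1;
}

int main(void)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = hup_handler;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGHUP, &sa, NULL);

        for (;;) {
                /* ... get and log the next audit event ... */
                if (hup_pending) {
                        hup_pending = 0;
                        /* reload config / rotate logs here, serialized
                         * with normal event processing */
                }
                sleep(1);       /* placeholder for the real event wait */
        }
}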

That leaves an issue with dispatching events to other programs. What I have 
been thinking about is perhaps using libevfibers to manage switching between 
logging and dispatching.

One other tidbit that I found during testing: if I generate so many events 
that the kernel queue overflows, the default backlog_wait_time setting makes 
the system unusable. It acts like it's live-locked. So, I would recommend 
that the default setting in the kernel be changed to something more livable, 
and that anyone concerned about this explicitly set the value to something 
low.
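
For anyone who wants to set it explicitly, the same --backlog_wait_time flag 
used in the test script above can go in the rules file or be set at runtime 
(the value here is just an example):

# In /etc/audit/audit.rules (or a file under /etc/audit/rules.d/):
--backlog_wait_time 100

# Or at runtime:
auditctl --backlog_wait_time 100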

-Steve
