I’ve seen the following situation occur on 2 machines now for a total of 3 incidents:
· Audisp-remote runs on 5 separate servers. The problem happens on two of them, which are configured the same as the other 3.
· Audisp-remote runs normally on the problem servers for days to weeks at a time without problems.
· For an unidentified reason (nothing that I can find in any system log) audisp-remote stops sending messages to the central log server.
· Some hours or days later (depending on audit event activity), audisp-remote consumes all system memory and swap space. Because I have directory tree watches on my web content, this usually happens when the web content is being regenerated from scratch by our build server. The memory consumption happens very rapidly.
· One server is configured with 8GB of RAM and 2GB of swap; the second server has 12GB of RAM and 2GB of swap.
· The system becomes completely unresponsive until some critical need for memory arises and the OOM Killer kicks in, reaping enough tasks to let me get in and shut down auditd.
· At this point the system returns to normal, and if I restart auditd it resumes normal operation.
Here is a ps aux taken when it happened today on the 12GB machine:
USER   PID %CPU %MEM      VSZ      RSS TTY STAT  START  TIME COMMAND
root  1106  0.0  0.0        0        0 ?   S<    Oct12  0:36 [kauditd]
root  4768  0.1  0.0    92880      500 ?   S<sl  Oct17 26:22 auditd
root  4770  0.2  0.1   212984    12984 ?   S<sl  Oct17 31:49 /sbin/audispd
root  4771  0.0 96.7 28631936 11899072 ?   S<    Oct17  7:52 /sbin/audisp-remote
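To put the ps figures in perspective, here is a rough Python sketch that converts the VSZ and RSS columns (which ps reports in KiB) to GiB. The helper name parse_ps_line is made up for illustration, and the input lines are copied from the output above.

```python
# Convert the VSZ/RSS columns of the ps aux lines quoted above from KiB to GiB.
# parse_ps_line is an illustrative helper, not part of any audit tooling.
PS_LINES = """\
root 4768 0.1 0.0 92880 500 ? S<sl Oct17 26:22 auditd
root 4770 0.2 0.1 212984 12984 ? S<sl Oct17 31:49 /sbin/audispd
root 4771 0.0 96.7 28631936 11899072 ? S< Oct17 7:52 /sbin/audisp-remote
"""

KIB_PER_GIB = 1024 * 1024


def parse_ps_line(line):
    """Split one ps aux line into the fields of interest."""
    # ps aux has 11 columns; the last one (COMMAND) may contain spaces.
    fields = line.split(None, 10)
    return {
        "pid": int(fields[1]),
        "mem_pct": float(fields[3]),
        "vsz_gib": int(fields[4]) / KIB_PER_GIB,
        "rss_gib": int(fields[5]) / KIB_PER_GIB,
        "command": fields[10],
    }


rows = [parse_ps_line(line) for line in PS_LINES.splitlines()]
for row in rows:
    print("%-20s pid=%d rss=%.2f GiB vsz=%.2f GiB"
          % (row["command"], row["pid"], row["rss_gib"], row["vsz_gib"]))
```

For audisp-remote this works out to roughly 11.35 GiB resident and 27.31 GiB virtual, so the virtual size alone is larger than the RAM and swap on that machine combined.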
Priorities for each audit task are:
All machines are fully current on maintenance. Running RedHat EL 5.5 x86_64 with the following audit package set:
All that being said, I have the following questions:
· Has anyone seen this, and if so, what workarounds or fixes are available?
· What additional data should I collect that may assist in identifying the root cause of the problem? Since it can take days for this to manifest itself it seems like traces are out of the question, but perhaps there are other collection tools that can be used.
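One low-overhead option I am considering is a small sampler that records the resident memory of audisp-remote at intervals, so that the start of the growth can be lined up with the audit logs afterwards. The following is only a sketch: it assumes a Python 3 interpreter is available on the box (the stock EL 5 Python is much older), it reads the standard Name and VmRSS fields from /proc/<pid>/status, and the function names, limit and interval are all made up for illustration.

```python
import os
import time


def parse_status(status_text):
    """Return (name, rss_kib) from the text of /proc/<pid>/status."""
    name, rss_kib = None, None
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS":
            rss_kib = int(value.split()[0])  # value looks like "11899072 kB"
    return name, rss_kib  # rss_kib stays None for kernel threads


def sample(process_name):
    """Return {pid: rss_kib} for every running process with this name."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/status" % entry) as handle:
                name, rss_kib = parse_status(handle.read())
        except (IOError, OSError):
            continue  # process exited between listdir() and open()
        if name == process_name and rss_kib is not None:
            result[int(entry)] = rss_kib
    return result


def watch(process_name, limit_kib, interval_s=60, samples=None):
    """Print one timestamped line per matching process every interval_s seconds."""
    taken = 0
    while samples is None or taken < samples:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for pid, rss_kib in sorted(sample(process_name).items()):
            flag = " OVER LIMIT" if rss_kib > limit_kib else ""
            print("%s pid=%d rss_kib=%d%s" % (stamp, pid, rss_kib, flag))
        taken += 1
        time.sleep(interval_s)
```

Something like watch("audisp-remote", limit_kib=1024 * 1024) with its output redirected to a file would give a timeline of memory use. The 1 GiB limit is an arbitrary placeholder, not a recommended value.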
· Are there any program options or configuration options that can be used to debug this? The man pages seem to be a bit stale in this distribution.
· Does anyone have any other ideas on what I might do to get to the bottom of this?
I also have a separate issue that I’m curious about. Under RedHat EL 5.5 there don’t seem to be any limitations on the support for audisp-remote, but I just noticed that the release notes for RedHat EL 6 Beta flag this component as a Technology Preview. Does anyone know the reason for the change in status? I was planning to use this as part of my PCI-DSS compliance efforts next year, but this change may make that choice problematic.
Attached please find my current auditd.conf, audispd.conf, audisp-remote.conf and au-remote.conf files.
Beyond this query I also plan to open a support incident with RedHat, but I thought that by using feedback from this group I might be in a better position to provide support with useful information to aid in problem diagnosis.
Please let me know anything else that may help to get to the bottom of this.
Thanks in advance!