libaudit large stack requirement in audit_send()

Steve Grubb sgrubb at redhat.com
Mon Apr 29 16:48:48 UTC 2013


On Monday, April 29, 2013 11:15:16 AM Luciano Chavez wrote:
> Hello,
> 
> I am working with a development team developing a J2EE application. They
> reported a problem with a crash in audit_send(). The crash occurred in a
> ppc64 architecture environment early on in the invocation to audit send.
> 
> The crash occurs in this instruction which is establishing the size of
> the local stack:
> 
> => 0xfff73237994 <audit_send+52>:	stdu    r1,-27232(r1)
> 
> I found one large struct defined to a local variable
> 
> (gdb) print sizeof(struct audit_message)
> $4 = 8988
> 
> but you will note that it asks for much more than that and after looking
> at the audit_send() routine, it calls a function called check_ack()

At the time it calls check_ack, its done with that variable and the compiler 
could have retired that area of the stack for reuse by something else.

> which appears to be inlined 

It doesn't have the inline keyword.


> and it contains two even larger definitions
> on the stack for the following structure:
> 
> struct audit_reply
> 
> (gdb) print sizeof(struct audit_reply)
> $3 = 9016

They should be the same size. One is in an if statement that only executes on 
a netlink message error. That should not be a normal case.


> So, the combination of the three is what requires almost 26.5K of local
> stack usage in this frame alone.
> 
> Is there a requirement for libaudit to have the structs on the stack
> versus allocated from heap? Is so, is this requirement documented
> somewhere?

The audit daemon has to be reliable, It was designed to use as few malloc's as 
possible so that nothing unexpected can happen, like memory fragmentation or 
stopping because an allocation failed.

 
> To be fair, the Java application has some heavy stack usage as it is
> since it is deployed in a web application server and there is a JNI
> function that is somewhere in the call stack as well. However, the stack
> usage in the audit_send() function seems ... excessive.

Well, if you are within 26k of crashing due to being out of stack, then you 
might just be shifting the problem to something else cause you're about out of 
stack. :-)  To me, it sounds a bit like the compiler could be more aggressive 
on reclaiming the unused stack area once it sees stack variables expire.


> Originally the thread stacksize size was set to 256K and that did not
> help but once we raised it to 1MB it did but I think that is probably
> more than we really need.
> 
> I have looked at the source for the audit 2.2.3 release from March and
> don't see a difference in how the structs are allocated. So once again,
> if there is not a requirement that the structs be on the stack, should
> they not be allocated off the heap?

Well, except that I am trying real hard to avoid malloc in all functions that 
are executed by auditd. You'll find it in other places, like library functions 
that are used by other apps or things that parse and analyze. They aren't as 
critical as auditd staying running.

I'm open to ideas. I don't know if putting req.nlh.nlmsg_len into a tmp 
variable would help the compiler realise it can reclaim the area occupied by 
req or not. I might be able to get rid of rep2 by reusing the area of rep 
after coping out the important part of it. But this part of the code hasn't 
significantly changed in 7-8 years.

-Steve




More information about the Linux-audit mailing list