CONFIG_DEBUG_STACKOVERFLOW hurts

Sat Sep 15 13:01:37 UTC 2007

On Fri, 2007-09-14 at 22:07 -0500, Eric Sandeen wrote:
> Gilboa Davara wrote:
> 
> > Sorry for butting in... but isn't disabling STACKOVERFLOW the wrong
> > answer to this problem?
> > Does anyone see any reason why both sprint_symbol and __print_symbol
> > shouldn't use dynamically allocated buffers instead of wasting stack
> > space? *
> > 
> > - Gilboa
> > * If performance is an issue, memory can be statically allocated per CPU
> > with additional locking in dump_trace. 
> 
> Well, I agree that the dump_stack path should be lightened up
> stack-wise; and I don't think performance should be an issue (dump_stack
> is used when something has gone wrong, probably not going to be
> performance critical?)  Locked global buffers may be just fine (we did
> this for xfs error messages, I remember...)

I chose the easy way out and generated a simple, allocate-on-demand patch.
http://lkml.org/lkml/2007/9/15/69
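
For anyone who doesn't want to follow the link, the idea is roughly the
following. This is only a userspace sketch of the allocate-on-demand
approach, not the patch itself: print_symbol_heap() and SYM_BUF_LEN are
made-up names, and malloc()/free() stand in for whatever allocator the
kernel code would use.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical buffer size; the real symbol buffer size may differ. */
#define SYM_BUF_LEN 256

/*
 * Format "name+offset/size" into a heap buffer instead of a large
 * on-stack array, then copy the decorated result into 'out'.
 * Returns the number of characters snprintf() wanted to write.
 */
static int print_symbol_heap(char *out, size_t out_len, const char *name,
                             unsigned long offset, unsigned long size)
{
    char *buf = malloc(SYM_BUF_LEN);
    int n;

    /* Allocation can fail; degrade gracefully instead of crashing. */
    if (buf == NULL)
        return snprintf(out, out_len, "[<out of memory>]");

    snprintf(buf, SYM_BUF_LEN, "%s+%#lx/%#lx", name, offset, size);
    n = snprintf(out, out_len, "[<%s>]", buf);

    free(buf);
    return n;
}
```

The cost is one allocation per symbol and a failure path that has to be
handled, in exchange for keeping the big buffer off the stack.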

> 
> I was looking at this from a slightly different angle, which is that the
> stack overflow warning is largely pointless - no matter how much you
> lighten up the dump_stack path, it will add something to the stack depth
> of the current process, effectively *reducing* the available stack for
> all processes, and increasing the risk that you'll actually overflow.
> (if you take an interrupt towards the end of the stack, the warning will
> go off and use the last bit - so you can't count on that stack space to
> be available).

While that is true:
A. If adding ~40 bytes to the kernel's stack usage is critical, we're
already past the all-doom-and-gloom point.
B. We can always calculate the available stack size, and call dump_stack
only if stack_remain is bigger than, say, 80 bytes.
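
Something along these lines is what I have in mind for B. It is a
userspace model, not kernel code: it assumes a downward-growing stack
with thread_info at the bottom of a THREAD_SIZE-aligned region, and
THREAD_INFO_SIZE, stack_remain() and can_dump_stack() are illustrative
names and values of mine. Only the 80-byte threshold comes from the
text above.

```c
#include <stddef.h>
#include <stdint.h>

#define THREAD_SIZE      4096u  /* 4k stacks, power of two */
#define THREAD_INFO_SIZE 52u    /* hypothetical sizeof(struct thread_info) */
#define DUMP_STACK_MIN   80u    /* the "say, 80 bytes" threshold */

/*
 * Bytes left between the stack pointer and the thread_info structure
 * that sits at the bottom of the stack region.
 */
static size_t stack_remain(uintptr_t sp)
{
    uintptr_t offset = sp & (THREAD_SIZE - 1);

    /* Already inside (or below) thread_info: nothing left. */
    if (offset <= THREAD_INFO_SIZE)
        return 0;
    return (size_t)(offset - THREAD_INFO_SIZE);
}

/* Only attempt the backtrace when there is room for it. */
static int can_dump_stack(uintptr_t sp)
{
    return stack_remain(sp) > DUMP_STACK_MIN;
}
```

The overflow warning itself would still be printed either way; only the
dump_stack call would be skipped when the remaining space is too small.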

> 
> And, if you overflow the stack, you'll almost certainly get an oops and
> a backtrace anyway - usually thread_info gets overwritten and you BUG
> because it looks like you sleep in an interrupt, or somesuch.  

Yeah, but at least to me, as a developer, having a warning before
all hell breaks loose is a good thing (tm).

Though, one can always argue that people who play around with kernel
development can build their own kernel with STACKOVERFLOW enabled.

> So,
> what's the point of the IRQ stack-depth check, again?  Especially with
> 4k stacks and separate IRQ stacks?  And the more deterministic
> max-stack-depth excursion checker (CONFIG_DEBUG_STACK_USAGE) as well...
> 
> Finally, the patch I sent upstream would clearly show on an oops whether
> or not the stack was currently overflowing, or whether the stack had
> ever overflowed prior to the oops.  Seemed useful to me.

Just to satisfy my curiosity, can you post a link to the patch?

- Gilboa