[dm-devel] [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

Wed Sep 5 03:57:59 UTC 2012

On Tue, Sep 04, 2012 at 11:26:33AM -0700, Tejun Heo wrote:
> Hello,
> 
> On Tue, Sep 04, 2012 at 09:54:23AM -0400, Vivek Goyal wrote:
> > > Given that we are working around stack depth issues in the
> > > filesystems already in several places, and now it seems like there's
> > > a reason to work around it in the block layers as well, shouldn't we
> > > simply increase the default stack size rather than introduce
> > > complexity and performance regressions to try and work around not
> > > having enough stack?
> > 
> > Dave,
> > 
> > In this particular instance, we really don't have any bug reports of
> > stack overflowing. Just discussing what will happen if we make 
> > generic_make_request() recursive again.
> 
> I think there was one and that's why we added the bio_list thing.

There was more than one - it was a regular enough to be considered a
feature... :/

> > > I mean, we can deal with it like the ia32 4k stack issue was dealt
> > > with (i.e. ignore those stupid XFS people, that's an XFS bug), or
> > > we can face the reality that storage stacks have become so complex
> > > that 8k is no longer a big enough stack for a modern system....
> > 
> > So first question will be, what's the right stack size? If we make
> > generic_make_request() recursive, then at some storage stack depth we will
> > overflow stack anyway (if we have created too deep a stack). Hence
> > keeping current logic kind of makes sense as in theory we can support
> > arbitrary depth of storage stack.
> 
> But, yeah, this can't be solved by enlarging the stack size.  The
> upper limit is unbound.

Sure, but recursion issue is isolated to the block layer.

If we can still submit IO directly through the block layer without
pushing it off to a work queue, then the overall stack usage problem
still exists. But if the block layer always pushes the IO off into
another workqueue to avoid stack overflows, then the context
switches are going to cause significant performance regressions for
high IOPS workloads.  I don't really like either situation.

So while you are discussing stack issues, think a little about the
bigger picture outside of the immediate issue at hand - a better
solution for everyone might pop up....

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com