[dm-devel] dm-mpath: do not change SCSI device handler

Mike Snitzer snitzer at redhat.com
Thu Apr 4 14:20:48 UTC 2013


On Thu, Apr 04 2013 at  9:36am -0400,
Mikulas Patocka <mpatocka at redhat.com> wrote:

> 
> 
> On Thu, 4 Apr 2013, Mike Snitzer wrote:
> 
> > On Thu, Apr 04 2013 at  8:55am -0400,
> > Mikulas Patocka <mpatocka at redhat.com> wrote:
> > 
> > > 
> > > 
> > > On Thu, 4 Apr 2013, Mike Snitzer wrote:
> > > 
> > > > I'll take a look at fixing this by deferring the scsi_dh switch until
> > > > resume.  This fix would assume multipath-tools is _not_ doing a noflush
> > > > suspend/resume when it is switching the scsi_dh.
> > > > 
> > > > Mike
> > > 
> > > This won't work because scsi_dh_attach allocates memory and you can't 
> > > allocate memory when something is suspended.
> > 
> > Ah yeah, scsi_dh->attach allocates memory for scsi_dh_data.  But
> > couldn't those scsi_dh_* attach allocations be switched from GFP_KERNEL
> > to GFP_NOIO?
> 
> Yes and no.
> 
> GFP_NOIO allocations don't issue any IO, so they are more likely to 
> fail: if memory is full of user space pages or dirty file pages and 
> there are no clean cache pages, a GFP_NOIO allocation can't make any 
> progress and fails. A GFP_KERNEL allocation could swap out some pages 
> and succeed.
> 
> On the other hand, kernel developers use GFP_NOIO allocations and assume 
> that they don't fail. And they don't fail most of the time. Although I 
> have seen cases where it failed and caused trouble - when allocating 
> inodes with GFP_NOIO under high memory stress. (the correct solution would 
> be to allocate the inode with GFP_KERNEL earlier, when we are not holding 
> any filesystem lock).
> 
> From the point of formal correctness, relying on GFP_NOIO is wrong, but if 
> the allocated space is small and it happens infrequently (only on 
> activation), you can likely get away with it without being caught.

I'm suggesting that switching the scsi_dh is not something that will be
done on a system that is suffering from serious memory contention.

But I think we need to get back to analyzing the scsi_dh change you
mentioned before with tracking counts, etc.
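
For reference, the alternative raised in the quoted text (allocate early
with GFP_KERNEL, then only consume the buffer while suspended) can be
sketched as a small userspace model. This is a hypothetical illustration,
not code from scsi_dh or dm-mpath: struct mpath_dev, struct dh_data,
dh_prepare() and dh_switch_on_resume() are invented names, and malloc()
merely stands in for a kernel allocation that is allowed to do IO.

```c
/* Userspace model only; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct dh_data {
    int handler_id;              /* which (made-up) handler is attached */
};

struct mpath_dev {
    bool suspended;              /* models a suspended dm device */
    struct dh_data *prealloc;    /* buffer allocated before suspend */
    struct dh_data *attached;    /* handler data currently in use */
};

/*
 * Step 1: allocate while IO is still possible (the GFP_KERNEL analogue).
 * Refuses to allocate once the device is suspended.
 */
static int dh_prepare(struct mpath_dev *d)
{
    if (d->suspended)
        return -1;               /* allocating here would need NOIO semantics */
    if (d->prealloc)
        return 0;                /* already prepared */
    d->prealloc = malloc(sizeof(*d->prealloc));
    return d->prealloc ? 0 : -1;
}

/*
 * Step 2: on resume, switch handlers using only the preallocated
 * buffer. No allocation happens on this path.
 */
static int dh_switch_on_resume(struct mpath_dev *d, int handler_id)
{
    if (!d->prealloc)
        return -1;               /* nothing was prepared; cannot switch */
    free(d->attached);           /* free(NULL) is a no-op */
    d->attached = d->prealloc;
    d->prealloc = NULL;
    d->attached->handler_id = handler_id;
    d->suspended = false;
    return 0;
}
```

In this model the switch fails cleanly if nothing was prepared ahead of
time, rather than attempting an allocation while suspended. Whether the
real attach path can be split this way is exactly the open question.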



