[Cluster-devel] [PATCH] GFS2: directly write blocks past i_size

Benjamin Marzinski bmarzins at redhat.com
Fri Mar 18 02:53:22 UTC 2011


> Should this be WRITE_SYNC or WRITE_SYNC_PLUG I wonder?
> 

I tried these, but they didn't cause any improvement.  That doesn't seem
terribly surprising, since they both set NOIDLE, which tells the cfq
scheduler not to wait for more IO from this process.  I even tried
sending all the buffers except the last with WRITE_ODIRECT_PLUG, and
then the last buffer from each page with WRITE_SYNC, which seems like it
should work better, but in all the tests I did, it didn't appear to make
any difference.  I'm going to come back to this code after the release,
and see if I can find any way to make back some of the lost performance. 

As it is right now, fallocates that grow i_size are doing better than
double the speed of dd. fallocates that go past i_size are an order of
magnitude slower than dd.  But at least they work correctly.
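
To make the two cases concrete, here is a minimal userspace sketch of the
size rule that the fallocate_chunk() hunk quoted below applies: i_size only
moves when the zeroed range ends beyond it and FALLOC_FL_KEEP_SIZE is not
set.  The helper names and SKETCH_KEEP_SIZE are hypothetical stand-ins for
illustration, and needs_direct_write() is an assumption about when blocks
have to be written directly (range past i_size with the size kept), not a
copy of the patch.

```c
#include <stdint.h>

/* Local stand-in for FALLOC_FL_KEEP_SIZE (0x01 in <linux/falloc.h>). */
#define SKETCH_KEEP_SIZE 0x01

/*
 * Hypothetical helper: file size after zeroing the range that ends at
 * offset + to, mirroring the check in the quoted fallocate_chunk() hunk.
 */
int64_t size_after_chunk(int64_t i_size, int64_t offset, int64_t to, int mode)
{
	if (offset + to > i_size && !(mode & SKETCH_KEEP_SIZE))
		return offset + to;	/* the fallocate grows i_size */
	return i_size;			/* size is left alone */
}

/*
 * Hypothetical helper (assumption): nonzero when the range ends past
 * i_size and i_size will not be updated, which is the slow case
 * discussed above.
 */
int needs_direct_write(int64_t i_size, int64_t offset, int64_t to, int mode)
{
	return (mode & SKETCH_KEEP_SIZE) && offset + to > i_size;
}
```

With a 4096-byte file and an 8192-byte range starting at offset 0, the
first helper returns 8192 when mode is 0 and 4096 when SKETCH_KEEP_SIZE is
set; only the second call is flagged by needs_direct_write().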

> > +		}
> > +		offset += blksize;
> > +		bh = bh->b_this_page;
> > +	}
> > +	if (!waiting) {
> > +		waiting = 1;
> > +		goto second_pass;
> > +	}
> I think the code might be a bit cleaner if it was just written as two
> loops, one after the other since most of the loop content seems to be
> different according to whether "waiting" is set or not.
> 
> Otherwise I think this is a good solution,
> 
> Steve.
> 
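
For illustration, the two-loop shape suggested above could look roughly
like the sketch below.  It is a userspace model only: struct sketch_buf and
write_range_two_loops() are hypothetical stand-ins for the buffer_head walk,
the "submitted" and "io_error" fields simulate starting and completing I/O,
and the loop bounds are an assumption about how the range is walked, not the
patch itself.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a buffer_head; not the kernel struct. */
struct sketch_buf {
	unsigned offset;	/* byte offset of this block within the page */
	int submitted;		/* set once the write has been started */
	int io_error;		/* set by the simulated I/O layer on failure */
};

/* Returns 0 on success, -EIO if any buffer in [from, to) failed. */
int write_range_two_loops(struct sketch_buf *bufs, size_t nbufs,
			  unsigned from, unsigned to)
{
	size_t i;

	/* Loop 1: start I/O on every buffer in the range, without waiting. */
	for (i = 0; i < nbufs && bufs[i].offset < to; i++) {
		if (bufs[i].offset >= from)
			bufs[i].submitted = 1;
	}

	/* Loop 2: revisit the same buffers, wait, and report any error. */
	for (i = 0; i < nbufs && bufs[i].offset < to; i++) {
		if (bufs[i].offset >= from && bufs[i].io_error)
			return -EIO;
	}

	return 0;
}
```

Compared with a single loop plus a "waiting" flag and a goto, each loop
here has one job, so the submit path and the wait path can be read
separately.
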
> > +	return 0;
> >  }
> >  
> >  static int needs_empty_write(sector_t block, struct inode *inode)
> > @@ -643,7 +680,8 @@ static int needs_empty_write(sector_t bl
> >  	return !buffer_mapped(&bh_map);
> >  }
> >  
> > -static int write_empty_blocks(struct page *page, unsigned from, unsigned to)
> > +static int write_empty_blocks(struct page *page, unsigned from, unsigned to,
> > +			      int mode)
> >  {
> >  	struct inode *inode = page->mapping->host;
> >  	unsigned start, end, next, blksize;
> > @@ -668,7 +706,9 @@ static int write_empty_blocks(struct pag
> >  							  gfs2_block_map);
> >  				if (unlikely(ret))
> >  					return ret;
> > -				empty_write_end(page, start, end);
> > +				ret = empty_write_end(page, start, end, mode);
> > +				if (unlikely(ret))
> > +					return ret;
> >  				end = 0;
> >  			}
> >  			start = next;
> > @@ -682,7 +722,9 @@ static int write_empty_blocks(struct pag
> >  		ret = __block_write_begin(page, start, end - start, gfs2_block_map);
> >  		if (unlikely(ret))
> >  			return ret;
> > -		empty_write_end(page, start, end);
> > +		ret = empty_write_end(page, start, end, mode);
> > +		if (unlikely(ret))
> > +			return ret;
> >  	}
> >  
> >  	return 0;
> > @@ -731,7 +773,7 @@ static int fallocate_chunk(struct inode 
> >  
> >  		if (curr == end)
> >  			to = end_offset;
> > -		error = write_empty_blocks(page, from, to);
> > +		error = write_empty_blocks(page, from, to, mode);
> >  		if (!error && offset + to > inode->i_size &&
> >  		    !(mode & FALLOC_FL_KEEP_SIZE)) {
> >  			i_size_write(inode, offset + to);
> > 
> 



