
[Fwd: Re: [Cluster-devel] What's the issue with read-ahead ?]



Sorry, sent to the wrong list.... Wendy

--- Begin Message ---
Mathieu Avila wrote:
> I'm sure it is a bad solution. That's a very bad place to put this.
>
> Setting it in the diaper device with the value fetched from the
> original device, or the tunable value, is a better option. Something
> like this patch:
> http://www.redhat.com/archives/linux-cluster/2005-November/msg00016.html
> ("Readhead Issues using cluster-1.01.00")
Thanks for the pointer. I was not aware of that discussion. For directory reads, journaled-mode reads, Direct IO, etc., GFS readahead does work as expected; the problem is only with buffered reads. Apparently the diaper device implementation "accidentally" forces GFS buffered reads to bypass a significant piece of the VFS layer's performance machinery. The annoying part of this issue is that, of all the GFS I/O paths, buffered read is the one (other than Direct IO) that could really benefit from readahead.
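To make the suggestion above concrete: the idea in the referenced patch is simply to propagate the underlying device's readahead setting onto the stacked (diaper) device when it is created. A minimal sketch against a 2.6-era kernel, assuming the diaper code has both request queues in hand (the function name diaper_copy_readahead is made up for illustration, not taken from the actual patch):

    #include <linux/blkdev.h>

    /* Hedged sketch, not the actual patch: copy the real device's
     * readahead window onto the diaper device so the VFS readahead
     * code sees a sane value instead of zero. */
    static void diaper_copy_readahead(struct request_queue *diaper_q,
                                      struct request_queue *real_q)
    {
            diaper_q->backing_dev_info.ra_pages =
                    real_q->backing_dev_info.ra_pages;
    }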

For a while, GFS has been somewhat detached from upstream kernel development. With GFS2 now in the Linux mainline tree, hopefully this problem will disappear.

> I have read some documentation in the meantime.
> ( http://opengfs.sourceforge.net/showdoc.php?docpath=cvsmirror/opengfs/docs/ogfs-locking
> I think the principles are still up to date for GFS1. Please tell me
> if I'm wrong.)
I'll read this when I'm done with my current task ...
> From what I've understood, the inode glock covers the whole inode.
> There cannot be 2 nodes using the same file at the same time. If they
> try, the glock ping-pongs between the nodes. When a node releases the
> glock, it flushes all dirty pages associated with it. When it gets the
> glock back, it must invalidate all pages associated with this inode,
> so that it will read/write what was written by the last node that
> used the inode.

This is correct.
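In code terms, the protocol described above boils down to something like the following. This is a hedged sketch using generic 2.6-era VFS helpers, not actual GFS glock code; the function names are illustrative:

    #include <linux/fs.h>
    #include <linux/pagemap.h>

    /* Before handing the glock to another node: push this node's
     * dirty pages out to disk so the other node can read them. */
    static void demote_glock(struct inode *inode)
    {
            filemap_fdatawrite(inode->i_mapping);
            filemap_fdatawait(inode->i_mapping);
    }

    /* After reacquiring the glock: drop stale cached pages so the
     * next read sees whatever the other node wrote in the meantime. */
    static void promote_glock(struct inode *inode)
    {
            invalidate_inode_pages(inode->i_mapping);
    }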

"Range locking" (in the unix sense) is done only at the node level. Two
processes on two different nodes won't read/write the same file at the
same time, even if they have a "unix lock" on it. (That would be
the reason why using the same files/directories on 2 nodes is valid,
but not recommanded.)
This is not correct. GFS supports range locking based on the POSIX standard. Two processes can read/write the same file at the same time, as long as POSIX locks are held. However, there is a nontrivial performance hit if you overdo this.
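For readers unfamiliar with POSIX record locks: this is the ordinary fcntl() interface, so two writers on different nodes can safely share one GFS file provided each locks its byte range first. A minimal userspace sketch:

    #include <fcntl.h>
    #include <unistd.h>

    /* Hedged sketch: take a blocking POSIX write lock on a byte range
     * of a shared GFS file before writing to that range. */
    static int lock_range(int fd, off_t start, off_t len)
    {
            struct flock fl = {
                    .l_type   = F_WRLCK,
                    .l_whence = SEEK_SET,
                    .l_start  = start,
                    .l_len    = len,
            };
            return fcntl(fd, F_SETLKW, &fl);   /* blocks until granted */
    }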

> Is this still true? If so, doing read-ahead this way is perfectly
> valid in this case; there will be no coherency/corruption problem,
> will there?

No, you don't want readahead if you ping-pong the locks. Each lock transfer will invalidate all the pages associated with that file, which makes your readahead useless.
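One practical consequence for applications: if a file is known to bounce between nodes, it may be better to turn readahead off for that descriptor than to pay for pages that will only be invalidated. posix_fadvise() is the standard way to express this hint; whether GFS of this era honored it is an assumption:

    #define _XOPEN_SOURCE 600
    #include <fcntl.h>

    /* Hedged sketch: ask the kernel to skip readahead on this
     * descriptor. POSIX_FADV_RANDOM is standard; GFS honoring the
     * hint is an assumption, not something confirmed above. */
    static void disable_readahead(int fd)
    {
            posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
    }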
-- Wendy


--- End Message ---
