[Linux-cluster] io scheduler and gnbd

Sat Oct 21 00:46:15 UTC 2006

On Fri, Oct 20, 2006 at 04:20:37PM +0200, Markus Hochholdinger wrote:
> hi,
> 
> i've been successfully using gnbd as a single service for a long time. Now i 
> discovered a weird problem with the gnbd devices with kernel 2.6.18. I built 
> the gnbd.ko module out of the cvs tree.
> All works fine if you don't do too much on the gnbds. But if you stress test 
> the devices, the gnbds will hang, i.e. reads and writes hang. If you restart 
> the gnbd server, the client will continue to read and write until the next 
> hang.
> So i first checked my gnbd servers and tried versions 1.01 through 1.03 and 
> the latest cvs. But the problem is still there. From another gnbd client i had 
> no problem with any of these gnbd server versions (i was impressed you can mix 
> these versions). Also changing the kernel on the gnbd server didn't help.
> 
> So i was stuck with the gnbd client with kernel 2.6.18. I have to use this 
> kernel because of the new hardware. So i experimented a little and found out 
> that changing the default io scheduler for the gnbd devices on the client 
> makes the hanging writes and reads resume. The default scheduler was cfq and 
> with this i can easily reproduce this behavior. With the deadline scheduler 
> the hang doesn't occur.
> 
> So i read a little about io scheduling on linux. And my assumption is a gnbd 
> device shouldn't need any io scheduling, because the network has no seek 
> latency like a hard disk. The gnbd server gets requests from more than one 
> gnbd client, so scheduled io on the client would mix up the scheduling on 
> the server. And also the server does its own io scheduling when writing to 
> the real disk.
> So could i use the noop scheduler, or have i missed something?
> 
> Has anyone on the list more info about io scheduling and gnbd?
> 
If gnbd isn't working with the latest kernel, it's pretty definitely a bug. I'll
take a look and see if I can't reproduce it.  As far as IO scheduling goes,
if you can work around this with a different scheduler, that's great, but
depending on your IO patterns, gnbd should see a benefit from reordering
requests.  It is more efficient to send fewer, larger requests to the
server. There are a couple of reasons why, but the big one is that the gnbd
server does not reorder requests itself.
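
To make the "fewer, larger requests" point concrete, here is a rough sketch
of the kind of sorting and merging an IO scheduler can do on the client before
anything goes over the wire. It is an illustration only, not the kernel's
elevator code: the function name merge_requests and the
(start_sector, sector_count) tuples are made up for this example, and it only
merges requests that are exactly adjacent.

```python
def merge_requests(requests):
    """Sort (start_sector, sector_count) requests and merge adjacent ones."""
    merged = []
    for start, count in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This request begins exactly where the previous one ends,
            # so grow the previous request instead of sending a new one.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged


queued = [(64, 8), (0, 8), (8, 8), (72, 8), (16, 8)]
print(merge_requests(queued))  # [(0, 24), (64, 16)]
```

Five queued requests become two, which matters when every request costs a
full round trip to a server that handles them one at a time.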

Currently the gnbd server receives a request, performs the request, returns
the result, and then goes on to the next request. There is only one thread per
client per device. Obviously, you need to have the IO complete before you
return a read result. But because gnbd is pretending to be a block device,
when the server says that the data has been written out, the data must
actually be on disk. This means that the request must be synced to disk before
the server returns a write result and goes on to the next request. So the gnbd
server always has its requests complete to disk before it gets a new one, and
it cannot usefully reorder them.  It can reorder requests if they come in from
different clients, but I don't think that this gets you much.
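
In rough terms, the per-client loop described above behaves like the sketch
below. This is not the gnbd server source (which is not Python); the function
name serve_serially and the request tuples are hypothetical, and the file
descriptor is assumed to have been opened with O_SYNC so that a write only
returns once the data is on stable storage.

```python
import os


def serve_serially(fd, requests):
    """Handle requests one at a time, strictly in arrival order.

    Each request is ("read", offset, length) or ("write", offset, data).
    fd is assumed to be opened with os.O_SYNC.
    """
    replies = []
    for op, offset, arg in requests:
        if op == "write":
            # With O_SYNC this blocks until the data is on disk, so the
            # next request is not even looked at until then.
            written = os.pwrite(fd, arg, offset)
            replies.append(("write", offset, written))
        elif op == "read":
            replies.append(("read", offset, os.pread(fd, arg, offset)))
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return replies
```

Since only one request is ever in flight per client and device, there is
never a second request available to sort or merge against.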

Now that (I believe) you can do async IO to a device opened with the O_SYNC
flag from userspace, the gnbd server could be rewritten to work much more
effectively. Unfortunately, it probably won't happen anytime soon.
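
For what it's worth, the general shape of such a rewrite would be something
like the sketch below: keep several synced writes in flight and send each
reply as its write completes. Note that this uses a thread pool as a stand-in
for real async IO, purely to show the idea; the name serve_writes_overlapped
and its parameters are invented for the example, and none of this is existing
gnbd code.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def serve_writes_overlapped(fd, writes, workers=4):
    """Issue several synchronous writes at once, reply as each finishes.

    writes is a list of (offset, data) pairs. fd is assumed to be opened
    with os.O_SYNC, so a finished pwrite() still means "on disk".
    """
    replies = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(os.pwrite, fd, data, offset): offset
                   for offset, data in writes}
        for future in as_completed(futures):
            # Completion order may differ from submission order.
            replies.append((futures[future], future.result()))
    return replies
```

With more than one write outstanding, the layers below the server get a
chance to reorder them, while each reply still only goes out after its own
data has been synced.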

Thanks for the heads up, and if you wouldn't mind filing a bugzilla about this
at bugzilla.redhat.com, that would be helpful.

-Ben

> 
> -- 
> greetings
> 
> eMHa
