NFS Help! Terrible performance with sync, fast performance with async

Ryan Dimbleby ryan at rdsd.net
Mon Nov 20 14:40:10 UTC 2006


Hello.

I had the same issue and raised it with Red Hat Support. They pointed me to this article:

http://kbase.redhat.com/faq/FAQ_45_8122.shtm

Basically, RHEL 3 as a client is much slower than a RHEL 4 client.

You can try their recommendations here:

http://kbase.redhat.com/faq/FAQ_85_7492.shtm

However, as you will find, RHEL 3 does not support larger rsize and wsize values in the kernel, whereas RHEL 4 does.
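If you want to experiment with the transfer sizes anyway, a minimal Python sketch for generating the fstab entry could look like the following. The helper names, the server name, the export path, the mount point and the 32768 value are placeholders of mine, not values taken from the Red Hat articles; rsize, wsize, tcp/udp, hard and intr are standard NFS mount options, but whether a given client kernel honours the requested sizes depends on the release.

```python
def nfs_mount_options(rsize=32768, wsize=32768, transport="tcp"):
    """Build an NFS mount option string with explicit transfer sizes.

    The default sizes are illustrative assumptions, not recommendations.
    """
    if transport not in ("tcp", "udp"):
        raise ValueError("transport must be 'tcp' or 'udp'")
    return f"rw,hard,intr,{transport},rsize={rsize},wsize={wsize}"


def fstab_line(server, export, mountpoint, options):
    """Format one /etc/fstab entry for an NFS mount."""
    return f"{server}:{export} {mountpoint} nfs {options} 0 0"


if __name__ == "__main__":
    # Hypothetical server and paths, for illustration only.
    print(fstab_line("nfs1.example.com", "/export/data", "/mnt/data",
                     nfs_mount_options()))
```

After mounting, the sizes actually negotiated can be checked in /proc/mounts on the client, which is a quick way to see whether the kernel accepted the larger values.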

You are correct in your analysis of the sync / async behaviour. NFS acceleration technologies such as NetApp storage use the same technique of replying before the data has reached disk. They do, however, use specialist hardware to ensure the data makes it to disk and to avoid data loss or corruption. Of course this costs $$$, and I suspect it won't be an option for you.
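To get a feel for how much a commit on every operation costs, independent of NFS, here is a rough local sketch in Python. It is only an analogy, not how nfsd is implemented: it creates empty files either with no flushing at all or with an fsync of the file and of its parent directory after each create. The function name create_files is my own, and the sketch assumes a POSIX system where fsync on a directory descriptor is supported (as it is on Linux).

```python
import os
import tempfile
import time


def create_files(directory, count, durable):
    """Create `count` empty files in `directory`; return elapsed seconds.

    With durable=True, each creation is flushed to stable storage
    (fsync on the file, then on the directory) before the next one
    starts, loosely mimicking what a sync export must wait for
    before it can reply to the client.
    """
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:06d}")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        try:
            if durable:
                os.fsync(fd)  # flush the new file itself
        finally:
            os.close(fd)
        if durable:
            dir_fd = os.open(directory, os.O_RDONLY)
            try:
                os.fsync(dir_fd)  # flush the directory entry
            finally:
                os.close(dir_fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 200
    for durable in (False, True):
        with tempfile.TemporaryDirectory() as d:
            secs = create_files(d, n, durable)
            label = "fsync per create" if durable else "no fsync"
            print(f"{label}: {n / max(secs, 1e-9):.0f} creates/sec")
```

Run it on the server's ext3 volume rather than in /tmp if you want numbers that say anything about the export; on a controller with a battery-backed write cache the gap may be much smaller than on a plain disk.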

I recommend you try RHEL 4.

hth

-

ryan

---------- Original Message -----------
From: "Chris Wornell" <CWornell at peerless.com> 
To: <redhat-sysadmin-list at redhat.com> 
Sent: Sun, 19 Nov 2006 22:14:37 -0500 
Subject: RE: NFS Help! Terrible performance with sync, fast performance with async

> > Are you using tcp as transport?
>
> I’ve tried both UDP and TCP, with the same results on both. My understanding is that if you have only one switch between the devices, then UDP should be fine as long as there isn’t any packet loss.

> From what I can tell, it’s not so much the transport of data between the client and server as the commit time on the server. I think the steps are:
>
> · NFS server receives data
> · NFS server requests a commit of the data
> · Data gets handed off to the file system
> · File system writes the data
> · A response is sent back to the NFS server that the commit was successful
> · NFS server sends a response back to the client that it can send more data

> I’m trying to get more detail on this process, though; the above is just an educated guess. I even increased the number of nfsd processes (the daemons) from 8 to 32, with exactly the same results. Changing the rsize and wsize didn’t do anything either. From my understanding, meta-data changes are small, so increasing the size of the data chunks shouldn’t really make a difference.
-----------------------------------------------------------------------

> From: redhat-sysadmin-list-bounces at redhat.com [mailto:redhat-sysadmin-list-bounces at redhat.com] On Behalf Of Chris Wornell
> Sent: Sunday, November 19, 2006 1:59 AM
> To: redhat-sysadmin-list at redhat.com
> Subject: NFS Help! Terrible performance with sync,fast performance with async


> I've got a problem that I've spent quite a bit of time on, though I'm not an expert at NFS. In summary, operations that require meta-data changes (such as file/directory creations/deletions) are extremely slow with sync, but over 10x faster with async.
> 
> I have two systems connected to a GigE switch using Intel Pro 1000 NICs (jumbo frames are currently not enabled at any point).
> 
> The NFS server is a dual-core Opteron system with 1GB of RAM and a 3x300 SAS disk RAID-5 on a Perc5/i controller with 256MB battery-backed cache (write cache is enabled). The file system is ext3. I've configured nfsd to spawn 32 processes upon startup. I'm using the defaults for exporting the NFS shares, with no changes to rsize or wsize.
> 
> The NFS client is a dual Xeon with 4GB of RAM and a single 7200rpm SATA disk. Both systems are running RHEL WS 3 Update 8 and kernel 2.4.21-47.0.1.ELsmp. 
> 
> For testing, I'm using bonnie++. The following are some sample test results that sum up the problem:
> 
> Test on NFS server directly (not NFS loopback)
> -Sequential File Creation: 2976
> -Sequential File Deletion: N/A
> -Random File Creation: 3077
> -Random File Deletion: 9922
> 
> NFS test with sync enabled
> -Sequential File Creation: 39
> -Sequential File Deletion: 79
> -Random File Creation: 39
> -Random File Deletion: 65
> 
> NFS test with async enabled
> -Sequential File Creation: 575
> -Sequential File Deletion: 1718
> -Random File Creation: 543
> -Random File Deletion: 1228
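As a quick check on the figures quoted above, the async-to-sync ratios can be worked out with a few lines of Python. The numbers are copied from the two NFS tests; the dictionary keys and the speedups helper are labels I made up for this sketch.

```python
# bonnie++ results as posted for the two NFS tests.
results = {
    "sync": {"seq_create": 39, "seq_delete": 79,
             "rand_create": 39, "rand_delete": 65},
    "async": {"seq_create": 575, "seq_delete": 1718,
              "rand_create": 543, "rand_delete": 1228},
}


def speedups(slow, fast):
    """Return fast/slow for every operation present in `slow`."""
    return {op: fast[op] / slow[op] for op in slow}


if __name__ == "__main__":
    for op, ratio in speedups(results["sync"], results["async"]).items():
        print(f"{op}: async is {ratio:.1f}x the sync figure")
```

Every ratio comes out between roughly 14x and 22x, so "over 10x faster" holds for all four operations.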
> 
> Based on the local performance of the NFS server, it does not appear that the IO setup is the culprit. My understanding of the sync option is that a commit happens, which means the NFS server doesn't reply until the change has actually been committed to stable storage. Something is happening behind the scenes, though, that causes a huge delay before the NFS server replies that the commit is complete.
> 
> This question is actually work related and I'm planning to put the NFS server into production, but I'd rather not use async, even with a UPS and dual PSUs on the server. With the newer nfs-utils, sync is the default option as well, so it seems like sync should perform relatively well.
> 
> Another question: I don't quite understand how the data corruption happens if a power loss occurs on an NFS server using async. Even with sync, data transferred over the wire may be lost if the NFS server gets shut down before that data is committed. Can anyone go into more detail on how the data corruption happens?
> 
> Thanks a bunch!


> Thanks,
> 
> Chris Wornell
> Network Administrator, Information Technology
> Peerless Systems Corporation
> http://www.peerless.com
> office: 310.727.5723
> fax: 310.727.5715
> mailto:cwornell at peerless.com


------- End of Original Message -------

 