[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Linux-cluster] dlm and IO speed problem <er, might wanna get a coffee first ; )>



christopher barry wrote:
On Tue, 2008-04-08 at 09:37 -0500, Wendy Cheng wrote:
gordan bobich net wrote:
my setup:
6 rh4.5 nodes, gfs1 v6.1, behind redundant LVS directors. I know it's
not new stuff, but corporate standards dictated the rev of rhat.
[...]
I'm noticing huge differences in compile times - or any home file access
really - when doing stuff in the same home directory on the gfs on
different nodes. For instance, the same compile on one node is ~12
minutes - on another it's 18 minutes or more (not running concurrently).
I'm also seeing weird random pauses in writes - like saving a file in vi:
something that would normally take less than a second may take up to 10 seconds.

Anyway, thought I would re-connect with you all and let you know how this
worked out. We ended up scrapping GFS. Not because it's not a great fs,
but because I was using it in a way that played to its weak points. I had
a lot of time and energy invested in it, and it was hard to let it go. It
turns out that connecting to the NetApp filer via NFS is faster for this
workload. I couldn't believe it either, as my bonnie- and dd-type tests
showed GFS to be faster. But for the use case of large sets of very small
files, and lots of stat() calls going on, GFS simply cannot compete with
NetApp's NFS implementation. GFS is an excellent fs, and it has its place
in the landscape - but for a development build system, the NetApp is
simply phenomenal.

Assuming you run both configurations (nfs-wafl vs. gfs-san) on the very same netapp box (?) ...

Both configurations have their pros and cons. The wafl-nfs setup runs in native mode, which certainly has its advantages - you've made a good choice - but the latter (gfs-on-netapp SAN) can work well in other situations. The biggest problem with your original configuration is the load balancer. Round-robin scheduling (and its variants) will not work well if you have a write-intensive workload that has to fight for locks between multiple GFS nodes. IIRC, there are GFS customers running build-compile development environments. They normally assign groups of users to different GFS nodes, say user ids starting with a-e on node 1, f-j on node 2, etc.
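The static user-to-node assignment described above could be sketched roughly like this (my illustration only, not anything from the thread - the node names and letter ranges are hypothetical):

```python
# Hypothetical sketch: pin each user's home-directory work to one GFS
# node based on the first letter of the username, so a given user's
# writes stay on one node and inter-node lock contention is reduced.
NODE_RANGES = {
    "node1": ("a", "e"),
    "node2": ("f", "j"),
    "node3": ("k", "o"),
    "node4": ("p", "t"),
    "node5": ("u", "z"),
}

def node_for_user(username: str) -> str:
    """Return the GFS node statically assigned to this user."""
    first = username[0].lower()
    for node, (lo, hi) in NODE_RANGES.items():
        if lo <= first <= hi:
            return node
    return "node1"  # fallback for non-alphabetic usernames

print(node_for_user("frank"))  # -> node2
```

In practice this mapping would live in the LVS director configuration rather than application code; the point is simply that the assignment is fixed per user, not round-robin per connection.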

One piece of encouraging news from this email is that gfs-netapp-san runs well under bonnie. GFS1 has struggled with bonnie (a large number of smaller files within one single node) for a very long time. One of the reasons is that its block allocation tends to get spread across the disk whenever there are resource-group contentions. It is very difficult for the Linux IO scheduler to merge these blocks within one single server. When the workload becomes IO-bound, the locks are subsequently stalled and everything starts to snowball after that. The NetApp SAN has one more layer of block-allocation indirection within its firmware, and its write speed is "phenomenal" (I'm borrowing your words ;) ), mostly due to the NVRAM, where it can aggressively cache write data - this helps GFS relieve its small-file issue quite well.
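The merging argument above can be illustrated with a toy model (my sketch, nothing to do with actual GFS or kernel code): an IO scheduler can coalesce requests for consecutive disk blocks into one larger IO, so scattered allocations cost more separate IOs for the same amount of data.

```python
# Toy model: count how many IOs remain after merging runs of
# consecutive block numbers, the way an elevator-style scheduler
# merges adjacent requests.
def count_merged_ios(blocks):
    """Count IOs after merging runs of consecutive block numbers."""
    if not blocks:
        return 0
    blocks = sorted(blocks)
    ios = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:  # gap in block numbers -> separate IO
            ios += 1
    return ios

contiguous = [100, 101, 102, 103]   # allocator kept blocks together
scattered  = [100, 500, 900, 1300]  # spread across resource groups

print(count_merged_ios(contiguous))  # 1 merged IO
print(count_merged_ios(scattered))   # 4 separate IOs
```

Same four blocks of data, four times the IOs when the allocation is spread out - which is where the NVRAM-backed write cache on the filer helps, by absorbing the scattered writes before they hit disk.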

-- Wendy
