
Re: [Linux-cluster] GFS limits?

Brian Jackson wrote:

The code that most people on this list are interested in currently is the code in CVS, which is for 2.6 only. 2.6 has a config option to enable using devices larger than 2TB. I'm still reading through all the GFS code, but it's still architecturally the same as when it was closed source, so I'm pretty sure most of my knowledge from OpenGFS will still apply. GFS uses 64-bit values internally, so you can have very large filesystems (larger than petabytes).

This is nice. I was specifically thinking of 64-bit machines, in which case I'd expect the limit to be 9EB or something.
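For anyone wanting to check the figures above, here is a rough arithmetic sketch. It assumes the 2TB limit comes from 32-bit sector counts with 512-byte sectors, and that "9EB" corresponds to a signed 64-bit byte offset; neither assumption is stated in the thread, and the helper name `human` is made up for illustration.

```python
# Back-of-the-envelope size limits. All inputs are assumptions, not GFS internals.
SECTOR_BYTES = 512  # assumed traditional sector size


def human(n):
    """Format a byte count using binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    i = 0
    while n >= 1024 and i < len(units) - 1:
        n /= 1024
        i += 1
    return f"{n:g} {units[i]}"


limit_32bit_sectors = 2**32 * SECTOR_BYTES  # 32-bit sector count
limit_signed_64 = 2**63                     # signed 64-bit byte offset
limit_unsigned_64 = 2**64                   # unsigned 64-bit byte offset

print(human(limit_32bit_sectors))  # 2 TiB
print(human(limit_signed_64))      # 8 EiB
print(human(limit_unsigned_64))    # 16 EiB
print(limit_signed_64 / 10**18)    # about 9.22 decimal exabytes
```

So under those assumptions, "9EB or something" matches the signed 64-bit case when counted in decimal exabytes.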

Our current (homegrown) solution will scale very well for quite some
time, but eventually we're going to get saturated with write requests to
individual head units.  Does GFS intelligently "spread the load" among
multiple storage entities for writing under high load?

No, each node that mounts has direct access to the storage. It writes
just like any other fs, when it can.

So, if I have a dozen separate arrays in a given cluster, it will write data linearly to array #1, then array #2, then array #3? If that's the case, GFS doesn't solve my biggest fear - write performance with a huge influx of data. I'd hoped it might somehow "stripe" the data across individual units so that we can aggregate the combined interface bandwidth to some extent.
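To make the linear-versus-striped distinction concrete, here is a toy sketch of how a logical block number could map to an array under each layout. It is only an illustration of the two ideas in the question, not a description of how GFS or its volume layer actually allocates blocks; the function names, array count, and sizes are all hypothetical.

```python
# Toy model: which array receives a given logical block?
# Hypothetical numbers and names; not GFS's real allocation policy.

def linear_target(block, blocks_per_array):
    """Concatenated layout: fill array 0 completely, then array 1, and so on."""
    return block // blocks_per_array


def striped_target(block, n_arrays, stripe_blocks):
    """Striped layout: rotate across arrays every `stripe_blocks` blocks."""
    return (block // stripe_blocks) % n_arrays


N_ARRAYS = 12            # "a dozen separate arrays"
BLOCKS_PER_ARRAY = 1000  # assumed capacity of each array, in blocks
STRIPE_BLOCKS = 4        # assumed stripe width, in blocks

burst = range(48)  # a burst of 48 consecutive logical blocks

linear_hits = {linear_target(b, BLOCKS_PER_ARRAY) for b in burst}
striped_hits = {striped_target(b, N_ARRAYS, STRIPE_BLOCKS) for b in burst}

print(len(linear_hits))   # 1: the whole burst lands on a single array
print(len(striped_hits))  # 12: the burst is spread over every array
```

In the linear case a burst of sequential writes is limited by one array's interface; in the striped case the same burst touches all twelve, which is the bandwidth aggregation being asked about.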

Does it always
write to any available storage units, or are there thresholds where it
expands the pool of units it writes to?  (I'm not sure I'm making much
sense, but we'll see if any of you grok it :)

I think you may have a little misconception about just what GFS is.
You should check the WHATIS_OpenGFS doc at
http://opengfs.sourceforge.net/docs.php It says OpenGFS, but for the
most part, the same stuff applies to GFS.

I've read it, and quite a few other documents and whitepapers on GFS quite a few times, but perhaps you're right - I must be missing something. More on this below...

I notice the pricing for GFS is $2200.  Is that per seat?  And if so,
what's a "seat"?  Each client?  Each server with storage participating
in the cluster?  Both?  Some other distinction?

Now I definitely know you have some misconception. GFS doesn't have
any concept of server and client. All nodes mount the fs directly
since they are all directly connected to the storage.

Hmm, yes, this is probably my sticking point. It was my understanding (or maybe just my hope?) that servers could participate as "storage units" in the cluster by exporting their block devices, in addition to FC or iSCSI or whatever devices which aren't technically 'servers'.

In other words, I was thinking/hoping that the cluster consisted of block units aggregated into a filesystem, and that the filesystem could consist of FC RAID devices, iSCSI solutions, and "dumb servers" that just exported their local disks to the cluster FS.

Am I totally wrong? I guess it's GNBD I don't totally understand, so I'd better go read up on it.



Don MacAskill
