[Linux-cluster] GFS Feature Question

Gordan Bobic gordan at bobich.net
Fri Oct 5 15:49:57 UTC 2007


Hi,

I stumbled upon an old document from back in 2000 (before Red Hat acquired 
Sistina) that discussed a number of features planned for the "next 
version" of GFS, including shadowing/copy-on-write.

The two features I am particularly interested in are:

1) Compression
I consider this important both for performance and because, no matter how 
cheap disks get, storing more data will always cost more than storing less. 
Performance-wise, at some point I/O becomes the bottleneck: not 
necessarily the disk I/O, but the network I/O of the SAN, especially when 
all the nodes in the cluster share the same SAN bandwidth. At that 
point, reducing the data volume through compression becomes a performance 
win. This point isn't all that difficult to reach, even on a small cluster 
on gigabit Ethernet.

2) Shadowing/Copy-On-Write File Versioning
Backups serve two purposes: recovering files lost or corrupted through 
user error, and recovering files lost or corrupted through disk failure. 
High levels of RAID alleviate the need for backups in the latter case, but 
they do nothing about damage caused by user error. At the same time, 
SANs can get big: I don't consider hundreds of TB an inconceivable size. 
At this size, backups become an issue. Thus, a feature that provides file 
versioning is important.
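
To illustrate what I mean by copy-on-write versioning, here is a toy 
sketch in Python. It is purely illustrative and says nothing about how GFS 
does or would implement this; the VersionedFile class and its methods are 
hypothetical names, and "blocks" here are just in-memory byte strings.

```python
class VersionedFile:
    """Toy copy-on-write file: each version is a tuple of immutable blocks.

    Hypothetical illustration only, not a GFS interface.
    """

    def __init__(self, blocks):
        # Version 0 is the initial content.
        self.versions = [tuple(blocks)]

    def write_block(self, index, data):
        """Create a new version in which only block `index` differs.

        Unchanged blocks are shared with the previous version, not copied.
        """
        current = self.versions[-1]
        new = current[:index] + (data,) + current[index + 1:]
        self.versions.append(new)
        return len(self.versions) - 1  # number of the new version

    def read(self, version=-1):
        """Return the full content of a given version (latest by default)."""
        return b"".join(self.versions[version])


f = VersionedFile([b"AAAA", b"BBBB", b"CCCC"])
v1 = f.write_block(1, b"oops")  # user error overwrites block 1

latest = f.read()      # content after the bad write
recovered = f.read(0)  # the original content is still retrievable
```

The old version stays readable after the bad write, and only the modified 
block costs extra space, which is why this is attractive where full 
backups are impractical.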

In turn, 2) increases the volume of data, which increases the need for 1).
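
As a back-of-envelope illustration of the bandwidth argument in 1), here 
is a small Python sketch. All the numbers are assumptions for the sake of 
the example: a 125 MB/s theoretical gigabit link, 8 nodes sharing it 
evenly, and a 2:1 compression ratio. It ignores protocol overhead and the 
CPU cost of compressing, and the function names are made up.

```python
import zlib


def per_node_throughput(link_mb_s, nodes, compression_ratio=1.0):
    """Logical MB/s each node sees when `nodes` share one link evenly.

    compression_ratio is uncompressed size / compressed size; 1.0 means
    no compression. CPU cost and protocol overhead are ignored.
    """
    return link_mb_s / nodes * compression_ratio


def measured_ratio(sample):
    """Compression ratio zlib achieves on a sample of the data."""
    return len(sample) / len(zlib.compress(sample))


# Assumed figures: 1 Gbit/s is at most 125 MB/s, shared by 8 nodes.
plain = per_node_throughput(125.0, 8)           # 15.625 MB/s per node
compressed = per_node_throughput(125.0, 8, 2.0)  # 31.25 MB/s per node

# The real ratio depends entirely on the data; this sample is highly
# repetitive, so it compresses far better than typical files would.
ratio = measured_ratio(b"a fairly repetitive log line\n" * 10000)
```

The gain scales directly with whatever ratio the real data achieves, so 
it is worth measuring on representative files before drawing conclusions.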

Is either of these two features planned for GFS in the near future?

Gordan



