Issue #8 June 2005

Red Hat® GFS vs. NFS: Improving performance and scalability

Data sharing is a must in today's computing world. When the data requirements include allowing a cluster of servers to access a common storage pool, Red Hat GFS is the answer for simplifying your data infrastructure, minimizing storage costs, adding storage on the fly, and achieving maximum uptime.

A cluster file system like Red Hat GFS can be used with an IP network block-sharing protocol like iSCSI to provide scalable file serving at low cost. Network File System (NFS) is a common shared storage solution utilized by many infrastructures. However, in some instances, this solution does not scale. How do GFS and NFS compare? This article explains.

Red Hat® Enterprise Linux® customers who are experiencing NFS performance problems and have only Linux NFS clients can very likely improve performance and scalability by using Red Hat GFS with iSCSI-based IP block networking.

The comparison

NFS is a popular file-sharing protocol for Linux and UNIX. Figures 1 through 3 compare NFS and GFS data sharing hardware topologies in a server cluster. Figure 1 shows the classic and most common NFS deployment: a single NFS server with its own local storage connected to a number of clients on the network. A GFS data sharing cluster constructed with an iSCSI server has exactly the same hardware topology and, in practice, better performance. In addition, unlike NFS, GFS provides the same POSIX-compliant behavior as a local file system. This means that distributed Linux applications can achieve good performance accessing shared files across the cluster while seeing the same behavior as a POSIX-compliant local file system.

Note:
In particular, NFS does not support the same file synchronization semantics that UNIX (and Linux) processes expect: in UNIX, if one process writes to a file, another process that reads the file at a later time is guaranteed to see the write. NFS makes no such guarantee unless particular write-caching settings are used, which can significantly reduce performance.
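The guarantee at stake can be illustrated with a short sketch (the file path and record contents here are hypothetical). On a POSIX-compliant file system, local or GFS, a read issued after a write completes always returns the new data; an NFS client may instead serve a stale cached copy unless caching is disabled:

```python
import os
import tempfile

def write_record(path, data):
    # Writer process: append a record. On a POSIX-compliant file
    # system (local ext3, or GFS across a cluster), once write()
    # returns, the data is visible to any later read() by another
    # process.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

def read_back(path):
    # Reader process: a read issued after the write completed.
    # POSIX guarantees it sees the new data; an NFS client may
    # instead serve a stale copy from its cache unless mounted
    # with caching disabled, at a cost in performance.
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)

path = os.path.join(tempfile.mkdtemp(), "shared.log")
write_record(path, b"record-1\n")
print(read_back(path))  # on a POSIX file system: b'record-1\n'
```

Run locally, the read is guaranteed to return the record just written; run with the writer and reader on two different NFS clients, the read may or may not.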
Comparing NFS and Red Hat GFS/iSCSI hardware topologies
Figure 1. Comparing NFS and Red Hat GFS/iSCSI hardware topologies

Figure 2 shows two NFS servers organized as a failover pair with a Storage Area Network (SAN) backend, alongside a comparable data sharing cluster topology with two iSCSI servers and shared SAN storage. As in Figure 1, the physical topology is the same, yet the functionality available in the two systems has diverged even further. The NFS servers merely act as a failover pair: they do not share files, only physical block storage (each NFS server exports a local file system that is mapped to a shared volume on the SAN). Two separate file systems must be maintained, and only one NFS server at a time can provide processing power to handle the NFS requests to a particular file system. In contrast, in the data sharing cluster a single shared file system can be mapped to the SAN storage: the two iSCSI servers collaborate to provide access to it. If an iSCSI server fails, the GFS server nodes can route around the failure and access the storage through the iSCSI server that is still operating.
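As a sketch of how a GFS node might reach the shared SAN volume through either iSCSI server, here is a hypothetical session using the open-iscsi `iscsiadm` tool (the portal addresses are made up, and the exact syntax depends on which iSCSI initiator your distribution ships):

```shell
# Discover the targets exported by each iSCSI server
# (10.0.0.1 and 10.0.0.2 are hypothetical portal addresses).
iscsiadm -m discovery -t sendtargets -p 10.0.0.1
iscsiadm -m discovery -t sendtargets -p 10.0.0.2

# Log in to the discovered targets; the same SAN volume is now
# reachable over two independent paths.
iscsiadm -m node --login

# With both sessions up, a multipath layer (e.g. dm-multipath)
# can present a single block device and fail over between paths
# if one iSCSI server goes down; GFS sits on top of that device.
```

The point of the sketch is the topology, not the commands: because both iSCSI servers front the same SAN storage, losing one server costs a path, not the file system.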

Paired NFS and Red Hat GFS/iSCSI servers
Figure 2. Paired NFS and Red Hat GFS/iSCSI servers

Figure 3 shows the system topologies scaled even further, to four NFS and four iSCSI servers. Note that the NFS servers, unlike the iSCSI nodes, do not have a SAN for their storage. This emphasizes that each NFS server can provide access only to a single file system; it's impossible to add more NFS servers to increase the processing power available to serve up a particular file system. In contrast, the four iSCSI servers are connected to shared storage via a SAN, and all four provide processing capacity for the GFS servers. In fact, more iSCSI servers can be added to incrementally provide as much processing capability as necessary to match the SAN system performance with the performance goals of the GFS servers in the data sharing cluster. Storage capacity can also be added incrementally to the SAN and then provisioned across one or more file systems. The four NFS servers in Figure 3 are really four separate islands of storage, which inhibits performance and efficient provisioning of storage across the NFS servers.

Multiple NFS servers and a Red Hat GFS/iSCSI data sharing cluster
Figure 3. Multiple NFS servers and a Red Hat GFS/iSCSI data sharing cluster

Summary

Red Hat GFS can be combined with iSCSI storage networks to provide better performance than that achievable with NFS alone.

Client scalability
    GFS/iSCSI: 300 or more
    NFS: 10-20 or fewer for bandwidth-heavy workloads

Bandwidth to/from clients
    GFS/iSCSI: unlimited
    NFS: limited by the maximum bandwidth of the NFS server

Complexity at large scale
    GFS/iSCSI: the ability to keep a single name space on the same volume limits complexity and simplifies management
    NFS: at large scale, NFS file systems must be spread across several distinct volumes, increasing management complexity and limiting achievable performance

POSIX semantics
    GFS/iSCSI: yes; reads always return the data that was last written, which does not break applications
    NFS: no; reads may or may not return the last data written

Small-scale, low-performance environments
    GFS/iSCSI: can scale down cost-effectively to small-scale, low-performance environments
    NFS: works well at small scale (5-10 clients) and in low-performance environments

Table 1. Comparing GFS/iSCSI and NFS

Further reading

To learn more about Red Hat GFS and Red Hat Enterprise Linux, visit the following websites:

About the author

From 1990 to May 2000, Matthew O'Keefe taught and performed research in storage systems and parallel simulation software as a professor of electrical and computer engineering at the University of Minnesota. He founded Sistina Software in May of 2000 to develop storage infrastructure software for Linux, including the Global File System (GFS) and the Linux Logical Volume Manager (LVM). Sistina was acquired by Red Hat in December 2003, where Matthew now directs storage software strategy.