Issue #8 June 2005

Red Hat® GFS: Combining Fibre Channel and Gigabit Ethernet

Introduction

Storage area networking (SAN) and local area networking (LAN) have been dominated by two different physical interfaces: Fibre Channel and Ethernet, respectively. Fibre Channel is significantly more expensive but achieves better performance on storage networking workloads, while Ethernet is inexpensive (with adequate performance for most network workloads), ubiquitous, and scalable to large numbers of ports. Figure 1 shows a standard SAN topology.

Figure 1. Standard SAN topology

Given that Ethernet and Fibre Channel technologies sit at opposite ends of the price and performance spectrum, yet each brings important benefits to the IT infrastructure, the question arises: Is it possible to fuse these two kinds of networks so that a specific application or infrastructure gains both the price and the performance benefits? In other words, can we get the benefits of a SAN technology that combines the performance, efficiency, and existing infrastructure base of Fibre Channel storage with the low cost and network scalability of Gigabit Ethernet?

In fact, such a network topology does exist and is used in nearly every SAN deployment, as shown in Figure 2. Integrated Fibre Channel and Gigabit Ethernet network. In this figure, a group of storage servers are connected via Fibre Channel to shared storage devices and connected via IP to a group of network clients. These storage servers might be serving files using a standard distributed file system protocol like NFS or CIFS, or they might be running a database application like Oracle 9i RAC and serving data from the storage servers to the network clients. In any case, servers connected to Fibre Channel storage are typically running large data center workloads, and many client computers connect to these servers to accomplish their tasks.

Figure 2. Integrated Fibre Channel and Gigabit Ethernet network

One potential task that the storage servers can perform is to re-export the SAN shared block devices out to the clients on the IP network. Just as NFS exports files from an NFS server to NFS clients, a network block server can serve out blocks to network clients. Essentially, the storage servers are multiplexing a Fibre Channel port connection onto one or more Gigabit Ethernet ports. This can be done in hardware by employing a storage networking switch that incorporates both iSCSI and Fibre Channel ports. Interest in iSCSI as an alternative storage networking protocol continues to grow, and more vendor support is available every year, but today most SANs are constructed using Fibre Channel. Switches that integrate both iSCSI and Fibre Channel allow customers to leverage their current Fibre Channel investments while migrating some of their SAN infrastructure to iSCSI.

Software protocols may also be used to multiplex Fibre Channel ports onto a Gigabit Ethernet network. One such protocol is Red Hat's Global Network Block Device (GNBD) protocol for Linux. Red Hat has deployed GNBD as a shared network block protocol for the Red Hat GFS cluster file system; iSCSI also provides shared network block storage devices over IP networks that Red Hat GFS can exploit. A cluster file system allows multiple servers attached to a SAN to share a single file system mapped onto the shared storage devices, an architecture known as a data sharing cluster. Red Hat GFS is a cluster file system that can be used to construct data sharing clusters with Linux servers.

When used with GNBD or iSCSI, Red Hat GFS runs on network clients as shown in Figure 2. Integrated Fibre Channel and Gigabit Ethernet network, in effect sharing the network block devices exported by the storage servers. This approach allows a GFS cluster built from standard x86 servers to scale to hundreds of nodes that mount a shared file system, without the expense of a Fibre Channel HBA and switch port for every machine. In large Linux clusters, this cost can be high.

For example, consider a GFS cluster with 128 servers connected via Gigabit Ethernet to 16 GNBD servers, which are connected in turn to a SAN with 16 shared storage devices. The cost of directly attaching each GFS server to the SAN would be 128 multiplied by ($3000 per director port + $1000 per FC HBA), or roughly $500,000. The GNBD approach would require 16 Linux GNBD servers (each with an FC HBA) at $2500 each (a total of $40,000) and a 32-port FC switch ($60,000), for a total cost of $100,000, or about $400,000 less than the pure FC SAN approach.
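Written out as a quick back-of-the-envelope calculation (a sketch using the 2005 price estimates quoted above, not current list prices), the comparison looks like this:

gfs_nodes = 128

# Option 1: attach every GFS node directly to the Fibre Channel SAN.
director_port_cost = 3000      # per director port
fc_hba_cost = 1000             # per FC HBA
direct_fc_cost = gfs_nodes * (director_port_cost + fc_hba_cost)    # $512,000

# Option 2: front the SAN with GNBD servers on Gigabit Ethernet.
gnbd_servers = 16
gnbd_server_cost = 2500        # Linux server, including its FC HBA
fc_switch_cost = 60000         # 32-port FC switch
gnbd_cost = gnbd_servers * gnbd_server_cost + fc_switch_cost       # $100,000

print(direct_fc_cost, gnbd_cost, direct_fc_cost - gnbd_cost)       # savings of roughly $400,000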

Red Hat GNBD (or a storage switch supporting iSCSI and Fibre Channel) lets the system architect decouple the storage network port connection from the servers, multiplexing multiple servers onto the same storage network port connection. This approach allows the system architect to achieve specific price, performance, and storage capacity design points more easily than with systems designed using only Fibre Channel or only Gigabit Ethernet for the SAN. With the ability to scale storage bandwidth and capacity independently, a wide variety of file and database applications can be addressed in a cost-effective manner. In just the same way, a SAN switch with both Fibre Channel and iSCSI ports can amortize expensive Fibre Channel connectivity across a less expensive, more scalable Ethernet-based IP network.

Figure 3. Using GNBD in a data sharing Linux cluster

As shown in Figure 3. Using GNBD in a data sharing Linux cluster, four GNBD servers are exporting a set of four SAN-attached storage devices to a GFS server cluster. Each GFS server sees the four storage devices exported from the GNBD servers. Figure 4. GNBD protocol operation shows a view of GNBD operation from the standpoint of GNBD server B. Notice how each GFS server acts as a GNBD client connecting through the GNBD server via a kernel thread to a storage device. To achieve scalable performance, one kernel thread is created per logical connection between a GNBD client and a storage device. In Figure 4. GNBD protocol operation, we see GFS server 1 connected to storage device 1 via kernel thread a; GFS server 5 is connected to storage devices 1 and 3 via threads e and g, respectively. A storage area network switch with iSCSI and Fibre Channel ports implements similar operations in hardware at the SCSI layer, translating IP-based iSCSI packets into Fibre Channel frames. The effect and advantage are the same: GFS servers can run without Fibre Channel interfaces and yet still share block devices on a network.

Figure 4. GNBD protocol operation
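To make the thread-per-connection model concrete, the following is a minimal Python sketch of a network block server in the spirit of the description above. It is not the GNBD implementation or wire format; the request layout, port number, and backing file are assumptions chosen purely for illustration.

import socket
import struct
import threading

BACKING_DEVICE = "/tmp/blockstore.img"   # stand-in for a SAN-attached block device (must already exist)
REQUEST_HEADER = "!cQI"                  # op ('R' or 'W'), 64-bit byte offset, 32-bit length

def recv_exact(conn, count):
    # Read exactly 'count' bytes from the socket, or return None on EOF.
    data = b""
    while len(data) < count:
        chunk = conn.recv(count - len(data))
        if not chunk:
            return None
        data += chunk
    return data

def handle_client(conn):
    # One thread per client connection, mirroring the one-kernel-thread-per-
    # logical-connection design described for the GNBD server layer.
    with open(BACKING_DEVICE, "r+b") as dev, conn:
        while True:
            header = recv_exact(conn, struct.calcsize(REQUEST_HEADER))
            if header is None:
                break
            op, offset, length = struct.unpack(REQUEST_HEADER, header)
            dev.seek(offset)
            if op == b"R":
                conn.sendall(dev.read(length))
            elif op == b"W":
                payload = recv_exact(conn, length)
                if payload is None:
                    break
                dev.write(payload)

def serve(host="0.0.0.0", port=14243):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(16)
    while True:
        conn, _ = srv.accept()
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()

if __name__ == "__main__":
    serve()

A real implementation like GNBD runs in the kernel and handles error recovery, fencing, and flow control; the point here is simply that each client-to-device connection gets its own server-side thread.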

Figure 5. Multipathing support in GNBD shows how multipathing support allows GFS servers to route around a failed GNBD server, in this case GNBD server C. An alternate route through GNBD server D can be used because all GNBD servers can attach to all storage devices on the SAN. In contrast, Figure 6. GNBD servers using multi-ported storage shows a group of GNBD servers using non-shared, DAS storage. Unfortunately, non-shared storage attached to a GNBD server becomes inaccessible when that server or the attached storage device fails. It's possible to use RAID hardware for this locally-attached, non-shared storage, but a GNBD server failure still makes that storage inaccessible even in this design. For the lowest possible price point where availability is not critical, this design choice might be reasonable.

Figure 5. Multipathing support in GNBD

One possible solution to this problem is to use multi-ported RAID arrays in place of strictly direct-attached storage, so that each storage array is connected to two or more GNBD servers. For example, in Figure 6. GNBD servers using multi-ported storage, multi-ported storage arrays 1 and 2 could be connected to GNBD servers A and B, and storage arrays 3 and 4 could be connected to GNBD servers C and D. In this configuration, no single GNBD server failure can make the storage inaccessible, and yet the full expense of a back-end Fibre Channel SAN infrastructure is not incurred.

Figure 6. GNBD servers using multi-ported storage
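As a quick sanity check of that wiring (an illustrative model, not a Red Hat tool, with the array and server names taken from the hypothetical layout above), the attachment map can be tested to confirm that no single GNBD server failure leaves any storage array unreachable:

# Which GNBD servers each multi-ported storage array is attached to.
attachments = {
    "array1": {"A", "B"},
    "array2": {"A", "B"},
    "array3": {"C", "D"},
    "array4": {"C", "D"},
}
servers = set().union(*attachments.values())

for failed in sorted(servers):
    stranded = [array for array, attached in attachments.items()
                if not (attached - {failed})]
    assert not stranded, f"failure of {failed} strands {stranded}"

print("no single GNBD server failure makes any storage array inaccessible")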

Another alternative configuration is given in Figure 7. GNBD serving using non-shared storage and mirroring in the cluster, which shows a GFS server attached to a group of GNBD servers where mirrored volume pairs are maintained across sets of GNBD servers. These mirror pairs can be constructed using cluster volume mirroring software on the GFS servers. Notice in Figure 8. GNBD server or mirrored volume failures are tolerated how this configuration (non-shared storage devices that are mirrored across GNBD servers) allows both GNBD server and storage device failures to occur without bringing the GFS cluster down. In this example, GNBD server B or storage device 2 has failed, so the mirror volume 2' for storage device 2 is accessed through GNBD server A. This example shows that non-shared storage can be attached to the GNBD server layer in such a way that both GNBD server and storage device failures can be tolerated, at the cost of mirroring all storage devices. Mirroring doubles the required storage hardware compared to the same capacity of non-mirrored storage, and it increases the amount of storage traffic on the IP network (each disk block is written twice instead of once).

Figure 7. GNBD serving using non-shared storage and mirroring in the cluster
Figure 8. GNBD server or mirrored volume failures are tolerated

Design issues integrating Fibre Channel and Ethernet-based IP networks

Several design issues are raised when considering a protocol like GNBD for extending storage devices on a Fibre Channel SAN out to servers on an IP network constructed with Gigabit Ethernet (or any other interface).

As stated before and as shown in Figure 9. Multiplexing a single Fibre Channel SAN port out to a Gigabit Ethernet IP network with GNBD, GNBD can be considered a protocol for de-multiplexing a single SAN connection out to multiple logical connections on an IP network (generally constructed with Gigabit Ethernet). Consider only bandwidth and make the simplifying, but reasonable, assumption that the aggregate SAN bandwidth coming out of the storage devices is equal to the aggregate SAN bandwidth into the GNBD servers. Also assume that this bandwidth is sufficient to saturate the Fibre Channel connections into the GNBD servers and that each GNBD server has one Fibre Channel and one Gigabit Ethernet interface. In this case, a GNBD server funnels data between the GNBD clients and the SAN through its Gigabit Ethernet and Fibre Channel ports.

Figure 9. Multiplexing a single Fibre Channel SAN port out to a Gigabit Ethernet IP network with GNBD

In general, Fibre Channel achieves about 90% of its raw speed of 100 megabytes per second (200 megabytes per second for 2-Gigabit Fibre Channel) whereas on large transfers across Gigabit Ethernet using TCP/IP, 50 megabytes per second is more common. Two Gigabit Ethernet ports for every FC port on each GNBD server would work well in this case. A rough rule of thumb for bandwidth-oriented applications is to balance the aggregate bandwidth between the SAN (Fibre Channel) and the IP network (Gigabit Ethernet) and make sure that your GNBD server has the memory and processing capacity to support as many GNBD connection kernel threads as required.

Consider another example, where two 2-Gigabit-per-second FC interfaces are connected to a single GNBD server, achieving 180 megabytes per second per interface and 360 megabytes per second across the two FC ports. Eight Gigabit Ethernet interfaces would be required in this case under our assumption that each Gigabit Ethernet port can transfer 50 megabytes per second. However, if each Gigabit Ethernet interface could only achieve 30 megabytes per second, then 12 Gigabit Ethernet ports would be needed to balance the FC and Gigabit Ethernet networks attached to the GNBD server.
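The port-balancing arithmetic in these examples reduces to a one-line calculation (the throughput figures are the rough estimates quoted above, not measured numbers):

import math

def gige_ports_needed(fc_ports, fc_mb_per_s, gige_mb_per_s):
    # Gigabit Ethernet ports required to carry the aggregate Fibre Channel
    # bandwidth flowing through one GNBD server.
    return math.ceil(fc_ports * fc_mb_per_s / gige_mb_per_s)

print(gige_ports_needed(1, 90, 50))    # one 1-Gb FC port at ~90 MB/s      -> 2 GbE ports
print(gige_ports_needed(2, 180, 50))   # two 2-Gb FC ports at 180 MB/s each -> 8 GbE ports
print(gige_ports_needed(2, 180, 30))   # same FC, 30 MB/s per GbE port      -> 12 GbE ports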

Beyond balancing the bandwidth between the SAN and the IP network through the GNBD server layer, another important issue in designing a data sharing cluster is the ratio between the number of GFS and GNBD servers. If we again focus on bandwidth, then the answer is straightforward assuming the system designer knows the approximate bandwidth requirements of the applications running on their GFS cluster (this is actually the hard part, because I/O characteristics of even simple programs are generally not well known). In this case you need to make sure there are enough GNBD servers to provide the bandwidth desired into the GFS data sharing cluster. For example, if 128 GFS server nodes require 10 megabytes per second each, and if each GNBD server can provide 80 megabytes per second from the SAN to the IP network, then each of 16 GNBD servers can handle eight GFS nodes.
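The same sizing arithmetic, written out (the bandwidth figures are the assumptions from the example above, not measurements):

import math

gfs_nodes = 128
mb_per_gfs_node = 10       # bandwidth each GFS node requires (MB/s)
mb_per_gnbd_server = 80    # bandwidth one GNBD server can move from the SAN to the IP network (MB/s)

gnbd_servers = math.ceil(gfs_nodes * mb_per_gfs_node / mb_per_gnbd_server)
print(gnbd_servers)                 # 16 GNBD servers
print(gfs_nodes // gnbd_servers)    # 8 GFS nodes per GNBD server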

Another important issue to consider is when to use GNBD over Ethernet instead of Fibre Channel on every server. For small data-sharing clusters (six nodes or fewer) where storage network performance is critical, Fibre Channel is a natural choice. For large clusters (sixteen or more nodes) where storage network performance is not absolutely critical, either GNBD or iSCSI is probably the right answer. Even for smaller clusters, if performance and availability are not critical, then using a single GNBD server and exporting its local storage to the GFS data sharing cluster is a reasonable approach. In fact, for smaller clusters even NFS, with its overhead and non-POSIX-compliant behavior, may be acceptable. Refer to Red Hat GFS vs. NFS: Improving performance and scalability [link to article] for a comparison of NFS and a data-sharing cluster approach with GFS and GNBD.

Summary

Combining Fibre Channel and Gigabit Ethernet for attaching nodes in the same data sharing cluster emphasizes an important point: Fibre Channel and Gigabit Ethernet are complementary interfaces that can work together. Red Hat GNBD is a software approach that helps make that happen; iSCSI is another fine alternative for building low-cost, IP-based storage networks. SAN switches that integrate both iSCSI and Fibre Channel are a hardware alternative for achieving the same kind of integration, and as iSCSI matures and takes its place in the SAN infrastructure, Red Hat GFS can fully exploit it. Red Hat GFS provides data sharing in a cluster that can exploit both direct Fibre Channel connections and GNBD- or iSCSI-supported Gigabit Ethernet connections. Large data sharing Linux clusters can be constructed using these techniques and GFS in a way that allows systems architects to fine-tune their designs to achieve the best price/performance for their applications. The approach combines the low cost and scalability of IP networks with the performance and storage-specific functionality of Fibre Channel.

About the author

From 1990 to May 2000, Matthew O'Keefe taught and performed research in storage systems and parallel simulation software as a professor of electrical and computer engineering at the University of Minnesota. He founded Sistina Software in May of 2000 to develop storage infrastructure software for Linux, including the Global File System (GFS) and the Linux Logical Volume Manager (LVM). Sistina was acquired by Red Hat in December 2003, where Matthew now directs storage software strategy.