[Linux-cluster] cluster architecture/filesystem suggestions wanted

bergman at merctech.com
Mon Aug 10 16:16:13 UTC 2009


Hello,

I've been testing an RHCS (CentOS 5.2) cluster with GFS1 for a while, and
I'm about to transition the cluster to production. I'd appreciate a quick
review of the architecture and filesystem choices. I've got some concerns
about GFS (1 & 2) stability and performance vs. ext3fs, but the increased
flexibility of a clustered filesystem has a lot of advantages.

If there are fundamental stability advantages to a design that does not
cluster the filesystems (i.e., that uses GFS in lock_nolock mode or ext3fs),
that would override any performance consideration.
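
For concreteness, this is roughly how I'd create the filesystem either way;
the volume group, cluster, and filesystem names below are just placeholders:

	# clustered GFS: DLM locking, one journal per node that will mount it
	gfs_mkfs -p lock_dlm -t mycluster:data1 -j 2 /dev/vg_data/lv_data1

	# unclustered GFS: no cluster locking, single-node use only
	gfs_mkfs -p lock_nolock -j 1 /dev/vg_data/lv_data1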

Assuming that stability is not an issue, my basic question in choosing an
architecture is whether performance is better with GFS and multiple cluster
nodes accessing the same data (gaining some CPU and network load balancing
at the cost of the GFS locking penalty), or with each volume served from a
single server via NFS (using RHCS solely for fail-over). Obviously, I don't
expect anyone to provide definitive answers or data that's unique to our
environment, but I'd greatly appreciate your views on the architecture
choices.


Background:
    Our lab does basic science research on software to process medical
    images. There are about 40 lab members, with roughly 15 to 25 logged in
    at any given time. Most people will be logged into multiple servers
    at once, with their home directory and all data directories provided
    via NFS at this time.

    The workload is divided between a software development environment
    (compile/test cycles) and image processing. The software development
    process is interactive, and includes algorithm testing which requires
    reading/writing multi-MB files.  There's a reasonably high performance
    expectation for interactive work, less so for the testing phase.

    Many lab members also mount filesystems from the servers onto their
    desktop machines via SAMBA, for which there is a high performance
    expectation.

    The image processing is very strongly CPU-bound, but involves reading
    many image files in the 1 to 50MB range, and writing results files
    in the same range, along with smaller index and metadata files. The
    image processing is largely non-interactive, so the I/O performance
    is not critical.

The RHCS cluster will be used for infrastructure services (not as a
compute resource for image processing, not as login servers, not as
compilation servers). The primary services to be run on the clustered
machines are:

	network file sharing (NFS, Samba)
	SVN repository
	backup server (bacula, to fibre-attached tape drive)
	Wiki
	nagios

None of those services requires a lot of CPU. The network file sharing
could benefit from load balancing, so that the NFS and SAMBA clients have
multiple network paths to the storage. However, the NFS and SAMBA protocols
are not well suited to using RHCS as a load balancer, so this may not
be possible (using LVS or a front-end hardware load balancer is not an
option at this time...HA is much more important than load balancing).

The goals of using RHCS and clustering those functions are (in order of
importance):

	stability of applications
	high availability of applications
	performance
	expandability of filesystems (i.e., expand volumes at the SAN, LUN,
		LVM, and filesystem layers; see the sketch after this list)
	expandability of servers (add more servers to the cluster, with
		machines dedicated to functions, as a crude form of load
		balancing)
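
As a rough sketch of what I mean by expansion at the LVM and filesystem
layers (the volume names and sizes are placeholders; GFS can be grown while
mounted, and ext3 on RHEL5 supports online growth via resize2fs):

	# after presenting a new LUN from the SAN and creating a PV on it:
	pvcreate /dev/mapper/newlun
	vgextend vg_data /dev/mapper/newlun
	lvextend -L +500G /dev/vg_data/lv_data1

	# then grow the filesystem in place:
	gfs_grow /export/data1                    # GFS (takes the mount point)
	resize2fs /dev/vg_data/lv_data1           # ext3

	# when adding cluster nodes that will mount a GFS volume, add journals:
	gfs_jadd -j 1 /export/data1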
	
The computing environment consists of:
	2 RHCS servers
		fibre attached to storage and backup tape device

	~15TB EMC fibre-attached storage
	~14TB fibre and iSCSI attached storage in the near future

	4 compute servers
		currently accessing storage via NFS, could be
		fibre-attached and configured as cluster members

	35 compute servers
		NFS-only access to storage, possibly iSCSI in the
		future, no chance of fibre attachment


As I see it, there are 3 possible architecture choices:

	[1] infrastructure-only GFS+NFS
		the 2 cluster nodes share storage via GFS, and
		act as NFS servers to all compute servers

		+ load balancing of some services
		- complexity of GFS
		- performance of shared GFS storage

	[2] shared storage/NFS
		2 cluster nodes and 4 fibre-attached compute servers
		share storage via GFS (all machines are RHCS nodes, but
		the compute nodes do not provide infrastructure services,
		just use cluster membership for GFS file access)

		each GFS node is potentially an NFS server (via a VIP) to
		the 35 compute servers that are not on the fibre SAN

		+ potentially faster access to data for the 4 fibre-attached
		  compute servers

		- potentially slower access to data for the 4 fibre-attached
		  compute servers due to GFS locking

		+ increased stability over a 2-node cluster
		- increased complexity
		
	[3] exclusive storage/NFS
		filesystems are formatted as ext3fs and exclusively mounted
		on one of the 2 infrastructure cluster nodes at a time; each
		filesystem mount also includes a child (dependent) resource
		making that node an NFS server; all compute nodes access the
		data via NFS (a rough cluster.conf sketch follows this list)

		+ reliability of filesystem
		+ performance of filesystem
		- potential for corruption in case of non-exclusive access
		- decreased flexibility due to exclusive use
		- no potential for load balancing across cluster nodes
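
To make option [3] concrete, here's a rough sketch of the kind of rgmanager
service I have in mind (the node names, device, mountpoint, IP address, and
network below are placeholders, not our real configuration):

	<rm>
	    <failoverdomains>
	        <failoverdomain name="nfs1-dom" ordered="1" restricted="1">
	            <failoverdomainnode name="node1" priority="1"/>
	            <failoverdomainnode name="node2" priority="2"/>
	        </failoverdomain>
	    </failoverdomains>
	    <resources>
	        <fs name="data1-fs" device="/dev/vg_data/lv_data1"
	            mountpoint="/export/data1" fstype="ext3" force_unmount="1"/>
	        <ip address="192.168.1.50" monitor_link="1"/>
	        <nfsexport name="data1-export"/>
	        <nfsclient name="lab-net" target="192.168.1.0/24" options="rw,sync"/>
	    </resources>
	    <service name="nfs-data1" domain="nfs1-dom" autostart="1">
	        <fs ref="data1-fs">
	            <nfsexport ref="data1-export">
	                <nfsclient ref="lab-net"/>
	            </nfsexport>
	        </fs>
	        <ip ref="192.168.1.50"/>
	    </service>
	</rm>

The ext3 filesystem is only ever mounted on whichever node currently owns
the service, and the NFS export and VIP fail over along with it; a second
service with the node priorities reversed would let each node normally own
half the volumes, as a crude form of load balancing.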

I'm very interested in getting your opinion of the choices, and would like
to learn about other ideas that I may have overlooked.

Thanks,

Mark


----
Mark Bergman                              voice: 215-662-7310
mark.bergman at uphs.upenn.edu                 fax: 215-614-0266
System Administrator     Section of Biomedical Image Analysis
Department of Radiology            University of Pennsylvania
      PGP Key: https://www.rad.upenn.edu/sbia/bergman



