
Re: [Linux-cluster] Unformatting a GFS cluster disk



christopher barry wrote:
On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
A fun feature is that the multiple snapshots of a file have the identical
inode value


I can't fault this statement but would prefer to think of snapshots as different trees with their own root inodes (see "Figure 3" of "Network Appliance: File System Design for an NFS File Server Appliance"; the PDF can be downloaded via http://en.wikipedia.org/wiki/Write_Anywhere_File_Layout - scroll down to the bottom of the page).

In a SAN environment, I also like to think of multiple snapshots as different trees, each with its own root inode, that may share the same disk blocks. The sharing gives faster backups (no write) and less disk space consumption. At recovery time, the (different) trees can be exported and seen by the Linux host as different LUN(s). The internal details could be quite tedious and I'm not in a position to describe them here.
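To make the "different trees, shared blocks" picture concrete, here is a toy model in Python. It is only a sketch of the idea under simplifying assumptions: the class and method names (Volume, write, snapshot, shared_blocks) are made up for illustration and do not correspond to any WAFL or filer interface, and a flat name-to-block map stands in for a real inode tree.

```python
# Toy model only: all names here are hypothetical and are not part of
# any WAFL or filer interface.

class Volume:
    def __init__(self):
        self.blocks = {}        # block id -> data; blocks are never overwritten in place
        self.next_block = 0
        self.active_root = {}   # file name -> block id (the live tree)
        self.snapshots = {}     # snapshot name -> its own root (file name -> block id)

    def write(self, name, data):
        # New data always lands in a fresh block, so any snapshot root
        # that still points at the old block is unaffected.
        block_id = self.next_block
        self.next_block += 1
        self.blocks[block_id] = data
        self.active_root[name] = block_id

    def snapshot(self, snap_name):
        # A snapshot copies only the root mapping, not the data blocks.
        self.snapshots[snap_name] = dict(self.active_root)

    def read(self, name, snap_name=None):
        root = self.active_root if snap_name is None else self.snapshots[snap_name]
        return self.blocks[root[name]]

    def shared_blocks(self, snap_name):
        # Blocks referenced by both the live tree and the given snapshot.
        live = set(self.active_root.values())
        return live & set(self.snapshots[snap_name].values())


vol = Volume()
vol.write("a.txt", "v1")
vol.write("b.txt", "unchanged")
vol.snapshot("hourly.0")
vol.write("a.txt", "v2")   # forks a new block; the snapshot keeps the old one
```

After the last write, the live tree and the snapshot each have their own root, yet the block holding b.txt is still shared between them, which is why taking the snapshot costs almost no space until data changes.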

So, I'm trying to understand what to take away from this thread:
* I should not use them?
* I can use them, but having multiple snapshots introduces a risk that a
snap-restore could wipe files completely by potentially putting a
deleted file on top of a new file?

Isn't that what a "restore" is supposed to do? Since you spotted this caveat without being told, you don't look like an admin who would make this mistake.
* I should use them - but not use multiples.
* something completely different ;)

Our primary goal here is to use snapshots to enable us to backup to tape
from the snapshot over FC - and not have to pull a massive amount of
data over GbE nfs through our NAT director from one of our cluster nodes
to put it on tape. We have thought about a dedicated GbE backup network,
but would rather use the 4Gb FC fabric we've got.

Check the NetApp NOW web site (http://now.netapp.com - accessible to NetApp customers) to see whether other folks have good tips about this. I just did a quick search and found a document titled "Linux Snapshot Records and LUN Resizing in a SAN Environment". It is a little out of date (dated 1/27/2003, written for RHEL 2.1) but still very usable in an ext3 environment.

In general, backing up GFS from the Linux side at run time has been a pain, mostly because it is slow: the process has to walk through the whole filesystem and read every single file, which ends up accumulating a non-trivial amount of cached glocks and memory. For a sizable filesystem (say, in the TB range like yours), past experience has shown that after backup(s) the filesystem latency can rise to an unacceptable level unless its glocks are trimmed. There is a tunable written specifically for this purpose, though (glock_purge, introduced in RHEL 4.5).
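For illustration, a small Python helper that builds (but does not run) the command to set this tunable. It assumes the usual "gfs_tool settune <mountpoint> <tunable> <value>" form and that glock_purge takes a percentage of unused glocks to trim; the mount point /mnt/gfs1 and the value 50 are placeholders, so please check them against the documentation for your release before using anything like this.

```python
import shlex

def glock_purge_command(mount_point, percent):
    """Build the argv for setting glock_purge on a mounted GFS filesystem.

    Assumption: glock_purge is a percentage (0 turns trimming off).
    Nothing is executed here; the caller decides whether to run it.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return ["gfs_tool", "settune", mount_point, "glock_purge", str(percent)]

# Placeholder mount point and percentage, for illustration only.
cmd = glock_purge_command("/mnt/gfs1", 50)
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to hand to subprocess.run later, once the values have been checked by hand.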

The problem can certainly be helped by the snapshot functions built into the NetApp SAN box. However, if tape (done from the Linux host?) is preferred, as you described, due to space considerations, you may want to take a (filer) snapshot and do a (filer) "lun clone" from it. The cloned LUN is then GFS-mounted as a separate GFS filesystem (this is more involved than people would expect, more on this later). After that, the tape backup can take place without interfering with the original GFS filesystem on the Linux host. On the filer end, copy-on-write will fork disk blocks as soon as new write requests come in, whether or not the tape backup is running.

The thinking here is to leverage the built-in NetApp copy-on-write feature to speed up the backup process with a reasonable disk space requirement. The snapshot volume and the cloned LUN shouldn't take much disk space, and we can turn on the GFS readahead and glock_purge tunables on the clone with minimal interruption to the original GFS volume. The caveat is GFS-mounting the cloned LUN. For one, GFS at this moment doesn't allow mounting multiple devices that have the same filesystem identifier (the -t value you use at mkfs time, e.g. "cluster-name:filesystem-name") on the same node. This can be fixed by rewriting the filesystem ID and lock protocol (I will start to test the described backup script and a GFS kernel patch next week). Also, as with any tape backup from a Linux host, you should not expect to get a mountable GFS device image back from tape; what you retrieve is basically a collection of all the files that resided on the GFS filesystem when the backup took place.
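The identifier clash can be sketched in a few lines of Python. This only shows the bookkeeping side, i.e. noticing that the clone carries the same "cluster-name:filesystem-name" value as an already mounted filesystem and picking a different one. The function names, the "_bk" suffix and the example identifier "alpha:gfs1" are all made up for illustration, and the sketch does not touch the on-disk superblock, which is the part that still needs testing.

```python
def parse_lock_table(table):
    """Split a "cluster-name:filesystem-name" identifier into its two parts."""
    cluster, sep, fsname = table.partition(":")
    if not sep or not cluster or not fsname:
        raise ValueError("expected 'cluster-name:filesystem-name', got %r" % table)
    return cluster, fsname

def clashes(mounted_tables, candidate):
    """True if the candidate identifier is already in use on this node."""
    return candidate in set(mounted_tables)

def rename_for_clone(table, suffix="_bk"):
    """Derive a distinct identifier for the cloned LUN (hypothetical naming scheme)."""
    cluster, fsname = parse_lock_table(table)
    return "%s:%s%s" % (cluster, fsname, suffix)

# Example identifiers, for illustration only.
mounted = ["alpha:gfs1"]
clone_table = "alpha:gfs1"          # a clone starts out with the original's ID
if clashes(mounted, clone_table):
    clone_table = rename_for_clone(clone_table)
print(clone_table)
```

The new identifier would then have to be written to the clone before mounting, by whatever mechanism turns out to work.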

Will the above serve your needs? Maybe other folks have better ideas?

BTW, the described procedure has not been well tested yet and, more importantly, no statement in this email represents the official recommendation of my ex-employer or my current employer.

-- Wendy







