[Linux-cluster] iSCSI GFS

gordan at bobich.net
Mon Jan 28 17:03:28 UTC 2008



On Mon, 28 Jan 2008, isplist at logicore.net wrote:

>> How much I/O do you actually need?
>
> My concern is not so much what I need up front but the growth path should I
> start needing to add a lot of storage. LAMP services can sometimes grow very
> quickly and immediately need ever-growing storage space for uploaded media
> and playback, not to mention for the web services themselves.

Yes, but we are talking tens of GB here before you would need to alter your 
approach.

> Like I mentioned, I've seen what happens when you're not prepared for growth,
> and it's not fun. I would hate to have to keep changing out technologies once
> things get going because I didn't choose a nice flexible solution up front.
>
>> creates virtual software RAID stripe over them. It then exports this
>> back out via iSCSI. All the client nodes then connect to the single big
>> iSCSI node that is the aggregator.
>
> Do you know of any software which helps to keep track of all this? This is an
> interesting idea. I think I understand it and want to give it a try. I have
> various types of storage where this would be a good solution.

It's pretty simple to set up. You just need to be familiar with the iSCSI 
tools and the software RAID tools, all of which are almost certainly in your 
distro's apt/yum repositories.
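
Something like this gets you most of the way there (the IPs, IQNs and 
device names below are made up; substitute your own):

    # on the aggregator: discover and log in to each storage node's
    # export (open-iscsi)
    iscsiadm -m discovery -t sendtargets -p 192.168.0.11
    iscsiadm -m node -T iqn.2008-01.net.example:disk1 -p 192.168.0.11 --login
    # ...repeat for 192.168.0.12, 192.168.0.13, etc.

    # stripe the imported disks together (device names will vary -
    # check dmesg after logging in)
    mdadm --create /dev/md0 --level=5 --raid-devices=3 \
          /dev/sdb /dev/sdc /dev/sdd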

> Let's see if I've got this right.
>
> I need a machine which will become the aggregator, with plenty of memory, a
> multi-port Ethernet card and of course an FC HBA.
> FC storage will be attached to this machine. Then, iSCSI storage targets will
> also export to this machine.

Not quite sure I follow this - you want to use FC storage and combine it 
with iSCSI storage into a bigger iSCSI storage pool? No reason why not, I 
suppose.

> This machine will then use virtual RAID (which I've no idea about yet) and
> aggregate the storage into one volume, or however many I need. Next I export
> this to the servers via iSCSI for their larger ongoing storage needs.

Pretty much. Once you have the big software RAID stripe, you can use this 
to back any number of iSCSI volumes.
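
For example, with LVM on top of the stripe and iSCSI Enterprise Target 
doing the exporting (the volume and target names here are just 
placeholders):

    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -L 500G -n media vg0

    # /etc/ietd.conf on the aggregator
    Target iqn.2008-01.net.example:storage.media
        Lun 0 Path=/dev/vg0/media,Type=fileio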

Note that Linux software RAID only goes up to RAID 6 (i.e. n+2), so you can 
lose at most two of the underlying nodes (FC or iSCSI) before you lose your 
data.

> Now I can use say a small FC GFS target to share various files such as web
> pages and other web based shared files yet have the larger more easily
> expandable V-RAID for media and other such things.

Once you have a big RAID-ed storage pool, you can partition it out in 
whatever way you like. You also don't have to put it all into one big RAID 
stripe - you can split it into a few smaller ones instead.

You can dynamically add disks to a software RAID stripe.
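
For example (the device name is again illustrative):

    mdadm --add /dev/md0 /dev/sde             # new disk goes in as a spare
    mdadm --grow /dev/md0 --raid-devices=4    # reshape the array onto it
    # ...then grow whatever sits on top (pvresize, lvextend, gfs_grow)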

> This means that I also need to find an iSCSI target driver that doesn't take
> rocket science to figure out and is open source to keep things cheap while
> trying this out.

yum install iscsi-target
:-)

>> NFS can give considerably better performance than GFS under some
>> circumstances. If you don't need POSIX compliant file locking, you may
>
> As I've put all of this together, I've come to learn that I need GFS for the
> shared data but the rest, I don't need anything but standard storage. I'll
> maintain a cluster of web servers which use GFS to share their pages/images
> but I could use this aggregator idea for the larger scale media storage.

Sure, but your aggregator could export space via iSCSI or via NFS, 
whichever you prefer.
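
The NFS side is just the usual exports file, e.g. (the path and network 
range are examples):

    # /etc/exports on the aggregator
    /export/media 10.0.0.0/24(rw,sync,no_root_squash)

    exportfs -ra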

> I had somehow gotten a little too caught up in GFS and I was basically not
> thinking anything beyond that. This makes things so much simpler than where I
> was heading.

It happens to the best of us. :)

>> I suspect that's possibly overkill. It's NIC I/O you'll need more than
>> anything. Jumbo frames, as big as your hardware can handle, will also help.
>
> Doesn't NIC I/O take up a lot of CPU time?

Not really. Topping out your CPU with NIC I/O load isn't all that easy. 
There are also NICs that can offload the entire TCP/IP stack off the CPU 
onto the NIC, but I don't know what the Linux driver support is like.
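
As for the jumbo frames mentioned above, enabling them is a one-liner, 
provided your NICs and switches can handle it:

    ifconfig eth0 mtu 9000     # or: ip link set eth0 mtu 9000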

>> There is no reason why the aggregator couldn't mount its own exports,
>> and run as one of the client cluster nodes.
>
> I just mean for fail over. It might be nice to have a little redundancy there.

Sure, you could set up heartbeat and preferably some fencing. I suspect 
that double-activating a software RAID stripe would be quite destructive 
to your data (I asked about that in a different thread, but nobody has 
stepped up to clarify yet), so fencing is a good idea, "just to make 
sure". When a node needs to take over, it fences the other node, connects 
the iSCSI shares, starts up the RAID on them, assumes the floating IP and 
exports the iSCSI/NFS shares.
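
The takeover script boils down to something like this (the node name, 
device and addresses are site-specific placeholders):

    fence_node storage1            # make sure the peer really is dead
    iscsiadm -m node --loginall=all   # reconnect the backing iSCSI disks
    mdadm --assemble --scan        # bring the RAID stripe up locally
    ip addr add 192.168.0.100/24 dev eth0    # assume the floating IP
    service iscsi-target start     # resume the exports (init script
    service nfs start              # names vary by distro)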

Gordan



