FW: [Linux-cluster] Re: If I have 5 GNBD server?

Ok, so I tried using heartbeat to create a virtual IP that floats between
the gnbd servers, and it worked out okay: I can actually import and mount
through that virtual gnbd server IP. If I want to manually fail over from
one gnbd server to another, there is no problem as long as everything is
done cleanly, i.e. (vgchange -aln, gnbd_import -R), then re-import with
gnbd_import -i gnbd_vip.

That gives the unique gnbd in /dev/gnbd; then I run vgchange -aly, which
brings it back into LVM and device mapper.
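
The clean sequence described above can be sketched as a small shell
function. This is only a sketch: the commands and flags are the ones quoted
in this message and gnbd_vip is the heartbeat virtual IP name used above,
while the DRY_RUN switch and the function names are illustrative additions.
By default it only prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch of the clean manual failover sequence described above.
# DRY_RUN and the function names are illustrative, not part of gnbd or LVM.
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

gnbd_manual_failover() {
    local vip=$1
    run vgchange -aln || return 1          # deactivate the volume groups on this node
    run gnbd_import -R || return 1         # drop the currently imported gnbds
    run gnbd_import -i "$vip" || return 1  # re-import through the virtual IP
    run vgchange -aly                      # reactivate the volume groups
}
```

Calling "gnbd_manual_failover gnbd_vip" with DRY_RUN=1 lists the four steps
in order; setting DRY_RUN=0 would actually execute them.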

However, a manual failover test hangs at "vgchange -aln", because the old
device (the "unique gnbd") is still being accessed. Killing the process
with killall or kill -11 still does not allow clvmd to return to a clean
state.

As for multipath, as Ben wrote:

> If the gnbds are exported uncached (the default), the client will fail
> back IO if it can no longer talk to the server after a specified timeout.
> However, the userspace tools for dm-multipath are still too SCSI-centric
> to allow [them] to run on top of gnbd.  You can manually run dmsetup
> commands to build the appropriate multipath map, scan the map to check if
> a path has failed, remove the failed gnbd from the map (so the device can
> close and gnbd can start trying to reconnect), and then manually add the
> gnbd device back [to] the map when it has reconnected. That's pretty much
> all the dm-multipath userspace tools do.  Someone could even write a
> pretty simple daemon that [does] this, and become the personal hero of
> many people on this list.

> The only problem is that if you manually execute the commands, or write
> [the] daemon in bash or some other scripting language, you can run into a
> memory deadlock. If you are in a very low memory situation, and you need
> to [complete] gnbd IO requests to free up memory, the daemon can't
> allocate any memory [for] doing its job.
> If you have the gnbd exported in caching mode, each server will maintain
> [its] own cache. So if you write a block to one server, and then the
> server [fails], when you read the block from the second server, if it was
> already cached before the read, you will get invalid data, so that won't
> work. If you set the gnbd to uncached mode, the client will fail the IO
> back, and [something] (a multipath driver) needs to be there to reissue
> the request.

> -Ben

I have tried to get the dm-multipath setup working correctly but had little
success. I started an earlier thread on it and didn't get any response.

My test was based on 

I posted my issues here.


The initial command used was:
echo "0 1146990592 multipath 0 0 1 1 round-robin 0 2 1 251:1 1000 251:3 1000" | dmsetup create dm-1

(251:1 and 251:3 are the major:minor ids of the gnbds, obtained from the
command cat /proc/partitions.)
(1146990592 is, I believe, the size of the block device.)
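
One thing worth double-checking is the unit of that number. A device-mapper
table counts its length in 512-byte sectors, while the #blocks column of
/proc/partitions is in 1 KiB blocks. The arithmetic below is a sketch of the
conversion. It assumes, which this message does not confirm, that
1146990592 was read from /proc/partitions; the function names are
illustrative.

```shell
#!/bin/bash
# Unit conversion sketch: dm tables use 512-byte sectors, whereas the
# "#blocks" column of /proc/partitions is in 1 KiB blocks.

kib_to_sectors() {
    # One 1 KiB block is two 512-byte sectors.
    echo $(( $1 * 2 ))
}

sectors_to_gib() {
    # Whole GiB (rounded down) covered by a length given in 512-byte sectors.
    echo $(( $1 * 512 / 1024 / 1024 / 1024 ))
}

blocks=1146990592                              # value used in the table above
sectors_to_gib "$blocks"                       # GiB if the value is read as sectors
sectors_to_gib "$(kib_to_sectors "$blocks")"   # GiB if it was really 1 KiB blocks
```

If that assumption holds, a table length of 1146990592 sectors describes
only about 546 GiB, which would match the size multipath reports further
down, and a length of 2293981184 would be needed to cover the whole device.
"blockdev --getsz <device>" reports the length directly in 512-byte sectors.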
This resulted in a block device which I still could not mount. I tried
multipath -ll (after installing multipath and creating a rudimentary
multipath.conf) and the result was:


[size=546 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active][first]
  \_ 0:0:0:0      251:0   [undef ][active]
\_ round-robin 0 [enabled]
  \_ 0:0:0:0      251:4   [undef ][active]

Notice that the size was 1/2 the actual size!?! (I have no idea what this
means; somebody enlighten me, please!)

When I attempted to mount:
[root@dell-1650-31 ~]# mount -t gfs /dev/mapper/dm-1 /mnt/gfs1

mount: /dev/dm-1: can't read superblock

This was tried previously on a multipathed device for which dmsetup status
gives the output below:

dm-1: 0 1146990592 multipath 1 0 0 2 1 A 0 1 0 251:1 A 0 E 0 1 0 251:3 A 0
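
The "scan the map to check if a path has failed" step Ben describes could
start from a status line like the one above. The function below is a
minimal sketch, assuming each path appears as a major:minor token followed
by "A" (active) or "F" (failed); the rest of the status layout differs
between kernel versions and is ignored. The function name is illustrative,
and per Ben's caveat a script like this can deadlock under memory pressure,
so it is for manual checks rather than a real daemon.

```shell
#!/bin/bash
# Print the major:minor of every path marked failed ("F") in a multipath
# status line such as the one printed by "dmsetup status".
failed_paths() {
    local prev=""
    local tok
    for tok in $1; do
        case "$prev" in
            *[!0-9:]*|"") ;;                           # previous token is not major:minor
            *:*) [ "$tok" = "F" ] && echo "$prev" ;;   # major:minor followed by F
        esac
        prev=$tok
    done
    return 0
}

status='dm-1: 0 1146990592 multipath 1 0 0 2 1 A 0 1 0 251:1 A 0 E 0 1 0 251:3 A 0'
failed_paths "$status"    # prints nothing here: both paths are marked "A"
```

Removing a failed path and adding it back would still have to be done by
loading a new table for the map, which this sketch does not attempt.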

dmsetup deps gives
dm-1: 2 dependencies    : (251, 3) (251, 1)

and dmsetup info gives
Name:              dm-1
State:             ACTIVE
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 1
Number of targets: 1

I have managed to get an NFS/gnbd failover scenario working, in which the
gnbd servers export the shared storage via NFS and the clients mount via a
heartbeat VIP. I then created a script, which I will rewrite into a mini
daemon soon, that checks the status of the servers; when the IP is taken
over, it stops apache, unmounts NFS via "umount -l -t nfs $mountpoint",
runs "mount $mountpoint", and starts apache again. I have tested it and it
works (by checking for stale handles and remounting cleanly). Some of the
same principles could go into one for gnbd. But my bump right now is LVM.
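
The remount step of that script can be sketched as follows. Only the umount
and mount commands are taken from the description above; the DRY_RUN
switch, the function names and the "service httpd" commands used to stop
and start apache are assumptions for illustration, and the check that
detects the IP takeover is left out. By default the commands are only
printed.

```shell
#!/bin/bash
# Sketch of the NFS remount step. DRY_RUN, the function names and the
# "service httpd" commands are assumptions made for illustration.
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remount_nfs() {
    local mountpoint=$1
    run service httpd stop                 # assumed way of stopping apache
    run umount -l -t nfs "$mountpoint"     # lazy unmount of the stale NFS mount
    run mount "$mountpoint"                # remount, relying on the /etc/fstab entry
    run service httpd start                # assumed way of starting apache
}
```

For example, "remount_nfs /mnt/nfs1" (a hypothetical mount point) prints the
four commands in the order the script runs them.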

Can GNBD be used without LVM? Or does anyone know how to enable failover
correctly on dm-multipath?

Any help would be appreciated.

Brian Urrutia 

-----Original Message-----
From: brian urrutia [mailto:brianu mail silvercash com] 
Sent: Monday, August 29, 2005 12:56 AM
To: linux-cluster redhat com
Cc: mikore li gmail com; brianu
Subject: Re: [Linux-cluster] Re: If I have 5 GNBD server?

> > If using LVM to make a volume of imported gnbds is not the answer for
> > redundancy, can anyone suggest a method that is? I'm not opposed to using
> > other resources of the cluster or GFS, but I would really like to
> > implement a redundant solution (gnbd, gulm, etc.).
> > 
> Hi, Brianu, maybe LVM + md + gnbd could be one solution for
> redundancy. For example, you have 2 gnbd servers, each one exporting 1
> disk. Then the steps should be:
> 1. Create a RAID-1 /dev/md0 on the GFS client from the 2 imported gnbd
>    block [devices].
> 2. Use LVM to create /dev/vg0 on top of it.
> 3. mkfs_gfs on /dev/vg0.
> I haven't tried this configuration; theoretically, it should work.
> Thanks,
> Michael
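
The three steps Michael lists could look roughly like the dry-run sketch
below. It is untested, like the original suggestion. The gnbd device paths,
volume size, cluster name, filesystem name and journal count are all
hypothetical placeholders, the filesystem is created on a logical volume
inside vg0 rather than on the volume group itself, and gfs_mkfs is assumed
to be the command meant by "mkfs_gfs". Whether an md mirror is safe when
several GFS clients assemble it at once is a separate question that this
sketch does not address.

```shell
#!/bin/bash
# Dry-run sketch of the md + LVM + GFS layering suggested above.
# All names and sizes passed in are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_mirrored_gfs() {
    local gnbd_a=$1 gnbd_b=$2 size=$3 cluster=$4 fsname=$5 journals=$6
    # 1. RAID-1 across the two imported gnbd devices
    run mdadm --create /dev/md0 --level=1 --raid-devices=2 "$gnbd_a" "$gnbd_b" || return 1
    # 2. LVM on top of the mirror
    run pvcreate /dev/md0 || return 1
    run vgcreate vg0 /dev/md0 || return 1
    run lvcreate -L "$size" -n lv0 vg0 || return 1
    # 3. GFS on the logical volume
    run gfs_mkfs -p lock_dlm -t "${cluster}:${fsname}" -j "$journals" /dev/vg0/lv0
}
```

A call such as "build_mirrored_gfs /dev/gnbd/disk1 /dev/gnbd/disk2 100G
mycluster gfs1 2" prints the five commands without touching any device.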

I will look into trying an md & LVM combo. As far as using keepalived or
rgmanager to fail over an IP, I haven't seen a clear example of how to use
rgmanager, but I have tried heartbeat (Linux-HA) to fail over the IP. The
problem is that the gnbd clients still seem to lock onto the former
server even though the IP has failed over to another server (and
continually try to reconnect, as Fajar had mentioned).

The shared storage I have is an HP MSA 100 SAN.
It might be a config error on my part as far as rgmanager is concerned; I
will have to post my cluster.conf tomorrow.

