[Linux-cluster] scsi reservation issue

Christopher.Barry at qlogic.com
Fri Nov 9 06:45:27 UTC 2007


On Thu, 2007-11-08 at 15:32 -0600, Ryan O'Hara wrote:
> Christopher Barry wrote:
> > 
> > Okay. I had some other issues to deal with, but now I'm back to this,
> > and let me get you all up to speed on what I have done, and what I do
> > not understand about all of this.
> > 
> > status:
> > esx-01: contains nodes 1 thru 3
> > esx-02: contains nodes 4 thru 6
> > 
> > esx-01: all 3 cluster nodes can mount gfs.
> > 
> > esx-02: none can mount gfs.
> > esx-02: scsi reservation errors in dmesg
> > esx-02: mount fails w/ "can't read superblock" 
> 
> OK. So it looks like one of the nodes is still holding a reservation on 
> the device. First, we need to determine which node has that reservation. 
>   From any node, you should be able to run the following commands:
> 
> sg_persist -i -k /dev/sdc
> sg_persist -i -r /dev/sdc
> 
> The first will list all the keys registered with the device. The second 
> will show you which key is holding the reservation. At this point, I 
> would expect that you will only see 1 key registered and that key will 
> also be the reservation holder, but that is just a guess.
> 
> The keys are unique to each node, so we can correlate a key to a 
> node. The key is just the hex representation of the node's IP address. 
> You can get this by running gethostip -x <hostname>. By doing this, you 
> should be able to figure out which node is still holding a reservation.
> Once you determine this key/node, try running /etc/init.d/scsi_reserve 
> stop from that node. Once that runs, use the sg_persist commands listed 
> above to see if the reservation is cleared.
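> 
> For example (hostnames and key values here are hypothetical, just to 
> illustrate the mapping):
> 
>   gethostip -x node1        # e.g. prints c0a8010b for 192.168.1.11
>   sg_persist -i -k /dev/sdc # lists the registered keys, e.g. 0xc0a8010b
>   sg_persist -i -r /dev/sdc # shows which key holds the reservation
> 
> Whichever node's gethostip output matches the key from the -r command is 
> the node holding the reservation.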
> 
> > Oddly, with the gfs filesystem unmounted on all nodes, I can format the
> > gfs filesystem from the esx-02 box (from node4), and then mount it from
> > a node on esx-01, but cannot mount it on the node I just formatted it
> > from!
> > 
> > fdisk -l shows /dev/sdc1 on nodes 4 thru 6 just fine.
> 
> Hmm. I wonder if there is something goofy happening because the nodes 
> are running within vmware. I have never tried this, so I have no idea. 
> Either way, we should be able to clear up the problem.
> 
> > # sg_persist -C --out /dev/sdc1
> > fails to clear out the reservations
> 
> Right. I believe this must be run from the node holding the 
> reservation, or at the very least a node that is registered with the 
> device. Also note that SCSI reservations affect the entire LUN, so you 
> can't issue registrations/reservations to a single partition (i.e. sdc1).
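> 
> For example, from a node that is still registered (key value 
> hypothetical), the clear would be issued against the whole device:
> 
>   sg_persist --out --clear --param-rk=c0a8010b /dev/sdc
> 
> i.e. /dev/sdc rather than /dev/sdc1.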
> 
> > I do not understand these reservations, maybe someone can summarize?
> 
> I'll try to be brief. Each node in the cluster can register with a 
> device, thus a device may have many registrations. Each node registers 
> by using a unique key. Once registered, one of the nodes can issue a 
> reservation. Only one node may hold the reservation, and the reservation is 
> created using that node's key. For our purposes, we use a 
> write-exclusive, registrants-only type of reservation. This means that 
> only nodes that are registered with the device may write to it. As long 
> as that reservation exists, that rule will be enforced.
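> 
> By way of illustration, a hand-run sketch with a hypothetical key (the 
> scsi_reserve script normally does this for you, so you never type these):
> 
>   # register this node's key with the device
>   sg_persist --out --register --param-sark=c0a8010b /dev/sdc
>   # one node then takes the write-exclusive, registrants-only reservation
>   # (persistent reservation type 5)
>   sg_persist --out --reserve --param-rk=c0a8010b --prout-type=5 /dev/sdc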
> 
> When it comes time to remove registrations, there is one caveat: the node 
> that holds the reservation cannot unregister unless there are no other 
> nodes registered with the device. This is because the reservation 
> holder must also be registered *and*, if the reservation 
> were to go away, the write-exclusive, registrants-only policy would no 
> longer be in effect. So ... what may have happened is that you tried to 
> clear the reservation while other nodes were still registered, which 
> will always fail. Once all the other nodes have 
> "unregistered", you should be able to go back and clear the reservation.
> 
> Yes, this is a limitation in our product. There is a notion of moving a 
> reservation (in the case where the reservation holder wants to 
> unregister), but that is not yet implemented.
> 
> > I'm not at the box this sec (vpn-ing in will hork my evolution), but I
> > will provide any amount of data if either you Ryan, or anyone else has
> > stuff for me to try.
> 
> Please let me know if you have questions or need further assistance 
> clearing that pesky reservation. :)
> 
> > Thanks all,
> > -C
> >
> 



Ryan,

Thank you so much for your replies.

I tracked down the registration and the reservation to the first cluster
node by converting the hex key to an IP per your instructions. All nodes
reported only this one registration.

On that node, I try:
# sg_persist -C --out /dev/sdc

and it returns a failure, citing a scsi reservation conflict.

I then try on kop-sds-01, the node holding the reservation:

#/etc/init.d/scsi_reserve stop
  connect() failed on local socket: Connection refused
  No volume groups found


Now, I initially had clvmd running and volume groups defined, but since
I'm running on a NetApp that does all of that stuff, I decided to
simplify things and removed them early on, after my first attempts at
troubleshooting this problem. Are these reservations somehow stuck
looking at an old LVM configuration somewhere?

Thanks!

-C

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

