RE: [Linux-cluster] RE: Errors trying to login to LT000: ...1006:Not Allowed
- From: "Treece, Britt" <Britt Treece savvis net>
- To: "linux clustering" <linux-cluster redhat com>
- Subject: RE: [Linux-cluster] RE: Errors trying to login to LT000: ...1006:Not Allowed
- Date: Wed, 7 Mar 2007 09:17:17 -0600
Does anyone have any idea why incorrect entries in /etc/hosts on the
lock servers would only intermittently cause "Errors trying to login to
LT000: ...1006:Not Allowed"? I would think that a wrong entry should
*consistently* keep the client out of the lockspace.
Additionally, can anyone explain the fundamentals of GFS 6.0 lock tables
and the locking process? A couple of specific questions I have:
What is the difference between LTPX and LT000?
What is the advantage of having additional lock tables, and when would
having more be a disadvantage?
Is each lock propagated to every lock table, or is it held in only one
table?
Does the highwater mark apply to each lock table, or to the sum of locks
across all lock tables?
Regards,
Britt Treece
Not sure why my first post didn't go through, but here it is...
---
I am running a 13-node GFS (6.0.2.33) cluster with 10 mounting clients
and 3 dedicated lock servers. The master lock server was rebooted and
the next slave in the voting order took over. At that time, 3 of the
client nodes started receiving login errors for the ltpx server:
Mar 4 00:05:52 lock1 lock_gulmd_core[3798]: Master Node Is Logging Out NOW!
...
Mar 4 00:05:52 lock2 lock_gulmd_core[24627]: Master Node has logged out.
Mar 4 00:05:54 lock2 lock_gulmd_core[24627]: I see no Masters, So I am Arbitrating until enough Slaves talk to me.
Mar 4 00:05:54 lock2 lock_gulmd_LTPX[24638]: New Master at lock2 :192.168.1.3
Mar 4 00:05:56 lock2 lock_gulmd_core[24627]: Now have Slave quorum, going full Master.
Mar 4 00:11:39 lock2 lock_gulmd_core[24627]: Master Node Is Logging Out NOW!
…
Mar 4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 "
Mar 4 00:05:52 client1 lock_gulmd_core[9383]: Master Node has logged out.
Mar 4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 "
Mar 4 00:05:56 client1 lock_gulmd_core[9383]: Found Master at lock2 , so I'm a Client.
Mar 4 00:05:56 client1 lock_gulmd_core[9383]: Failed to receive a timely heartbeat reply from Master. (t:1172988356370685 mb:1)
Mar 4 00:05:56 client1 lock_gulmd_LTPX[9390]: New Master at lock2 :192.168.1.3
Mar 4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT004: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT001: (lock2 :192.168.1.3) 1006:Not Allowed
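To see at a glance which clients are being refused by which lock tables, the "1006:Not Allowed" entries can be tallied per host and table. The following is only a rough sketch: the regular expression is inferred from the handful of log lines shown above rather than from any documented lock_gulmd log format, and `tally_login_errors` is a made-up helper name.

```python
import re
from collections import Counter

# Sample entries copied from the client1 excerpt above, one entry per line.
SAMPLE = """\
Mar 4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT004: (lock2 :192.168.1.3) 1006:Not Allowed
Mar 4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT001: (lock2 :192.168.1.3) 1006:Not Allowed
"""

# Assumed line shape: "<month> <day> <time> <host> lock_gulmd_LTPX[pid]: Errors ... LTnnn: ... 1006:Not Allowed"
LOGIN_ERROR = re.compile(
    r"^\w+\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+lock_gulmd_LTPX\[\d+\]:\s+"
    r"Errors trying to login to (?P<table>LT\d+):.*1006:Not Allowed"
)


def tally_login_errors(text):
    """Count 'Not Allowed' login errors per (reporting host, lock table)."""
    counts = Counter()
    for line in text.splitlines():
        match = LOGIN_ERROR.match(line)
        if match:
            counts[(match.group("host"), match.group("table"))] += 1
    return counts


if __name__ == "__main__":
    for (host, table), n in sorted(tally_login_errors(SAMPLE).items()):
        print(f"{host} {table} {n}")
```

Run against the full syslog from every client, a tally like this would show whether the refusals are limited to the 3 affected nodes and whether they hit every lock table or only some.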
---
Britt
On 3/5/07 10:30 PM, "Treece, Britt" <Britt Treece savvis net> wrote:
All,
After much further investigation, I found that /etc/hosts on all 3 lock
servers is off by one for these 3 client nodes. Having fixed the typos,
is it safe to assume that the root of the problem trying to login to
LTPX was that /etc/hosts on the lock servers was wrong for these nodes?
If so, why were these 3 clients allowed into the cluster when it was
originally started, given that they had incorrect entries in /etc/hosts?
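For anyone wanting to audit the same thing, a check along these lines could be run against a copy of /etc/hosts from each lock server. It is a minimal sketch, not a GFS tool: `parse_hosts` and `find_mismatches` are hypothetical helpers, the client names and addresses in the example are invented for illustration (only lock2 at 192.168.1.3 comes from the logs), and it assumes "off by one" means an address that differs by one from the intended one.

```python
import ipaddress


def parse_hosts(text):
    """Return {hostname: ip} from /etc/hosts-style text (first entry wins)."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name, ip)
    return table


def find_mismatches(hosts_text, expected):
    """List (name, expected_ip, actual_ip_or_None) for every wrong or missing entry."""
    actual = parse_hosts(hosts_text)
    problems = []
    for name, want in sorted(expected.items()):
        got = actual.get(name)
        if got is None or ipaddress.ip_address(got) != ipaddress.ip_address(want):
            problems.append((name, want, got))
    return problems


if __name__ == "__main__":
    # Hypothetical hosts file as it might appear on one lock server.
    hosts_on_lock_server = """\
192.168.1.3   lock2
192.168.1.11  client1
192.168.1.13  client2   # invented example of an off-by-one address
"""
    # Hypothetical "known good" mapping to compare against.
    expected = {
        "lock2": "192.168.1.3",
        "client1": "192.168.1.11",
        "client2": "192.168.1.12",
    }
    for name, want, got in find_mismatches(hosts_on_lock_server, expected):
        print(f"{name}: expected {want}, found {got}")
```

Comparing every lock server's file against one agreed mapping, rather than against each other, also catches the case seen here where all 3 servers carry the same mistake.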
Regards,
Britt Treece
--
Linux-cluster mailing list
Linux-cluster redhat com
https://www.redhat.com/mailman/listinfo/linux-cluster