
Re: [Linux-cluster] gfs2, kvm setup



Hi,

On Tue, 2008-07-08 at 18:15 -0400, J. Bruce Fields wrote:
> On Mon, Jul 07, 2008 at 02:49:28PM -0400, bfields wrote:
> > On Mon, Jul 07, 2008 at 10:48:28AM -0500, David Teigland wrote:
> > > On Sun, Jul 06, 2008 at 05:51:05PM -0400, J. Bruce Fields wrote:
> > > > -	write(control_fd, in, sizeof(struct gdlm_plock_info));
> > > > +	write(control_fd, in, sizeof(struct dlm_plock_info));
> > > 
> > > Gah, sorry, I keep fixing that and it keeps reappearing.
> > > 
> > > 
> > > > Jul  1 14:06:42 piglet2 kernel: dlm: connect from non cluster node
> > > 
> > > > It looks like dlm_new_workspace() is waiting on dlm_recoverd, which is
> > > > in "D" state in dlm_rcom_status(), so I guess the second node isn't
> > > > getting some dlm reply it expects?
> > > 
> > > dlm inter-node communication is not working here for some reason.  There
> > > must be something unusual with the way the network is configured on the
> > > nodes, and/or a problem with the way the cluster code is applying the
> > > network config to the dlm.
> > > 
> > > Ah, I just remembered what this sounds like; we see this kind of thing
> > > when a network interface has multiple IP addresses, and/or routing is
> > > configured strangely.  Others cc'ed could offer better details on exactly
> > > what to look for.
> > 
> > OK, thanks!  I'm trying to run gfs2 on 4 kvm machines, I'm an expert on
> > neither, and it's entirely likely there's some obvious misconfiguration.
> > On the kvm host there are 4 virtual interfaces bridged together:
> 
> I ran wireshark on vnet0 while doing the second mount; what I saw was
> the second machine opened a tcp connection to port 21064 on the first
> (which had already completed the mount), and sent it a single message
> identified by wireshark as "DLM3" protocol, type recovery command:
> status command.  It got back an ACK then a RST.
> 
> Then the same happened in the other direction, with the first machine
> sending a similar message to port 21064 on the second, which then reset
> the connection.
> 
> --b.
> 
An ACK and an RST for the same packet? Or was that a SYN-ACK for the SYN,
and then an RST for the following data packet? Could you post the trace
or put it somewhere we can see it?
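
To make the distinction concrete, here is a rough sketch of how the two
cases differ when you look only at the TCP flags of the segments sent back
by the listening side (port 21064). It is not taken from the capture: the
function name classify() and the input format (a list of raw flag bytes,
in order) are assumptions for illustration. It also assumes that a plain
ACK seen after the SYN-ACK is acknowledging the data packet.

```python
# TCP header flag bits, as defined by the TCP specification.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def classify(responder_flags):
    """Classify the segments sent back by the listening side.

    responder_flags is a hypothetical input: the flag byte of each
    segment the listener sent, in the order it appeared on the wire.
    """
    saw_synack = False
    saw_data_ack = False
    for flags in responder_flags:
        # Check RST first: a reset often carries the ACK bit as well.
        if flags & RST:
            if saw_data_ack:
                return "data ACKed, then RST"
            if saw_synack:
                return "SYN-ACK, then RST without ACKing data"
            return "RST before handshake completed"
        if flags & SYN and flags & ACK:
            saw_synack = True
        elif flags & ACK:
            # Assumption: a plain ACK after the handshake acknowledges data.
            saw_data_ack = True
    return "no RST seen"


if __name__ == "__main__":
    # Case 1: the data packet was acknowledged and the connection then reset.
    print(classify([SYN | ACK, ACK, RST]))
    # Case 2: only the SYN was acknowledged; the data packet drew the reset.
    print(classify([SYN | ACK, RST | ACK]))
```

The first result would point at the receiving side accepting the data and
then dropping the connection; the second would mean the data was never
acknowledged at all.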

Steve.


