[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Linux-cluster] test hung after 36 hours



On Mon, Apr 11, 2005 at 05:13:06PM -0700, Daniel McNeil wrote:
> I started my mount/tar/rm/ tests on Apr  4 17:41 and I hit
> a problem at Apr  6 05:30.  So the test ran for 36 hours.
> cl030 and cl031 were getting "SM: process_reply invalid"
> messages and cl032 got "No response" and "Missed too many
> heartbeats"

The SM messages are an effect of CMAN removing nodes.  There's a fair
chance that this recent fix will help:
http://sources.redhat.com/ml/cluster-cvs/2005-q2/msg00018.html

-- 
Dave Teigland  <teigland redhat com>


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]