[Linux-cluster] Node lag

Thu Feb 9 14:41:24 UTC 2006

Also, I think it might be interesting to see what happens when you use data
sizes that will overrun any caching being done. I've seen great performance
from a simple MSA1000 as long as there is a lot of cache available on the SAN
itself. As soon as I run tests with data sets larger than the cache size,
performance falls through the floor. Unless you're overloading the cache, you
might not be getting a true measure of what's really getting written to disk.

Maybe the slow node is getting hit by cache overhead from the SAN? 
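
A minimal sketch of that kind of test, assuming Python is available on the
node. It writes files of increasing size, forces them out of the host page
cache with fsync, and reports the rate for each size. The function names
(timed_write, sweep) are made up for illustration. Note that fsync only
pushes data to the storage device; a SAN controller's write cache can still
absorb it, which is why the sizes should be stepped past the cache size.

```python
import os
import time


def timed_write(path, total_bytes, block_size=1 << 20):
    """Write total_bytes to path, fsync, and return the rate in MB/s."""
    block = os.urandom(block_size)  # random data, reused for every block
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # include the flush to the device in the timing
    elapsed = time.monotonic() - start
    return written / (1024 * 1024) / max(elapsed, 1e-9)


def sweep(path, sizes):
    """Run timed_write once per size; return a list of (size, MB/s) pairs."""
    results = []
    for size in sizes:
        results.append((size, timed_write(path, size)))
        os.remove(path)  # do not leave large test files behind
    return results
```

For example, sweep("/mnt/gfs/cache_test.bin", [64 << 20, 512 << 20, 4 << 30])
would step from 64 MB to 4 GB; the path and the sizes here are placeholders
and should be chosen so that the largest one clearly exceeds the cache on
your array. A sharp drop in rate between two sizes suggests the smaller one
was still being served from cache.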

Just a thought

Corey

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Patrick Caulfield
Sent: Thursday, February 09, 2006 9:18 AM
To: linux clustering
Subject: Re: [Linux-cluster] Node lag

Frank Schliefer wrote:
> Hi,
> 
> after setting up a four-node cluster we have one node that is way 
> slower than the other 3 nodes.
> 
> We are using e.g. tiotest for benchmarking the GFS.
> 
> Normal Node:
> Tiotest results for 4 concurrent io threads:
> ,------------------------------------------------------------------------.
> | Item                  | Time     | Rate          | Usr CPU  | Sys CPU  |
> +-----------------------+----------+---------------+----------+----------+
> | Write          40 MBs |    0.2 s |  227.426 MB/s |  36.4 %  |  384.4 % |
> | Random Write   16 MBs |    0.1 s |  143.405 MB/s |  58.7 %  |  146.9 % |
> | Read           40 MBs |    0.0 s | 2558.199 MB/s | 307.0 %  | 1228.0 % |
> | Random Read    16 MBs |    0.0 s | 2685.169 MB/s | 550.0 %  | 1374.9 % |
> `------------------------------------------------------------------------'
> 
> 
> Slow Node:
> Tiotest results for 4 concurrent io threads:
> ,------------------------------------------------------------------------.
> | Item                  | Time     | Rate          | Usr CPU  | Sys CPU  |
> +-----------------------+----------+---------------+----------+----------+
> | Write          40 MBs |    1.4 s |   27.687 MB/s |   2.2 %  |  121.8 % |
> | Random Write   16 MBs |    4.2 s |    3.695 MB/s |   0.0 %  |    7.9 % |
> | Read           40 MBs |    0.0 s | 2228.288 MB/s |  89.1 %  | 1337.1 % |
> | Random Read    16 MBs |    0.0 s | 2252.739 MB/s | 230.7 %  |  692.1 % |
> `------------------------------------------------------------------------'
> 
> Any hints why this could happen?
> 
> Using kernel 2.6.15.2 (sorry no RH)

It would be helpful if you could give us more information about your
installation: disk topology, lock manager in use (and which nodes are
lockservers if using GULM) and whether it matters which nodes are started
first or not.

-- 

patrick

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster