[Linux-cluster] GFS profiling result

David Teigland teigland at redhat.com
Wed Sep 12 20:45:25 UTC 2007


On Thu, Sep 06, 2007 at 10:58:40AM +0200, Mark Hlawatschek wrote:
> Hi,
> 
> during a performance analysis and tuning session, I did some profiling with 
> oprofile on GFS and dlm. 
> I got some weird results ... 
> 
> The installed software is:
> RHEL4u5, kernel 2.6.9-55.0.2.ELsmp
> GFS:  2.6.9-72.2.0.2
> DLM: 2.6.9-46.16.0.1
> 
> The configuration includes 2 clusternodes.
> 
> I put the following load on one cluster node:
> 
> 100 processes each do the following in parallel:
> - create 1000 files of 100 KB each (100,000 files altogether)
> - flock 1000 files
> - unlink 1000 files
> 
> The following oprofile output shows that the system spends about 49% 
> (75% * 65%) of the time in gfs_unlinked_get.
> Looking into the code, we can see that this is related to unlinked.c:
>      53 9394211 58.7081 :                       ul = list_entry(tmp, struct gfs_unlinked, ul_list);
> 
> It can also be observed that dlm spends more than 50% of its time 
> searching for hashes...
> 
> Is this the expected behaviour, or can this be tuned somewhere?

Thanks for doing this, it's very interesting.  For the dlm
search_hashchain, could you try changing rsbtbl_size to 1024 (the default
is 256)?  Run echo 1024 > /proc/.../rsbtbl_size after loading the dlm
module, but before the lockspace is created.
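
As a rough illustration of why a larger table should help, here is a
small C sketch of the average chain length per bucket.  It assumes the
hash spreads resources evenly, and the helper name avg_chain_len is
made up for this example; it is not part of the dlm.

```c
#include <stddef.h>

/*
 * Average number of entries per hash bucket when `entries` items are
 * spread evenly over `buckets` buckets (rounded up).  Hypothetical
 * helper for illustration only; not a dlm function.
 */
static size_t avg_chain_len(size_t entries, size_t buckets)
{
    return (entries + buckets - 1) / buckets;
}
```

If, purely as an assumption, there were 100,000 resources in the table
at once, this gives about 391 entries per chain with 256 buckets and
about 98 with 1024, so each lookup would walk roughly a quarter as many
entries.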

For gfs, I haven't looked very closely, but the linked list could probably
be simply turned into a hash table.  We'd want to study it more closely to
make sure that the long non-hashed list is really the right thing to fix
(i.e. we don't want to just fix a symptom of something else).
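
To make the idea concrete, here is a minimal userspace sketch of
replacing one long list with hashed buckets.  It is not GFS code: the
ul_entry/ul_table types, the bucket count, and the choice of an inode
number as the lookup key are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define UL_HASH_BITS    10                      /* hypothetical size */
#define UL_HASH_BUCKETS (1u << UL_HASH_BITS)    /* 1024 buckets */

/* Illustrative stand-in for an unlinked-inode record (assumed layout). */
struct ul_entry {
    uint64_t inum;              /* lookup key, assumed to be an inode number */
    struct ul_entry *next;      /* next entry in the same bucket */
};

struct ul_table {
    struct ul_entry *bucket[UL_HASH_BUCKETS];
};

/* Multiplicative hash: keep the top UL_HASH_BITS bits of the product. */
static unsigned ul_hash(uint64_t inum)
{
    return (unsigned)((inum * 0x9E3779B97F4A7C15ULL) >> (64 - UL_HASH_BITS));
}

/* Push an entry onto the front of its bucket's chain. */
static void ul_insert(struct ul_table *t, struct ul_entry *e)
{
    unsigned h = ul_hash(e->inum);

    e->next = t->bucket[h];
    t->bucket[h] = e;
}

/* Walk only the one bucket the key hashes to, not the whole set. */
static struct ul_entry *ul_find(const struct ul_table *t, uint64_t inum)
{
    struct ul_entry *e;

    for (e = t->bucket[ul_hash(inum)]; e != NULL; e = e->next)
        if (e->inum == inum)
            return e;
    return NULL;
}
```

A lookup then touches one short chain instead of every entry, which is
the cost the profile points at.  Locking and entry lifetime are left out
here and would need the same care as in the existing list code.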

Dave



