Re: [Linux-cluster] dlm and IO speed problem <er, might wanna get a coffee first ; )>
- From: Kadlecsik Jozsef <kadlec sunserv kfki hu>
- To: linux clustering <linux-cluster redhat com>
- Subject: Re: [Linux-cluster] dlm and IO speed problem <er, might wanna get a coffee first ; )>
- Date: Fri, 11 Apr 2008 13:05:08 +0200 (CEST)
On Thu, 10 Apr 2008, Kadlecsik Jozsef wrote:
> But this is a good clue to what might bite us most! Our GFS cluster is an
> almost mail-only cluster for users with Maildir. When the users experience
> temporary hangups for several seconds (even when writing a new mail), it
> might be due to the concurrent scanning for a new mail on one node by the
> MUA and the delivery to the Maildir in another node by the MTA.
>
> What is really strange (and disturbing) is that such "hangups" can take
> 10-20 seconds, which is just too much for the users.
Yesterday we started to monitor the number of locks/held locks on two of
the machines. The results from the first day can be found at
http://www.kfki.hu/~kadlec/gfs/.
It looks as if Maildir is definitely the wrong choice for GFS and we should
consider converting to mailbox format: at least I cannot explain the
spikes in any other way.
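
For anyone who wants to collect similar numbers, here is a minimal sketch
of how the lock counts could be sampled. It is not the script we actually
use. It assumes that "gfs_tool counters <mountpoint>" prints one
"name value" pair per line, with names such as "locks" and "locks held";
the helper names and the mount point are made up for illustration.

```python
import subprocess


def parse_counters(text):
    """Parse 'name value' lines (assumed `gfs_tool counters` format)."""
    counters = {}
    for line in text.splitlines():
        # Split on the last whitespace run: names such as "locks held"
        # contain spaces themselves.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            counters[name.strip()] = int(value)
        except ValueError:
            continue  # last field is not an integer, skip the line
    return counters


def read_counters(mountpoint):
    """Run gfs_tool on a mounted GFS filesystem and parse its output."""
    result = subprocess.run(
        ["gfs_tool", "counters", mountpoint],
        capture_output=True, text=True, check=True,
    )
    return parse_counters(result.stdout)


def held_ratio(counters):
    """Fraction of glocks currently held; 0.0 if there are no locks."""
    locks = counters.get("locks", 0)
    return counters.get("locks held", 0) / locks if locks else 0.0
```

Calling read_counters("/mnt/gfs") (a hypothetical mount point) at a fixed
interval and logging "locks" and "locks held" with a timestamp should be
enough to make spikes like ours visible.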
> In order to look at the possible tuning options and the side effects, I
> list what I have learned so far:
>
> - Increasing glock_purge (percent, default 0) helps to trim back the
> unused glocks by gfs_scand itself. Otherwise glocks can accumulate and
> gfs_scand eats more and more time at scanning the larger and
> larger table of glocks.
> - gfs_scand wakes up every scand_secs (default 5s) to scan the glocks,
> looking for work to do. By increasing scand_secs one can lessen the load
> produced by gfs_scand, but it'll hurt because flushing data can be
> delayed.
> - Decreasing demote_secs (seconds, default 300) helps to flush cached data
> more often by moving write locks into less restricted states. Flushing
> often helps to avoid burstiness *and* to prolong other nodes'
> lock access. The question is: what are the side effects of small
> demote_secs values? (There is probably not much point in choosing
> a demote_secs value smaller than scand_secs.)
>
> Currently we are running with 'glock_purge = 20' and 'demote_secs = 30'.
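
The settings quoted above could be applied with something like the sketch
below. It assumes the tunables are set with
"gfs_tool settune <mountpoint> <name> <value>"; the mount point and the
function names are hypothetical. The sanity check only encodes the two
points from the list: glock_purge is a percentage, and a demote_secs
smaller than scand_secs probably gains nothing.

```python
import subprocess


def settune_commands(mountpoint, tunables):
    """Build one `gfs_tool settune` command line per tunable."""
    return [
        ["gfs_tool", "settune", mountpoint, name, str(value)]
        for name, value in tunables.items()
    ]


def check_tunables(tunables, default_scand_secs=5):
    """Return warnings for settings that look questionable."""
    warnings = []
    scand = tunables.get("scand_secs", default_scand_secs)
    demote = tunables.get("demote_secs")
    if demote is not None and demote < scand:
        warnings.append(
            "demote_secs (%d) is smaller than scand_secs (%d)" % (demote, scand)
        )
    purge = tunables.get("glock_purge")
    if purge is not None and not 0 <= purge <= 100:
        warnings.append("glock_purge (%d) is not a percentage" % purge)
    return warnings


def apply_tunables(mountpoint, tunables, dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for warning in check_tunables(tunables):
        print("warning:", warning)
    for cmd in settune_commands(mountpoint, tunables):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Values from this thread; "/mnt/gfs" is a placeholder mount point.
    apply_tunables("/mnt/gfs", {"glock_purge": 20, "demote_secs": 30})
```

With dry_run left at True the script only prints the commands, so they can
be reviewed before anything is changed on a production node.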
Best regards,
Jozsef
--
E-mail : kadlec mail kfki hu, kadlec blackhole kfki hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary