
Re: [Linux-cluster] Locking and performance questions regarding GFS1/2

Hi Steven,

Thank you for your answer,

On Mon, 14 Jan 2008 14:18:54 +0000,
Steven Whitehouse <swhiteho redhat com> wrote:

> Basically yes. It reads all the RGs, although in the allocation case
> it doesn't need to read all the RGs to work out where to put newly
> allocated blocks, it only needs to read some of them. That also needs
> to be fixed at some stage in the future.

If I trust the code (and I _do_ trust the code :-) )

cluster-1.03, rgrp.c, L1193:
static int
get_local_rgrp(struct gfs_inode *ip)
	for (;;) {
		error = gfs_glock_nq_init(rgd->rd_gl,
					  LM_ST_EXCLUSIVE, flags,
		switch (error) {
		case 0:
			if (try_rgrp_fit(rgd, al))
				goto out;

When the FS is freshly formatted, we should return almost instantly from
this function; however, that is not the behaviour I observed. What am I
getting wrong?
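
The expectation above can be illustrated with a toy model that is not the
GFS code: a first-fit loop over resource groups returns after examining a
single RG when every RG still has free space. The names toy_rgrp and
first_fit are hypothetical, and the sketch ignores glocks and disk I/O
entirely, so it says nothing about where the observed time is spent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a resource group. */
struct toy_rgrp {
	uint64_t free_blocks;
};

/*
 * First-fit scan: return the index of the first RG that can hold
 * `wanted` blocks, or -1 if none can. `*examined` receives the number
 * of RGs that were looked at before returning.
 */
static int first_fit(const struct toy_rgrp *rgs, size_t n,
		     uint64_t wanted, size_t *examined)
{
	assert(rgs != NULL && examined != NULL);

	*examined = 0;
	for (size_t i = 0; i < n; i++) {
		(*examined)++;
		if (rgs[i].free_blocks >= wanted)
			return (int)i;
	}
	return -1;
}
```

With three empty RGs the scan stops at index 0 after one step; it only
walks further when the leading RGs are full.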

> RGs are limited to 2^32 blocks, including the RG header. Generally you
> want to use a number of RGs >> number of nodes. Provided this is true
> then you can make the RGs as large as you like (up to the 2^32 block
> limit) without compromising performance.

Then I guess the attached patch could help with having RGs sized more than
The patch is not complete: the 2^32 block limit is not expressed
correctly, although it proved to work on a cluster of six 64-bit nodes,
with RG sizes of 4G, 8G and 16G on a 1T FS. In those cases, I got 256,
128 and 64 RGs respectively. Do those numbers respect the constraint
"RGs >> number of nodes"
as you expressed it?
(I guess it depends on the usage of the FS: lots
of cross-accesses from one node to other nodes' files, or at the other
extreme all nodes working mostly on their own files/directories?)
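
The RG counts quoted above, and one possible way of expressing the 2^32
block limit, can be checked with a short sketch. It is only an
illustration under stated assumptions: "G" and "T" are taken as binary
units, the RG size is taken to be given in megabytes, the block size
(4096 in the usage note) is an assumed value, and space used by journals
and other metadata is ignored. The helper names rg_count, max_rg_bytes and
rg_size_ok are hypothetical and do not exist in gfs_mkfs.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)
#define GIB (UINT64_C(1) << 30)
#define TIB (UINT64_C(1) << 40)

/* Number of whole RGs of rg_bytes that fit in fs_bytes. */
static uint64_t rg_count(uint64_t fs_bytes, uint64_t rg_bytes)
{
	assert(rg_bytes > 0);
	return fs_bytes / rg_bytes;
}

/* Largest RG size in bytes allowed by a 2^32-block limit. */
static uint64_t max_rg_bytes(uint32_t block_size)
{
	assert(block_size > 0);
	/* 2^32 * block_size cannot overflow 64 bits for a 32-bit block_size. */
	return (UINT64_C(1) << 32) * block_size;
}

/*
 * Return 1 if an RG of rg_mb megabytes is at least 1 MB and stays
 * within 2^32 blocks of block_size bytes, 0 otherwise.
 */
static int rg_size_ok(uint64_t rg_mb, uint32_t block_size)
{
	/* Compare in megabytes so that rg_mb is never multiplied. */
	return rg_mb >= 1 && rg_mb <= max_rg_bytes(block_size) / MIB;
}
```

Under these assumptions, rg_count(TIB, 4 * GIB), rg_count(TIB, 8 * GIB)
and rg_count(TIB, 16 * GIB) give 256, 128 and 64, matching the numbers
above, and max_rg_bytes(4096) is 16 TiB, so a 16G RG passes rg_size_ok
with a wide margin.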

> Steve.


Index: cluster/gfs/gfs_mkfs/main.c
--- cluster/gfs/gfs_mkfs/main.c	()
+++ cluster/gfs/gfs_mkfs/main.c	(copie de travail)
@@ -333,12 +333,12 @@
   if (comline.expert)
-    if (1 > comline.rgsize || comline.rgsize > 2048)
+    if (1 > comline.rgsize)
       die("bad resource group size\n");
-    if (32 > comline.rgsize || comline.rgsize > 2048)
+    if (32 > comline.rgsize)
       die("bad resource group size\n");
Index: cluster/gfs/gfs_mkfs/mkfs_gfs.h
--- cluster/gfs/gfs_mkfs/mkfs_gfs.h	()
+++ cluster/gfs/gfs_mkfs/mkfs_gfs.h	(copie de travail)
@@ -101,7 +101,7 @@
   uint32 seg_size;          /*  The journal segment size  */
   uint32 journals;          /*  Number of journals  */
   uint32 jsize;             /*  Size of journals  */
-  uint32 rgsize;            /*  The Resource Group size  */
+  uint64 rgsize;            /*  The Resource Group size  */
   int debug;                /*  Print out debugging information?  */
   int quiet;                /*  No messages  */
