
[Linux-cluster] Re: mkfs.gfs2 issue...



>>> Nick Couchman 03/23/07 11:57 AM >>>
I'm currently evaluating GFS2 for some clustering that I want to do, and I've run into a little problem.  I'm using kernel 2.6.20.3 (with GFS2 included) and the GFS2 userspace stuff from the RH Cluster page (sourceware.org/cluster).  I've tried both the "official" 2.0.0 release and the latest CVS version, and both exhibit the same behavior: when I try to make a GFS2 filesystem, mkfs.gfs2 just hangs.  I'm doing this on a 1 GB iSCSI volume, and the host has already transferred 6.9GB of data.  What in the world is it doing?!  If I enable debug output, I get the following:

Command Line Arguments:
  qcsize = 1
  jsize = 32
  journals = 2
  override = 0
  proto = lock_dlm
  quiet = 0
  rgsize = optimize for best performance
  table = fstest:testfs
  utsize = 1
  device = /dev/sdb1
This will destroy any data on /dev/sdb1.
  It appears to contain a ext3 filesystem.

Are you sure you want to proceed? [y/n] y


Partition size = 1955808

Device Geometry:  (in basic blocks)
  SubDevice #0: start = 0, length = 1955808, rgf_flags = 0x00000000

Device Geometry:  (in FS blocks)
  SubDevice #0: start = 0, length = 244476, rgf_flags = 0x00000000

Device Size: 244476

Data Subdevice 0
  rg sz = 256
  nrgrp = 4

subdevice 0:  rg_o = 17, rg_l = 61117
subdevice 0:  rg_o = 61134, rg_l = 61114
subdevice 0:  rg_o = 122248, rg_l = 61114
subdevice 0:  rg_o = 183362, rg_l = 61114

  ri_addr:: 17  ri_length:: 4  ri_data0:: 21  ri_data:: 61112  ri_bitbytes:: 15278
  ri_addr:: 61134  ri_length:: 4  ri_data0:: 61138  ri_data:: 61108  ri_bitbytes:: 15277
  ri_addr:: 122248  ri_length:: 4  ri_data0:: 122252  ri_data:: 61108  ri_bitbytes:: 15277
  ri_addr:: 183362  ri_length:: 4  ri_data0:: 183366  ri_data:: 61108  ri_bitbytes:: 15277
Root directory:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 1  no_addr:: 21  di_mode:: 040755  di_uid:: 0  di_gid:: 0  di_nlink:: 2  di_size:: 3864  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 21  di_goal_data:: 21  di_flags:: 0x00000001  di_payload_format:: 1200  di_height:: 0  di_depth:: 0  di_entries:: 2  di_eattr:: 0
Master dir:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 2  no_addr:: 22  di_mode:: 040755  di_uid:: 0  di_gid:: 0  di_nlink:: 2  di_size:: 3864  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 22  di_goal_data:: 22  di_flags:: 0x00000201  di_payload_format:: 1200  di_height:: 0  di_depth:: 0  di_entries:: 2  di_eattr:: 0
Super Block:
  mh_magic:: 0x01161970  mh_type:: 1  mh_format:: 100  sb_fs_format:: 1801  sb_multihost_format:: 1900  sb_bsize:: 4096  sb_bsize_shift:: 12  no_formal_ino:: 2  no_addr:: 22  no_formal_ino:: 1  no_addr:: 21  sb_lockproto:: lock_dlm  sb_locktable:: fstest:testfs
Journal 0:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 4  no_addr:: 24  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 33554432  di_blocks:: 8210  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 41  di_goal_data:: 8233  di_flags:: 0x00000200  di_payload_format:: 0  di_height:: 2  di_depth:: 0  di_entries:: 0  di_eattr:: 0
Journal 1:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 5  no_addr:: 8234  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 33554432  di_blocks:: 8210  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 8251  di_goal_data:: 16443  di_flags:: 0x00000200  di_payload_format:: 0  di_height:: 2  di_depth:: 0  di_entries:: 0  di_eattr:: 0
Jindex:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 3  no_addr:: 23  di_mode:: 040700  di_uid:: 0  di_gid:: 0  di_nlink:: 2  di_size:: 3864  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 23  di_goal_data:: 23  di_flags:: 0x00000201  di_payload_format:: 1200  di_height:: 0  di_depth:: 0  di_entries:: 4  di_eattr:: 0
Inum Range 0:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 7  no_addr:: 16445  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 16  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16445  di_goal_data:: 16445  di_flags:: 0x00000201  di_payload_format:: 0  di_height:: 0  di_depth:: 0  di_entries:: 0  di_eattr:: 0
StatFS Change 0:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 8  no_addr:: 16446  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 24  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16446  di_goal_data:: 16446  di_flags:: 0x00000201  di_payload_format:: 0  di_height:: 0  di_depth:: 0  di_entries:: 0  di_eattr:: 0
Quota Change 0:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 9  no_addr:: 16447  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 1048576  di_blocks:: 257  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16447  di_goal_data:: 16703  di_flags:: 0x00000200  di_payload_format:: 0  di_height:: 1  di_depth:: 0  di_entries:: 0  di_eattr:: 0
Inum Range 1:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 10  no_addr:: 16704  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 16  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16704  di_goal_data:: 16704  di_flags:: 0x00000201  di_payload_format:: 0  di_height:: 0  di_depth:: 0  di_entries:: 0  di_eattr:: 0
StatFS Change 1:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 11  no_addr:: 16705  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 24  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16705  di_goal_data:: 16705  di_flags:: 0x00000201  di_payload_format:: 0  di_height:: 0  di_depth:: 0  di_entries:: 0  di_eattr:: 0
Quota Change 1:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 12  no_addr:: 16706  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 1048576  di_blocks:: 257  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16706  di_goal_data:: 16962  di_flags:: 0x00000200  di_payload_format:: 0  di_height:: 1  di_depth:: 0  di_entries:: 0  di_eattr:: 0
per_node:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 6  no_addr:: 16444  di_mode:: 040700  di_uid:: 0  di_gid:: 0  di_nlink:: 2  di_size:: 3864  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16444  di_goal_data:: 16444  di_flags:: 0x00000201  di_payload_format:: 1200  di_height:: 0  di_depth:: 0  di_entries:: 8  di_eattr:: 0
Inum Inode:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 13  no_addr:: 16963  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 0  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16963  di_goal_data:: 16963  di_flags:: 0x00000201  di_payload_format:: 0  di_height:: 0  di_depth:: 0  di_entries:: 0  di_eattr:: 0
StatFS Inode:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 14  no_addr:: 16964  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 0  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16964  di_goal_data:: 16964  di_flags:: 0x00000201  di_payload_format:: 0  di_height:: 0  di_depth:: 0  di_entries:: 0  di_eattr:: 0
Resource Index:
  mh_magic:: 0x01161970  mh_type:: 4  mh_format:: 400  no_formal_ino:: 15  no_addr:: 16965  di_mode:: 0100600  di_uid:: 0  di_gid:: 0  di_nlink:: 1  di_size:: 384  di_blocks:: 1  di_atime:: 1174670528  di_mtime:: 1174670528  di_ctime:: 1174670528  di_major:: 0  di_minor:: 0  di_goal_meta:: 16965  di_goal_data:: 16965  di_flags:: 0x00000201  di_payload_format:: 1100  di_height:: 0  di_depth:: 0  di_entries:: 0  di_eattr:: 0
Root quota:
  qu_limit:: 0  qu_warn:: 0  qu_value:: 1
Next Inum: 17

Statfs:

...and that's where it stops.  To set things up, I compiled the sources, then did the following (as per the usage instructions):
1) Wrote configuration file - very simple, two-host configuration.
2) Load modules gfs2, dlm, lock_dlm, and no_lock
3) Mount configfs in /sys/kernel/config
4) ccsd
5) cman_tool join
6) groupd
7) fenced
8) fence_tool join
9) dlm_controld
10) gfs_controld
11) I don't have clvmd, so I didn't start that, but the usage.txt file says it's optional.
12) mkfs.gfs2 -D -p lock_dlm -t fstest:testfs -j 2 /dev/sdb1

and it just hangs.  There are no funny messages in either dmesg or /var/log/messages - just normal cluster operation (nodes being added when I start up things on both systems, etc.).  Can anyone shed any light on what might be happening here - why it's hanging up and transmitting so much data while formatting such a small volume?
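
A quick cross-check of the numbers in the debug output above, as a minimal Python sketch.  It assumes the "basic blocks" are 512-byte sectors (an assumption, though it is consistent with the 8:1 ratio to the 4096-byte FS blocks) and reads the reported 6.9GB as decimal gigabytes.  All figures are copied from the output; the variable names are only illustrative.

```python
# Figures copied from the mkfs.gfs2 -D output; names are illustrative only.
BASIC_BLOCK = 512      # assumption: a "basic block" is a 512-byte sector
FS_BLOCK = 4096        # sb_bsize from the super block dump

partition_basic_blocks = 1955808
device_fs_blocks = 244476
rg_offset = 17                              # rg_o of the first resource group
rg_lengths = [61117, 61114, 61114, 61114]   # rg_l of each resource group
journal_bytes = 33554432                    # di_size of each journal
journals = 2
transferred_bytes = 6.9e9                   # assumption: 6.9GB taken as decimal GB

# Device size computed two ways; both should agree.
device_bytes = partition_basic_blocks * BASIC_BLOCK
device_bytes_from_fs_blocks = device_fs_blocks * FS_BLOCK

# The resource groups plus the leading offset should cover the whole device.
rg_total_blocks = rg_offset + sum(rg_lengths)

# Space taken by the journals, and how many times over the device was written.
journal_total_bytes = journals * journal_bytes
overwrite_factor = transferred_bytes / device_bytes

print("device size (bytes):", device_bytes)
print("resource groups cover (FS blocks):", rg_total_blocks)
print("journals (bytes):", journal_total_bytes)
print("transferred / device size: %.1f" % overwrite_factor)
```

Under these assumptions the device is about 1.0 GB and the two journals account for 64 MiB, so the reported traffic is several times the size of the whole volume.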

Thanks,
Nick

>Hi Nick,

>Since you're building it from source anyway, I recommend recompiling mkfs.gfs2
>with gdb debugging enabled. To do that, change the Makefile so that CFLAGS has -g instead of -O2. Then make; make install.

>Then do the command again, and when it hangs, go into gdb and see where
>it's hung.  In other words, from another terminal session:

>cd /mkfs/source/directory/  e.g. /home/devel/cluster/gfs2/mkfs/
>ps ax | grep mkfs.gfs2 (to get the pid)
>gdb ./mkfs.gfs2 <pid>
>then do a "bt" to get a call stack.
>Post the results here or email them directly to me.

>The bt output should hopefully tell me what's going on.
>If mkfs.gfs2 is broken, open up bugzilla against me and I'll fix it.

>I've used mkfs.gfs2 many times and never had it hang.

>Regards,

>Bob Peterson
>Red Hat Cluster Suite
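
The attach-and-backtrace steps above can also be run non-interactively.  The following is only a sketch, not part of the original instructions: it uses gdb's standard -batch, -ex and -p options, and the helper names and the 30-second timeout are hypothetical.  The timeout is best effort; it may not help if gdb itself gets stuck while attaching.

```python
import subprocess


def build_gdb_command(pid):
    """Return an argv list that attaches gdb to pid, prints a backtrace, and exits."""
    return ["gdb", "-batch", "-ex", "bt", "-p", str(pid)]


def capture_backtrace(pid, timeout_s=30):
    """Run gdb against pid and return its output, or None if it timed out.

    timeout_s is a hypothetical default, not a recommended value.
    """
    try:
        result = subprocess.run(
            build_gdb_command(pid),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout


# Example (the pid comes from "ps ax | grep mkfs.gfs2"):
#   print(capture_backtrace(3349))
```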

Bob,
Thanks for the quick follow-up.  I gave this a shot, but here's the problem: the program seems to be hung on I/O, so it doesn't respond to any signals.  When I try to run gdb against the currently running PID, gdb hangs after the following line:
Attaching to program: mkfs.gfs2, process 3349

and does nothing (sometimes it doesn't even get that far).  I can't use <Ctrl-C> to kill the mkfs process, nor does it respond to a kill with a signal 9.  Also, during this mkfs.gfs2 process, the processor is being used quite heavily (80-100%) by the kernel [scsi_wq_1].  This tends to hang up things like login shells pretty badly, and the only way to get it back is to reboot the machine (in case it matters, this is inside a VMware virtual machine).  I'm going to give a couple of other things a try (like changing the elevator so that it's nicer to other processes, hopefully) and see if I can get gdb to work, but so far no such luck.
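
A process that ignores even signal 9 is typically in uninterruptible sleep, which Linux reports as state "D" in /proc/<pid>/stat.  The sketch below is one way to confirm that without attaching gdb.  The function names are hypothetical, and the sample line in the comment is made up for illustration, not taken from this system.

```python
def parse_state(stat_line):
    """Extract the one-letter process state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' rather than on spaces.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def read_state(pid):
    """Read the current state of pid from /proc (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_state(f.read())


def is_uninterruptible(state):
    """True if the state letter means uninterruptible (usually I/O) sleep."""
    return state == "D"


# Made-up sample line, for illustration only:
#   parse_state("3349 (mkfs.gfs2) D 3300 3349 3300")
# On the affected machine one would call read_state(3349) instead.
```

If the state is "D", the process is blocked inside the kernel, which would be consistent with gdb hanging on attach and with the heavy [scsi_wq_1] activity.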

Thanks!
Nick

