
Re: [Linux-cluster] Kernel panic: GFS: Assertion failed on line 550 of file rgrp.c



Hi Kevin,

Thx for your really fast reply :). The result of running gfs_fsck with several "v" flags is below:

[root rac3 root]# gfs_fsck -vvvvvvvvvy /dev/pool/oracle_u02
Initializing fsck
Initializing lists...
Initializing special inodes...
(file.c:45)     readi:  Offset (320) is >= the file size (320).
(super.c:211)   4 journals found.
(file.c:45)     readi:  Offset (45888) is >= the file size (45888).
(super.c:268)   478 resource groups found.
(util.c:112)    For 65773 Expected 1161970:2 - got 0:0
Buffer #65773 (1 of 5) is neither GFS_METATYPE_RB nor GFS_METATYPE_RG.
Resource group is corrupted.
Unable to read in rgrp descriptor.
Unable to fill in resource group information.
(initialize.c:364)      <backtrace> - init_sbp()
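
The "Expected 1161970:2 - got 0:0" line reads as if gfs_fsck expected a metadata header with magic 0x01161970 and type 2 (a resource group header) at block 65773 but found zeros instead. A minimal sketch for looking at that header by hand is below. It assumes a 4096-byte file system block size and a big-endian header that starts with a 32-bit magic followed by a 32-bit type; read_meta_header and describe are hypothetical helper names, not part of any GFS tool.

```python
import struct

# Assumed values, taken from the "Expected 1161970:2" message above.
GFS_MAGIC = 0x01161970       # printed in hex as "1161970"
GFS_METATYPE_RG = 2          # resource group header type

def read_meta_header(path, block_no, block_size=4096):
    """Return (magic, type) read from the first 8 bytes of a block.

    block_size=4096 is an assumption; use the block size the
    file system was actually created with.
    """
    with open(path, "rb") as dev:
        dev.seek(block_no * block_size)
        raw = dev.read(8)
    if len(raw) < 8:
        raise ValueError("short read at block %d" % block_no)
    # Two unsigned 32-bit integers, big-endian (assumed on-disk layout).
    return struct.unpack(">II", raw)

def describe(magic, mtype):
    """Format a header the same way as the 'Expected X:Y' message."""
    return "%X:%X" % (magic, mtype)
```

Running read_meta_header on both the original and the cloned oracle_u02 with block_no=65773 would show whether the zeroed header exists only on the clone or on the source as well.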

>It would be interesting to see if the partitions are identical after the
>snapshot.  How large are the LUNs?  Can you do a comparison of the
>volumes?  I would do those steps first before the fsck.  It is possible
>you have a problem with the oracle_u02, so it would be interesting to run
>gfs_fsck if the snapped LUNs are identical.

We use the SnapView feature of our EMC CX500 SAN, so those two LUNs _should_ be identical. In fact, we have cloned other GFS LUNs many times in the past without any problem. If gfs_fsck cannot help, tomorrow we'll drop the destination LUN and try the clone again.
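
For the volume comparison itself, a minimal sketch is below. It reads two devices (or files) chunk by chunk and reports the byte offset of the first difference. The function name first_difference and the 1 MiB chunk size are illustrative assumptions, and the device paths would have to be filled in for the source and destination LUNs.

```python
def first_difference(path_a, path_b, chunk_size=1024 * 1024):
    """Return the byte offset of the first difference between two
    files or block devices, or None if they are identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the sizes differ.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                # Both inputs ended at the same point with no difference.
                return None
            offset += len(chunk_a)
```

Dividing a returned offset by the file system block size gives the block number, which can then be checked against the block 65773 that gfs_fsck complains about.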

Regards,

Thai Duong.

On 2/10/06, Kevin Anderson <kanderso redhat com> wrote:
On Fri, 2006-02-10 at 22:59 +0700, Thai Duong wrote:
> Hi Kevin,
>
> I did unmount oracle_u02 before cloning but still no luck.
> When I tried to run gfs_fsck against oracle_u02 on the backup
> cluster's node, it reported something like below:
>
> [root rac3 root]# gfs_fsck -y /dev/pool/oracle_u02
> Initializing fsck
> Buffer #65773 (1 of 5) is neither GFS_METATYPE_RB nor GFS_METATYPE_RG.

Add some -vvvvvvv flags to the gfs_fsck command line. Each "v" adds
another layer of messages.  The DEBUG messages are at layer 7. This
should print out more information about the resource group that it is
failing to read.

> Resource group is corrupted.
> Unable to read in rgrp descriptor.
> Unable to fill in resource group information.
>
> It seems that oracle_u02 somehow got broken. Running gfs_fsck against
> oracle_u01 works like a charm. Do I need to run gfs_fsck against the
> original oracle_u02? Please advise.

It would be interesting to see if the partitions are identical after the
snapshot.  How large are the LUNs?  Can you do a comparison of the
volumes?  I would do those steps first before the fsck.  It is possible
you have a problem with the oracle_u02, so it would be interesting to run
gfs_fsck if the snapped LUNs are identical.

Kevin




