
Re: ext3 file system becoming read only



Hi Swapana,
An update is always a good idea. On RHEL, updates usually go smoothly, but have you checked your FC switch for errors on each port? You could also check your SAN controllers, or run some diagnostics to be sure it's not a problem on your SAN. If your active controller reboots suddenly, it can cause I/O errors that corrupt your journal.
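
On the host side, a quick way to see whether storage-path errors show up before the journal abort is to scan the kernel log. This is only a rough sketch: the default log path and the list of patterns are assumptions and certainly not exhaustive, and the function name storage_errors is made up for illustration.

```shell
#!/bin/sh
# Sketch: list kernel log lines that hint at storage-path trouble.
# The default path and the pattern list are assumptions; adjust to your setup.
storage_errors() {
    # $1: log file to scan (defaults to /var/log/messages)
    # -n prefixes each hit with its line number so the ordering is visible.
    grep -n -i -E 'qla2|SCSI error|I/O error|Aborting journal|EXT3-fs error' \
        "${1:-/var/log/messages}"
}
```

Running `storage_errors /var/log/messages | less` and looking at what precedes the first "Aborting journal" line may show whether the HBA or SCSI layer complained first.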

regards,
Jordi



Swapana Ghosh wrote:
Hi,

As I explained in my first posting, the 'read-only' issue is not limited to one
server. It is happening on a few servers, which are generally 'oracle' database
servers. Very recently it happened to an 'oracle' application server. As a
temporary measure, we are re-mounting the file system and also running fsck.
While searching the Red Hat knowledge base, I found the following URL; the
problem described there is similar to ours:
https://bugzilla.redhat.com/show_bug.cgi?id=213921

It says that this is a kernel bug.

We are not sure whether we should move to a higher kernel version or not;
please advise.

Thanks


--- tweeks <tweeks rackspace com> wrote:

The EL4 kernel is wacky when it comes to the I/O scheduler locking up and
causing ext3 to remount RO. Various hardware hiccups can cause it to go RO.
And when it does, you need to tread lightly or you could lose everything.

If your ext3 filesystem had problems and remounted read-only, I would
strongly advise /against/ simply fscking it. Oftentimes when your filesystem has
gone RO, it may have been that way for 30 minutes or more. Just rebooting or
fscking is a great way to lose everything (i.e. everything being dumped into /lost+found/).

Instead, I would recommend:
1) rebooting into a rescue CD environment (not allowing the rescue
environment to mount or fsck your filesystems).
2) Nuke the ext3 journal:
	tune2fs -O ^has_journal /dev/<rootfs>
 (possibly doing the same for other problem partitions)
3) Do a fake fsck to see the extent of damage:
	fsck -fn /dev/<rootfs>
  (after checking things out.. use "-fy" once you're sure that it's safe)
4) Rebuild the journal with "tune2fs -j /dev/<rootfs>"
  (rerun at least once until "clean" result is repeatable)
5) Mount and check things out, "mkdir /mnt/tmp && mount -t ext3 /dev/<rootfs> /mnt/tmp"
6) Gracefully umount & reboot:
	"umount /mnt/tmp  && shutdown -rf now && exit"
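
Steps 2 to 4 above could be wrapped roughly like this. It is a sketch, not a tested recovery tool: the function names run and rebuild_journal and the DRY_RUN variable are made up for illustration, and the script assumes it is run from the rescue environment with the filesystem unmounted. By default it only prints what it would do, and it deliberately never runs "fsck -fy" for you.

```shell
#!/bin/sh
# Sketch of steps 2-4. DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rebuild_journal() {
    dev="${1:-}"
    [ -n "$dev" ] || { echo "usage: rebuild_journal /dev/<device>" >&2; return 2; }

    # Step 2: drop the journal (the filesystem is treated as ext2 afterwards).
    run tune2fs -O '^has_journal' "$dev" || return 1

    # Step 3: read-only check; -n answers "no" to everything, so nothing is changed.
    # fsck exits non-zero when it finds problems, so stop and let a human decide.
    if ! run fsck -fn "$dev"; then
        echo "fsck reported problems; review them before running fsck -fy" >&2
        return 1
    fi

    # Step 4: recreate the journal (back to ext3).
    run tune2fs -j "$dev"
}
```

Call it as `rebuild_journal /dev/<rootfs>` to preview the commands, and set DRY_RUN=0 only once the preview looks right.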

Tweeks

On Tuesday 25 September 2007 11:47, Swapana Ghosh wrote:
Hi Jordi,

Thanks for your reply.  I will test the way you suggested.

Thanks
-swapna

--- Jordi Prats <jprats cesca es> wrote:
Hi,
It seems like what happened to me. I did this to solve the issue:

Mark the filesystem as not having a journal (turn it into ext2):

tune2fs -O ^has_journal /dev/cciss/c0d0p2

fsck it to delete the journal:

e2fsck /dev/cciss/c0d0p2

Create the journal (take it back to ext3)

tune2fs -j /dev/cciss/c0d0p2

and finally, remount it.

In my case it was a local disk, but with your SAN disk it should be
the same.
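
To confirm at each stage whether the journal is really gone or back, one option is to look at the "Filesystem features:" line that "tune2fs -l" prints. A small sketch (the function name has_journal is just for illustration):

```shell
#!/bin/sh
# Sketch: reads `tune2fs -l <device>` output on stdin and succeeds (exit 0)
# only if the has_journal feature flag is listed.
has_journal() {
    grep '^Filesystem features:' | grep -q -w 'has_journal'
}
```

For example, `tune2fs -l /dev/cciss/c0d0p2 | has_journal && echo "journal present"` should print nothing after the first tune2fs step and "journal present" after the last one.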

Jordi

Swapana Ghosh wrote:
Hi

In our office environment, on a few servers (mostly database servers, and
yesterday, for the first time, one application server), the partition is
becoming "read only".

While checking the archives, I found what may be similar issues in the
2007-July archives. It would be really helpful if someone could describe
how they were solved.

In our case, just as the problem started, we found the following line in the
log file:

    EXT3-fs error (device dm-12): ext3_find_entry: reading directory #2015496 offset 2

Then one blank line, then these lines:

    Aborting journal on device dm-12.
    ext3_abort called

    EXT3-fs error (device dm-12): ext3_journal_start_sb: Detected aborted journal
    Remounting filesystem read-only

Then the following line repeats continuously:

    EXT3-fs error (device dm-12) in start_transaction: Journal has aborted



The above message repeats continuously until we remount the filesystem and the
partition becomes 'read-write' again.
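
To spot affected filesystems without waiting for an application to fail, the kernel's mount table can be checked for ext3 mounts that carry the "ro" option. A minimal sketch, assuming the usual /proc/mounts field order (device, mount point, type, options); the function name ro_ext3_mounts is made up:

```shell
#!/bin/sh
# Sketch: print mount points of ext3 filesystems currently mounted read-only.
# $1: mounts table to read (defaults to /proc/mounts, the kernel's own view).
ro_ext3_mounts() {
    awk '$3 == "ext3" && $4 ~ /(^|,)ro(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}
```

Something like `ro_ext3_mounts` run from cron could raise an alert as soon as a mount flips to read-only. /etc/mtab is not a good source for this, since it is not necessarily updated when the kernel remounts a filesystem on its own.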

We could not figure out the root cause of the problem.

We are using individual EMC LUNs, which are configured into LVM volume groups;
the filesystems are then mounted on logical volumes.

Here is the server description:
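
Since the log only names the device as dm-12, it may help to map that back to a logical volume. The number in dm-N is the device-mapper minor number, and "dmsetup ls" lists each mapped name with its major and minor. A sketch that picks the name for a given minor; the function name dm_name_for_minor is made up, and both the "(253, 12)" and "(253:12)" output styles are handled because the format differs between dmsetup versions:

```shell
#!/bin/sh
# Sketch: read `dmsetup ls` output on stdin and print the device-mapper name
# whose minor number equals $1.
dm_name_for_minor() {
    awk -v want="$1" '{
        minor = $NF
        sub(/.*[:,]/, "", minor)      # drop everything up to the last ":" or ","
        gsub(/[^0-9]/, "", minor)     # drop the closing parenthesis and spaces
        if ((minor + 0) == (want + 0)) print $1
    }'
}
```

For the log above, `dmsetup ls | dm_name_for_minor 12` should print the volume-group and logical-volume name behind dm-12.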

____________________________________________________________

[root server ~]# lsmod |grep -i qla
qla2300               130304  0
qla2xxx_conf          305924  0
qla2xxx               307448  21 qla2300
scsi_mod              117709  5 sg,emcp,qla2xxx,cciss,sd_mod

____________________________________________________________
[root server ~]# cat /etc/modprobe.conf
alias eth0 tg3
alias eth1 tg3
alias eth2 e1000
alias eth3 e1000
alias eth4 e1000
alias eth5 e1000
alias bond0 bonding
alias scsi_hostadapter cciss
options bond0 max_bonds=2 miimon=100 mode=1
alias scsi_hostadapter1 qla2xxx
alias scsi_hostadapter2 qla2xxx_conf
#alias scsi_hostadapter3 qla6312
options qla2xxx  ql2xmaxqdepth=16 qlport_down_retry=64 ql2xloginretrycount=30 ql2xfailover=0 ql2xlbType=0
install qla2xxx /sbin/modprobe qla2xxx_conf; /sbin/modprobe --ignore-install qla2xxx
remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }
###BEGINPP
include /etc/modprobe.conf.pp
###ENDPP
###BEGINPP
include /etc/modprobe.conf.pp
###ENDPP
###BEGINPP
include /etc/modprobe.conf.pp
###ENDPP

________________________________________________
[root server ~]# rpm -qa |grep -i EMC
EMCpower.LINUX-4.5.1-022

________________________________________________
[root server ~]# rpm -qa|grep -i scli
scli-1.06.16-57

________________________________________________
[root server ~]# rpm -qa|grep -i nav
naviagentcli-6.19.1.3.0-1

________________________________________________
 product: QLA2312 Fibre Channel Adapter

________________________________________________
[root server ~]# rpm -qa|grep -i lvm
lvm2-2.02.06-6.0.RHEL4
system-config-lvm-1.0.19-1.0

________________________________________________

If I missed any info, please let me know.

I would really appreciate some hints to solve these issues.

Thanks in advance
-swapana
=== message truncated ===



_______________________________________________
Ext3-users mailing list
Ext3-users redhat com
https://www.redhat.com/mailman/listinfo/ext3-users




--
......................................................................
        __
       / /          Jordi Prats
 C E / S / C A      Dept. de Sistemes
     /_/            Centre de Supercomputació de Catalunya

 Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
 T. 93 205 6464 · F.  93 205 6979 · jprats cesca es
......................................................................