
Re: ext3 file system becoming read only

Hi Swapana,
An update is always a good idea, and on RHEL updates usually go smoothly. But have you checked your FC switch for errors on each port? You could also check your SAN controllers, or run some diagnostics, to be sure the problem is not on your SAN. If your active controller reboots suddenly, it can cause I/O errors that corrupt your journal.


Swapana Ghosh wrote:

As I explained in my first posting, the 'read-only' issue is not limited to one
server. It is happening on a few servers, which are generally 'oracle' database
oriented. Very recently it happened to an 'oracle' application server. As a
temporary measure, we are remounting the file system and also running fsck. While
searching the Red Hat knowledge base, we found the following URL; the problem it
describes is similar to our issue.

It says that this is a kernel bug.

We are not sure whether we should move to a newer kernel version or not;
please advise.


--- tweeks <tweeks rackspace com> wrote:

The EL4 kernel is wacky when it comes to the I/O scheduler locking up and
causing ext3 to remount RO. Various hardware hiccups can cause it to go RO.
And when it does.. you need to tread lightly or you could lose everything.
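
A quick way to see whether any ext3 filesystem has already gone RO is to read the mount table. This is only a minimal sketch: the function name list_ro_ext3 is made up for illustration, and it assumes the kernel exposes the usual "device mountpoint fstype options ..." layout in /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print "device mountpoint" for
# every ext3 filesystem whose mount options contain the bare "ro" flag.
# $1: mount table to read (defaults to /proc/mounts).
list_ro_ext3() {
    # Field 3 is the filesystem type, field 4 the comma-separated options.
    # The pattern matches "ro" only as a whole option, so an option such as
    # errors=remount-ro on a read-write mount is not reported.
    awk '$3 == "ext3" && $4 ~ /(^|,)ro(,|$)/ { print $1, $2 }' "${1:-/proc/mounts}"
}
```

Running list_ro_ext3 with no argument checks the live system; no output means no ext3 filesystem is currently mounted read-only.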

If your ext3 filesystem had problems and remounted read-only, I would
strongly advise /against/ simply fscking it. Oftentimes when your filesystem has gone RO, it may have been that way for 30 minutes or more. Just rebooting or
fscking is a great way to lose everything (i.e. everything being dumped into /lost+found/).

Instead, I would recommend:
1) rebooting into a rescue CD environment (not allowing the rescue
environment to mount or fsck your filesystems).
2) Nuke the ext3 journal:
	tune2fs -O ^has_journal /dev/<rootfs>
 (possibly doing the same for other problem partitions)
3) Do a fake fsck to see the extent of damage:
	fsck -fn /dev/<rootfs>
  (after checking things out.. use "-fy" once you're sure that it's safe)
4) Rebuild the journal with "tune2fs -j /dev/<rootfs>"
  (rerun at least once until "clean" result is repeatable)
5) Mount and check things out, "mkdir /mnt/tmp && mount -t ext3 /dev/<rootfs> /mnt/tmp"
6) Gracefully umount & reboot:
	"umount /mnt/tmp  && shutdown -rf now && exit"
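
Step 1 above depends on the rescue environment not having mounted the filesystem. A minimal sketch of a check to run before step 2 follows; the function name is_mounted is hypothetical, and it only compares the exact device path in the first column of /proc/mounts, so a device mounted under a different name (for example an LVM alias) would not be caught.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): succeed if the given device
# path appears as a mount source in the mount table.
# $1: device path; $2: mount table to read (defaults to /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example use, with $dev standing in for /dev/<rootfs>:
#   if is_mounted "$dev"; then
#       echo "$dev is still mounted; do not touch the journal yet" >&2
#   fi
```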


On Tuesday 25 September 2007 11:47, Swapana Ghosh wrote:
Hi Jordi,

Thanks for your reply.  I will test the way you suggested.


--- Jordi Prats <jprats cesca es> wrote:
This looks like what happened to me. I did the following to solve the issue:

Mark the filesystem as not having a journal (this takes it to ext2):

tune2fs -O ^has_journal /dev/cciss/c0d0p2

fsck it to delete the journal:

e2fsck /dev/cciss/c0d0p2

Create the journal (take it back to ext3)

tune2fs -j /dev/cciss/c0d0p2

and finally, remount it.

In my case it was a local disk, but it should be the same with your SAN
disk.
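
To confirm that the journal was really removed and later re-created, the feature list printed by "tune2fs -l" can be inspected after each step. The sketch below is only an illustration: has_journal is a made-up function name, and it assumes the "Filesystem features:" line that tune2fs -l prints.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `tune2fs -l` output on
# stdin and succeed if has_journal is among the listed features.
has_journal() {
    # Fields 1-2 are "Filesystem features:", the features start at field 3.
    awk '/^Filesystem features:/ {
             for (i = 3; i <= NF; i++) if ($i == "has_journal") found = 1
         }
         END { exit !found }'
}

# Example use, with the device from the message above:
#   tune2fs -l /dev/cciss/c0d0p2 | has_journal && echo "journal present"
```

After the first tune2fs step the check should fail (ext2), and after "tune2fs -j" it should succeed again (ext3).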


Swapana Ghosh wrote:

In our office environment few servers mostly  database servers and
yesterday it

for one application server (first time) the partition is getting "read

I was checking the archives and found what may be a similar kind of issue in the
2007-July archives.
But how it has been solved if someone describes me that will be really

In our case, just as the problem started, we found this line in the log file:
     EXT3-fs error (device dm-12): ext3_find_entry: reading directory

offset 2

Then one blank line
Then the line is

    Aborting journal on device dm-12.
    ext3_abort called

    EXT3-fs error (device dm-12): ext3_journal_start_sb: Detected aborted journal
    Remounting filesystem read-only

Then the continuous line as follows:

    EXT3-fs error (device dm-12) in start_transaction: Journal has

The above message is continuous  until we remount the filesystem and


We could not figure out what the root cause of the problem is.
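
Since several servers are affected, it may help to pull the affected devices out of each log first. The following is a rough sketch only: aborted_journal_devices is a made-up name, /var/log/messages is an assumed default log location, and the pattern is based on the "Aborting journal on device dm-12." line quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print each device named in an
# "Aborting journal on device X." kernel message, once per device.
# $1: log file to scan (defaults to /var/log/messages, an assumption).
aborted_journal_devices() {
    sed -n 's/.*Aborting journal on device \([^ .]*\)\..*/\1/p' \
        "${1:-/var/log/messages}" | sort -u
}
```

A dm-N name can then be matched to a logical volume by comparing N with the minor numbers that "dmsetup ls" reports.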

We are using individual EMC luns and are configured with LVM volume

then mounted on logical

Here is the server description:


[root server ~]# lsmod |grep -i qla
qla2300               130304  0
qla2xxx_conf          305924  0
qla2xxx               307448  21 qla2300
scsi_mod              117709  5 sg,emcp,qla2xxx,cciss,sd_mod

[root server ~]# cat /etc/modprobe.conf
alias eth0 tg3
alias eth1 tg3
alias eth2 e1000
alias eth3 e1000
alias eth4 e1000
alias eth5 e1000
alias bond0 bonding
alias scsi_hostadapter cciss
options bond0 max_bonds=2 miimon=100 mode=1
alias scsi_hostadapter1 qla2xxx
alias scsi_hostadapter2 qla2xxx_conf
#alias scsi_hostadapter3 qla6312
options qla2xxx  ql2xmaxqdepth=16 qlport_down_retry=64 ql2xloginretrycount=30 ql2xfailover=0 ql2xlbType=0
install qla2xxx /sbin/modprobe qla2xxx_conf; /sbin/modprobe --ignore-install qla2xxx
remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }
include /etc/modprobe.conf.pp
include /etc/modprobe.conf.pp
include /etc/modprobe.conf.pp

[root server ~]# rpm -qa |grep -i EMC

[root server ~]# rpm -qa|grep -i scli

[root server ~]# rpm -qa|grep -i nav

 product: QLA2312 Fibre Channel Adapter

[root server ~]# rpm -qa|grep -i lvm


If I missed any info, please let me know.

It would be really appreciated if I get some hints to solve the issues

Thanks in advance

=== message truncated ===

Ext3-users mailing list
Ext3-users redhat com

       / /          Jordi Prats
 C E / S / C A      Dept. de Sistemes
     /_/            Centre de Supercomputació de Catalunya

 Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
 T. 93 205 6464 · F.  93 205 6979 · jprats cesca es