ext3 file system becoming read only
Jordi Prats
jprats at cesca.es
Fri Sep 28 06:25:13 UTC 2007
Hi Swapana,
An update is always a good idea. On RHEL, updates usually go smoothly, but
have you checked your FC switch for errors on each port? You could
also check your SAN controllers, or run some diagnostics to be sure it's
not a problem on your SAN. If your active controller reboots suddenly, it
can cause I/O errors that corrupt your journal.
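
To see whether a given server has hit the same sequence, something like the following could pull the relevant lines out of a kernel log. It is only a minimal sketch: the function name ext3_abort_lines is made up for this example, and the patterns are simply the messages quoted further down in this thread, so adjust them to whatever your logs actually show.

```shell
# Print log lines that match the ext3 abort sequence quoted in this thread.
# Reads the file(s) given as arguments, or stdin when none are given.
# ext3_abort_lines is a hypothetical helper name, not a standard tool.
ext3_abort_lines() {
    # "Remounting filesys" is left short so it matches the kernel's
    # "Remounting filesystem read-only" as well as misspelled copies.
    grep -i -E 'EXT3-fs error|Aborting journal|ext3_abort called|Remounting filesys' "$@"
}
```

For example, "dmesg | ext3_abort_lines", or "ext3_abort_lines /var/log/messages" if your syslog writes kernel messages there (that path is an assumption about your setup).
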
regards,
Jordi
Swapana Ghosh wrote:
> Hi,
>
> As I explained in my first posting, the 'read-only' issue is not limited to one
> server; it is happening on a few servers, which are mostly 'oracle' database
> servers. Very recently it happened to an 'oracle' application server. As a
> temporary measure, we are re-mounting the file system and also running fsck.
> While searching the Red Hat knowledge base, I found the following URL; the problem
> described there is similar to ours:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=213921
>
> It says that this is a kernel bug.
>
> We are not sure whether we should move to a newer kernel version or not;
> please advise.
>
> Thanks
>
>
> --- tweeks <tweeks at rackspace.com> wrote:
>
>
>> The EL4 kernel is wacky when it comes to the I/O scheduler locking up and
>> causing ext3 to remount RO. Various hardware hiccups can cause it to go RO.
>>
>> And when it does.. you need to tread lightly or you could lose everything.
>>
>> If your ext3 filesystem had problems and remounted read-only, I would strongly
>> advise /against/ simply fscking it. Often times when your filesystem has
>> gone RO, it may have been that way for 30 minutes or more. Just rebooting or
>> fscking is a great way to lose everything (i.e. everything being dumped
>> into /lost+found/).
>>
>> Instead, I would recommend:
>> 1) rebooting into a rescue CD environment (not allowing the rescue environment
>> to mount or fsck your filesystems).
>> 2) Nuke the ext3 journal:
>> tune2fs -O ^has_journal /dev/<rootfs>
>> (possibly doing the same for other problem partitions)
>> 3) Do a fake fsck to see the extent of damage:
>> fsck -fn /dev/<rootfs>
>> (after checking things out.. use "-fy" once you're sure that it's safe)
>> 4) Rebuild the journal w/ "tune2fs -j /dev/<rootfs>"
>> (rerun at least once until "clean" result is repeatable)
>> 5) Mount and check things out,
>> "mkdir /mnt/tmp && mount -t ext3 /dev/<rootfs> /mnt/tmp"
>> 6) Gracefully umount & reboot:
>> "umount /mnt/tmp && shutdown -rf now && exit"
>>
>> Tweeks
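
The steps above can be collected into a small dry-run helper. This is only a sketch: print_recovery_plan is a hypothetical name, and the function does nothing except print the commands from the list for one device, so they can be reviewed and then typed by hand from the rescue environment. It deliberately leaves out the reboot.

```shell
# Dry run only: print the recovery commands from the list above for one device.
# Nothing is executed. print_recovery_plan is a hypothetical helper name.
print_recovery_plan() {
    dev=$1
    if [ -z "$dev" ]; then
        echo "usage: print_recovery_plan <device>" >&2
        return 1
    fi
    # One command per line, in the order given in the list.
    cat <<EOF
tune2fs -O ^has_journal $dev
fsck -fn $dev
fsck -fy $dev   # only after reviewing the -fn output
tune2fs -j $dev
mkdir /mnt/tmp && mount -t ext3 $dev /mnt/tmp
umount /mnt/tmp
EOF
}
```

Usage would be along the lines of "print_recovery_plan /dev/sdX1", where /dev/sdX1 is a placeholder for the affected partition.
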
>>
>> On Tuesday 25 September 2007 11:47, Swapana Ghosh wrote:
>>
>>> Hi Jordi,
>>>
>>> Thanks for your reply. I will test the way you suggested.
>>>
>>> Thanks
>>> -swapna
>>>
>>> --- Jordi Prats <jprats at cesca.es> wrote:
>>>
>>>> Hi,
>>>> It looks like what happened to me. I did this to solve the issue:
>>>>
>>>> Mark the filesystem as having no journal (take it to ext2)
>>>>
>>>> tune2fs -O ^has_journal /dev/cciss/c0d0p2
>>>>
>>>> fsck it to delete the journal:
>>>>
>>>> e2fsck /dev/cciss/c0d0p2
>>>>
>>>> Create the journal (take it back to ext3)
>>>>
>>>> tune2fs -j /dev/cciss/c0d0p2
>>>>
>>>> and finally, remount it.
>>>>
>>>> In my case it was a local disk, but with your SAN disk it should be
>>>> the same.
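
Before and after each tune2fs step it may help to confirm whether the journal feature is actually set. A rough sketch, assuming the usual "Filesystem features:" line in the output of tune2fs -l; journal_state is a made-up helper name:

```shell
# Read `tune2fs -l <device>` output on stdin and report whether the
# has_journal feature is listed (ext3) or not (ext2).
# journal_state is a hypothetical helper name.
journal_state() {
    # Match has_journal as a whole word on the features line only.
    if grep -q -E '^Filesystem features:.*[[:space:]]has_journal([[:space:]]|$)'; then
        echo "journal present"
    else
        echo "no journal"
    fi
}
```

With the device from the commands above this would be run as "tune2fs -l /dev/cciss/c0d0p2 | journal_state".
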
>>>>
>>>> Jordi
>>>>
>>>> Swapana Ghosh wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> In our office environment, on a few servers (mostly database servers, and
>>>>> yesterday, for the first time, on one application server), the partition is
>>>>> becoming "read only".
>>>>>
>>>>> I was checking the archives and found what may be similar issues in the
>>>>> 2007-July archives.
>>>>> If someone could describe how it was solved, that would be really helpful.
>>>>
>>>>
>>>>> In our case, just as the problem started, we found the following line in the
>>>>> log file:
>>>>>
>>>>> EXT3-fs error (device dm-12): ext3_find_entry: reading directory #2015496 offset 2
>>>>>
>>>>> Then one blank line, and then the lines:
>>>>>
>>>>> Aborting journal on device dm-12.
>>>>> ext3_abort called
>>>>>
>>>>> EXT3-fs error (device dm-12): ext3_journal_start_sb: Detected aborted journal
>>>>> Remounting filesystem read-only
>>>>>
>>>>> Then the following line repeats continuously:
>>>>>
>>>>> EXT3-fs error (device dm-12) in start_transaction: Journal has aborted
>>>>>
>>>>>
>>>>>
>>>>> The above message repeats until we remount the filesystem and the partition
>>>>> becomes 'read-write' again.
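
A quick way to check which ext3 filesystems are currently read-only is to look at the options column of /proc/mounts. The helper below is a sketch (ro_ext3_mounts is a hypothetical name) that reads lines in that format from stdin and prints the mount points whose options include ro:

```shell
# Read /proc/mounts-style lines on stdin (device, mount point, type, options, ...)
# and print the mount point of every ext3 filesystem mounted with the "ro" option.
# ro_ext3_mounts is a hypothetical helper name.
ro_ext3_mounts() {
    awk '$3 == "ext3" {
        # Split the comma-separated option list and look for an exact "ro".
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }'
}
```

On a live system it would be called as "ro_ext3_mounts < /proc/mounts".
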
>>>>>
>>>>> We could not figure out the root cause of the problem.
>>>>>
>>>>> We are using individual EMC LUNs, configured into LVM volume groups
>>>>> and then mounted as logical volumes.
>>>>>
>>>>> Here is the server description:
>>>>>
>>>>> ____________________________________________________________
>>>>>
>>>>> [root at server ~]# lsmod |grep -i qla
>>>>> qla2300 130304 0
>>>>> qla2xxx_conf 305924 0
>>>>> qla2xxx 307448 21 qla2300
>>>>> scsi_mod 117709 5 sg,emcp,qla2xxx,cciss,sd_mod
>>>>>
>>>>> ____________________________________________________________
>>>>> [root at server ~]# cat /etc/modprobe.conf
>>>>> alias eth0 tg3
>>>>> alias eth1 tg3
>>>>> alias eth2 e1000
>>>>> alias eth3 e1000
>>>>> alias eth4 e1000
>>>>> alias eth5 e1000
>>>>> alias bond0 bonding
>>>>> alias scsi_hostadapter cciss
>>>>> options bond0 max_bonds=2 miimon=100 mode=1
>>>>> alias scsi_hostadapter1 qla2xxx
>>>>> alias scsi_hostadapter2 qla2xxx_conf
>>>>> #alias scsi_hostadapter3 qla6312
>>>>> options qla2xxx ql2xmaxqdepth=16 qlport_down_retry=64 ql2xloginretrycount=30 ql2xfailover=0 ql2xlbType=0
>>>>> install qla2xxx /sbin/modprobe qla2xxx_conf; /sbin/modprobe --ignore-install qla2xxx
>>>>> remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }
>>>>> ###BEGINPP
>>>>> include /etc/modprobe.conf.pp
>>>>> ###ENDPP
>>>>> ###BEGINPP
>>>>> include /etc/modprobe.conf.pp
>>>>> ###ENDPP
>>>>> ###BEGINPP
>>>>> include /etc/modprobe.conf.pp
>>>>> ###ENDPP
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa |grep -i EMC
>>>>> EMCpower.LINUX-4.5.1-022
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa|grep -i scli
>>>>> scli-1.06.16-57
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa|grep -i nav
>>>>> naviagentcli-6.19.1.3.0-1
>>>>>
>>>>> ________________________________________________
>>>>> product: QLA2312 Fibre Channel Adapter
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa|grep -i lvm
>>>>> lvm2-2.02.06-6.0.RHEL4
>>>>> system-config-lvm-1.0.19-1.0
>>>>>
>>>>> ________________________________________________
>>>>>
>>>>> If I missed any info, please let me know.
>>>>>
>>>>> I would really appreciate any hints on how to solve this issue.
>>>>>
>>>>> Thanks in advance
>>>>> -swapana
>>>>>
> === message truncated ===
>
>
>
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users
>
>
>
--
......................................................................
__
/ / Jordi Prats
C E / S / C A Dept. de Sistemes
/_/ Centre de Supercomputació de Catalunya
Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
T. 93 205 6464 · F. 93 205 6979 · jprats at cesca.es
......................................................................