ext3 file system becoming read only

Jordi Prats jprats at cesca.es
Fri Sep 28 06:25:13 UTC 2007


Hi Swapana,
An update is always a good idea. On RHEL, updates usually go smoothly, but 
have you checked your FC switch for errors on each port? You could 
also check your SAN controllers, or run some diagnostics to be sure it's 
not a problem on your SAN. If your active controller reboots suddenly, it 
can cause I/O errors that corrupt your journal.
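
On the host side, the kernel log shows which devices aborted their journal
and how often, which may help to correlate the failures with SAN events.
A minimal sketch, assuming a syslog-style file such as /var/log/messages
(the path and the function name journal_aborts are assumptions, not from
this thread); it matches the "Aborting journal on device" line quoted below:

```shell
#!/bin/sh
# Hypothetical helper: list devices whose ext3 journal was aborted,
# with a count per device, from a syslog-style file.
# The match string is the "Aborting journal on device ..." message
# quoted later in this thread.
journal_aborts() {
    logfile="$1"
    grep 'Aborting journal on device' "$logfile" \
        | sed 's/.*Aborting journal on device \([^ .]*\).*/\1/' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example (the path is an assumption; adjust it for your syslog setup):
# journal_aborts /var/log/messages
```

Each output line is a device name followed by the number of aborts found
for it. The timestamps of the matching lines can then be compared with the
switch and controller logs.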

regards,
Jordi



Swapana Ghosh wrote:
> Hi,
>
> As I explained in my first posting, the 'read-only' issue is not limited to
> one server; it is happening on a few servers, which are mostly Oracle database
> servers. Very recently it happened on an Oracle application server. As a
> temporary measure, we are remounting the file system and running fsck.
> While searching the Red Hat knowledge base, I found the following URL; the
> problem described there is similar to ours:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=213921
>
> It says that this is a kernel bug.
>
> We are not sure whether we should move to a newer kernel version or not;
> please advise.
>
> Thanks
>
>
> --- tweeks <tweeks at rackspace.com> wrote:
>
>   
>> The EL4 kernel is wacky when it comes to the I/O scheduler locking up and
>> causing ext3 to remount RO.  Various hardware hiccups can cause it to go RO.
>>
>> And when it does.. you need to tread lightly or you could lose everything.
>>
>> If your ext3 filesystem had problems and remounted read-only, I would
>> strongly advise /against/ simply fscking it.  Oftentimes when your filesystem
>> has gone RO, it may have been that way for 30 minutes or more.  Just rebooting
>> or fscking is a great way to lose everything (i.e. everything being dumped
>> into /lost+found/).
>>
>> Instead, I would recommend:
>> 1) Reboot into a rescue CD environment (not allowing the rescue
>>    environment to mount or fsck your filesystems).
>> 2) Nuke the ext3 journal:
>> 	tune2fs -O ^has_journal /dev/<rootfs>
>>  (possibly doing the same for other problem partitions)
>> 3) Do a fake fsck to see the extent of damage:
>> 	fsck -fn /dev/<rootfs>
>>   (after checking things out.. use "-fy" once you're sure that it's safe)
>> 4) Rebuild the journal with:
>> 	tune2fs -j /dev/<rootfs>
>>   (rerun at least once until "clean" result is repeatable)
>> 5) Mount and check things out, 
>> 	"mkdir /mnt/tmp && mount -t ext3 /dev/<rootfs> /mnt/tmp"
>> 6) Gracefully umount & reboot:
>> 	"umount /mnt/tmp  && shutdown -rf now && exit"
>>
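
Steps 2 to 4 above can be written out as a dry-run helper that only prints
the commands for a given device, so the sequence can be reviewed before
anything touches the disk. This is a minimal sketch: the function name
print_recovery_plan is hypothetical, and nothing here is executed against
the filesystem.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the journal-rebuild commands from the
# steps above for one device. It does NOT run them. The printed commands are
# meant to be run by hand from a rescue environment, with the filesystem
# unmounted.
print_recovery_plan() {
    dev="${1:-}"
    if [ -z "$dev" ]; then
        echo "usage: print_recovery_plan /dev/<device>" >&2
        return 1
    fi
    printf '%s\n' \
        "# step 2: remove the ext3 journal" \
        "tune2fs -O ^has_journal $dev" \
        "# step 3: read-only check first (-n answers no to every question)" \
        "fsck -fn $dev" \
        "# step 3, only once the -fn output looks safe" \
        "fsck -fy $dev" \
        "# step 4: recreate the journal" \
        "tune2fs -j $dev"
}

# Example, using the device name from the earlier mail in this thread:
# print_recovery_plan /dev/cciss/c0d0p2
```

Printing rather than executing keeps the decision between "-fn" and "-fy"
with the person looking at the fsck output.
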
>> Tweeks
>>
>> On Tuesday 25 September 2007 11:47, Swapana Ghosh wrote:
>>     
>>> Hi Jordi,
>>>
>>> Thanks for your reply.  I will test the way you suggested.
>>>
>>> Thanks
>>> -swapna
>>>
>>> --- Jordi Prats <jprats at cesca.es> wrote:
>>>       
>>>> Hi,
>>>> It looks like what happened to me. I did the following to solve it:
>>>>
>>>> Mark the filesystem as not having a journal (take it to ext2):
>>>>
>>>> tune2fs -O ^has_journal /dev/cciss/c0d0p2
>>>>
>>>> fsck it to delete the journal:
>>>>
>>>> e2fsck /dev/cciss/c0d0p2
>>>>
>>>> Create the journal (take it back to ext3)
>>>>
>>>> tune2fs -j /dev/cciss/c0d0p2
>>>>
>>>> and finally, remount it.
>>>>
>>>> In my case it was a local disk, but with your SAN disk it should be
>>>> the same.
>>>>
>>>> Jordi
>>>>
>>>> Swapana Ghosh wrote:
>>>>         
>>>>> Hi
>>>>>
>>>>> In our office environment, on a few servers (mostly database servers, and
>>>>> yesterday, for the first time, on one application server) a partition is
>>>>> becoming "read only".
>>>>>
>>>>> I was checking the archives and found what may be similar issues in the
>>>>> 2007-July archives. It would be really helpful if someone could describe
>>>>> how they were solved.
>>>>>
>>>>> In our case, just as the problem started, we found the following line in
>>>>> the log file:
>>>>>
>>>>>      EXT3-fs error (device dm-12): ext3_find_entry: reading directory #2015496 offset 2
>>>>>
>>>>> Then one blank line, followed by:
>>>>>
>>>>>     Aborting journal on device dm-12.
>>>>>     ext3_abort called
>>>>>
>>>>>     EXT3-fs error (device dm-12): ext3_journal_start_sb: Detected aborted journal
>>>>>     Remounting filesystem read-only
>>>>>
>>>>> Then the following line repeats continuously:
>>>>>
>>>>>
>>>>>     EXT3-fs error (device dm-12) in start_transaction: Journal has aborted
>>>>>
>>>>>
>>>>>
>>>>> The above message repeats until we remount the filesystem and the
>>>>> partition becomes 'read-write' again.
>>>>>
>>>>> We could not figure out the root cause of the problem.
>>>>>
>>>>> We are using individual EMC LUNs, which are configured into LVM volume
>>>>> groups; the filesystems are mounted on logical volumes.
>>>>>
>>>>> Here is the server description:
>>>>>
>>>>> ____________________________________________________________
>>>>>
>>>>> [root at server ~]# lsmod |grep -i qla
>>>>> qla2300               130304  0
>>>>> qla2xxx_conf          305924  0
>>>>> qla2xxx               307448  21 qla2300
>>>>> scsi_mod              117709  5 sg,emcp,qla2xxx,cciss,sd_mod
>>>>>
>>>>> ____________________________________________________________
>>>>> [root at server ~]# cat /etc/modprobe.conf
>>>>> alias eth0 tg3
>>>>> alias eth1 tg3
>>>>> alias eth2 e1000
>>>>> alias eth3 e1000
>>>>> alias eth4 e1000
>>>>> alias eth5 e1000
>>>>> alias bond0 bonding
>>>>> alias scsi_hostadapter cciss
>>>>> options bond0 max_bonds=2 miimon=100 mode=1
>>>>> alias scsi_hostadapter1 qla2xxx
>>>>> alias scsi_hostadapter2 qla2xxx_conf
>>>>> #alias scsi_hostadapter3 qla6312
>>>>> options qla2xxx  ql2xmaxqdepth=16 qlport_down_retry=64 ql2xloginretrycount=30 ql2xfailover=0 ql2xlbType=0
>>>>> install qla2xxx /sbin/modprobe qla2xxx_conf; /sbin/modprobe --ignore-install qla2xxx
>>>>> remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }
>>>>> ###BEGINPP
>>>>> include /etc/modprobe.conf.pp
>>>>> ###ENDPP
>>>>> ###BEGINPP
>>>>> include /etc/modprobe.conf.pp
>>>>> ###ENDPP
>>>>> ###BEGINPP
>>>>> include /etc/modprobe.conf.pp
>>>>> ###ENDPP
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa |grep -i EMC
>>>>> EMCpower.LINUX-4.5.1-022
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa|grep -i scli
>>>>> scli-1.06.16-57
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa|grep -i nav
>>>>> naviagentcli-6.19.1.3.0-1
>>>>>
>>>>> ________________________________________________
>>>>>  product: QLA2312 Fibre Channel Adapter
>>>>>
>>>>> ________________________________________________
>>>>> [root at server ~]# rpm -qa|grep -i lvm
>>>>> lvm2-2.02.06-6.0.RHEL4
>>>>> system-config-lvm-1.0.19-1.0
>>>>>
>>>>> ________________________________________________
>>>>>
>>>>> If I missed any info, please let me know.
>>>>>
>>>>> I would really appreciate any hints on solving this issue.
>>>>>
>>>>> Thanks in advance
>>>>> -swapana
> === message truncated ===
>
> _______________________________________________
> Ext3-users mailing list
> Ext3-users at redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users
>
>
>   


-- 
......................................................................
         __
        / /          Jordi Prats
  C E / S / C A      Dept. de Sistemes
      /_/            Centre de Supercomputació de Catalunya

  Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
  T. 93 205 6464 · F.  93 205 6979 · jprats at cesca.es
...................................................................... 



