
FS corruption; HTREE-related?



Over the last two days we've been seeing a fair bit of this:

----
# ls -laR > /dev/null
...
ls: ./server2/b/user/bxyz/392.: Input/output error
----

This is with the latest htree patches applied to 2.4.19, and latest
e2fsprogs-test, on a dual AMD system, with 5x73GB SCSI drives on a
MegaRAID controller. We're using the gcc 2.96 that comes with RH7.3.

e2fsck reports "Inodes that were part of a corrupted orphan linked list
found."

We've been hitting this computer pretty hard, migrating data across to it
from four servers simultaneously using rsync. Of around a million files,
250 developed this problem.
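
A rough way to count the affected files, rather than reading them off the
ls output, is to walk the tree and record every path whose lstat() fails
with EIO. The following is only a minimal Python sketch under that
assumption; find_eio_paths is a hypothetical helper name, not something
from the htree patches or e2fsprogs, and it is not how the figure above
was obtained.

```python
import errno
import os


def find_eio_paths(root):
    """Return every path under root whose lstat() fails with EIO."""
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check directories as well as files; a damaged entry can be either.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)
            except OSError as exc:
                # Only collect I/O errors; other failures (e.g. ENOENT from
                # a file removed mid-walk) are ignored here.
                if exc.errno == errno.EIO:
                    bad.append(path)
    return bad
```

Calling find_eio_paths("/var/data") and taking len() of the result would
give a count comparable to the one quoted above, assuming the errors show
up at lstat() time as they do in the strace output further down.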

Here's some more diagnostics:

----
[root server5 data]# ls -laR > /dev/null
...
ls: ./server2/b/user/bxyz/392.: Input/output error
...

[root server5 all]# cd /var/data/server2/b/user/bxyz/
[root server5 bxyz]# ls -l 391.
-rw-------    1 cyrus    cyrus       66274 Mar 22  2002 391.
[root server5 bxyz]# ls -l 392.
ls: 392.: Input/output error
[root server5 bxyz]# debugfs /dev/sda2
debugfs 1.30-WIP (30-Sep-2002)
debugfs:  cd /var/data/server2/b/user/bxyz/
debugfs:  ls -l
14991599   40700 (2)    504      0   20480  4-Oct-2002 21:00 .
34226198   40755 (2)    504     12   32768  6-Oct-2002 20:33 ..
...
14992584  100600 (1)    504    505   66274 22-Mar-2002 09:32 391.
...
14992585  100600 (1)    504    505       0  6-Oct-2002 17:52 392.
...
debugfs:  stat 392.
Inode: 14992585   Type: regular    Mode:  0600   Flags: 0x0   Generation:
2449155561
User:   504   Group:   505   Size: 0
File ACL: 0    Directory ACL: 0
Links: 0   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
atime: 0x3da05967 -- Sun Oct  6 10:40:23 2002
mtime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
dtime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
BLOCKS:

debugfs:  stat 391.
Inode: 14992584   Type: regular    Mode:  0600   Flags: 0x0   Generation:
2449155559
User:   504   Group:   505   Size: 66274
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 144
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x3da05967 -- Sun Oct  6 10:40:23 2002
atime: 0x3da05967 -- Sun Oct  6 10:40:23 2002
mtime: 0x3c9b4e7c -- Fri Mar 22 09:32:12 2002
BLOCKS:
(0-11):29991107-29991118, (IND):29991119, (12-16):29991120-29991124
TOTAL: 18

debugfs:  ncheck 14992584
Inode   Pathname
14992584        /var/data/server2/b/user/bxyz/391.
debugfs:  ncheck 14992585
Inode   Pathname
14992585        /var/data/server2/b/user/bxyz/392.
debugfs:  stat 392.
Inode: 14992585   Type: regular    Mode:  0600   Flags: 0x0   Generation:
2449155561
User:   504   Group:   505   Size: 0
File ACL: 0    Directory ACL: 0
Links: 0   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
atime: 0x3da05967 -- Sun Oct  6 10:40:23 2002
mtime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
dtime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
BLOCKS:

debugfs:  testi 392.
Inode 14992585 is not in use
debugfs:  testi 391.
Inode 14992584 is marked in use
debugfs:  quit
[root server5 bxyz]# ls -l 391.
-rw-------    1 cyrus    cyrus       66274 Mar 22  2002 391.
[root server5 bxyz]# ls -l 392.
ls: 392.: Input/output error
[root server5 bxyz]# strace ls -l 392.
...
lstat64("392.", 0x8053674)              = -1 EIO (Input/output error)
write(2, "ls: ", 4ls: )                     = 4
write(2, "392.", 4392.)                     = 4
write(2, ": Input/output error", 20: Input/output error)    = 20
write(2, "\n", 1
)                       = 1
_exit(1)                                = ?
----

Any ideas on what's causing this? e2fsck causes the problem files to be
removed. For now we've disabled directory indexing--if the problem
continues after doing this, I'll update the list with the details.

BTW, now that I've disabled directory indexing, will directories with the
relevant flag already set still use hashed indexes?




