
Re: 2GB memory limit running fsck on a +6TB device



On Jun 09, 2008  19:33 +0200, santi usansolo net wrote:
> That's the scenario: +6TB device on a 3ware 9550SX RAID controller, running
> Debian Etch 32bits, with 2.6.25.4 kernel, and defaults e2fsprogs version,
> "1.39+1.40-WIP-2006.11.14+dfsg-2etch1".
> 
> Running "tune2fs" returns that filesystem is in EXT3_ERROR_FS state, "clean
> with errors":
> 
> # tune2fs -l /dev/sda4   
> tune2fs 1.40.10 (21-May-2008)
> Filesystem volume name:   <none>
> Last mounted on:          <not available>
> Filesystem UUID:          7701b70e-f776-417b-bf31-3693dba56f86
> Filesystem magic number:  0xEF53
> Filesystem revision #:    1 (dynamic)
> Filesystem features:      has_journal dir_index filetype needs_recovery
> sparse_super large_file
> Default mount options:    (none)
> Filesystem state:         clean with errors
> Errors behavior:          Continue
> Filesystem OS type:       Linux
> Inode count:              792576000
> Block count:              1585146848
> 
> It's a backup storage server, with more than 113 million files, this's the
> output of "df -i":
> 
> # df -i /backup/
> Filesystem            Inodes   IUsed   IFree IUse% Mounted on
> /dev/sda4            792576000 113385959 679190041   15% /backup
> 
> 
> Running fsck.ext3 or  fsck.ext2 I get:
> 
> # fsck.ext3 /dev/sda4
> e2fsck 1.40.10 (21-May-2008)
> Adding dirhash hint to filesystem.
> 
> /dev/sda4 contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes

I recall that e2fsck allocates on the order of 3 * block_count / 8 bytes
plus 5 * inode_count / 8 bytes, so in your case this is about:

(3 * 1585146848 + 5 * 792576000) / 8 = 1089790068 bytes = 1.0GB

at a minimum, but my estimates might be incorrect.
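
For anyone who wants to redo this estimate for another filesystem, here is
a minimal sketch of the arithmetic.  It uses the factors exactly as recalled
above (3 bits per block, 5 bits per inode), which may themselves be off; the
function name and its parameters are made up for illustration and are not
part of e2fsprogs.

```python
def e2fsck_bitmap_estimate(block_count, inode_count,
                           bits_per_block=3, bits_per_inode=5):
    """Rough lower bound, in bytes, for e2fsck's in-memory bitmaps.

    The per-block and per-inode bit counts are the recalled values from
    the message, not figures taken from the e2fsck source.
    """
    total_bits = bits_per_block * block_count + bits_per_inode * inode_count
    return total_bits // 8


# Values from the "tune2fs -l" output quoted earlier.
block_count = 1585146848
inode_count = 792576000

estimate = e2fsck_bitmap_estimate(block_count, inode_count)
print(f"{estimate} bytes (~{estimate / 2**30:.2f} GiB)")
```

This only covers the bitmaps; other structures e2fsck builds during the
check come on top of it.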

> mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x404fa000

Judging by the return values of these functions, this is a 32-bit system,
and it is entirely possible that you are exceeding the per-process memory
allocation limit.

> mmap2(NULL, 748892160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x63be2000
> mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
> -1, 0) = -1 ENOMEM (Cannot allocate memory)

Hmm, it seems a bit excessive to allocate 1.8GB in a single chunk.
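
As a rough sanity check, the quoted mmap2 calls can be laid out in a 32-bit
address space to see whether a 1866240000-byte hole could exist at all.  The
sketch below assumes the usual 3GB/1GB user/kernel split (user space ending
at 0xC0000000), which is an assumption about this kernel's configuration,
and it ignores every mapping that is not in the quoted trace (binary, heap,
libraries, stack), so it is the most optimistic case.

```python
# Assumption: default 3GB/1GB split on 32-bit x86, not confirmed for this box.
USER_SPACE_TOP = 0xC0000000

# (start address, length) of the two successful mappings in the trace.
known_mappings = [
    (0x404FA000, 99074048),
    (0x63BE2000, 748892160),
]

# Size of the mmap2 request that failed with ENOMEM.
failed_request = 1866240000


def largest_free_gap(mappings, top, bottom=0):
    """Largest contiguous hole left between the given mappings."""
    gaps = []
    cursor = bottom
    for start, length in sorted(mappings):
        gaps.append(start - cursor)
        cursor = max(cursor, start + length)
    gaps.append(top - cursor)
    return max(gaps)


best_gap = largest_free_gap(known_mappings, USER_SPACE_TOP)
print(f"largest possible hole: {best_gap} bytes")
print(f"failed request:        {failed_request} bytes")
print("fits" if best_gap >= failed_request else "cannot fit")
```

Even with everything else ignored, no single hole is large enough, so the
failure is about contiguous address space rather than about the amount of
RAM and swap.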

> Error allocating directory block array: Memory allocation failed
> e2fsck: aborted

This message is a bit tricky to nail down because it doesn't exist anywhere
in the code directly.  It is encoded into "e2fsck abbreviations", and
the expansion that is normally in the corresponding comment is different.
It is PR_1_ALLOCATE_DBCOUNT returned from the call chain:
	ext2fs_init_dblist->
	  make_dblist->
	    ext2fs_get_num_dirs()

which is counting the number of directories in the filesystem, and allocating
two 12-byte array elements for each one.  This implies you have 77M directories
in your filesystem, or an average of only about 1.5 in-use inodes per
directory, going by your "df -i" output?
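
The directory count can be backed out of the failed allocation size.  This
is only a sketch: it takes the "two 12-byte elements per directory" figure
from the description above at face value, ignores any small constant padding
the library may add, and takes the IUsed value from the quoted "df -i"
output.  The constant names are invented for illustration.

```python
ENTRY_SIZE = 12        # bytes per dblist element, as described above
ENTRIES_PER_DIR = 2    # elements reserved per directory, as described above

failed_request = 1866240000   # size of the mmap2 call that returned ENOMEM
inodes_in_use = 113385959     # IUsed column from "df -i /backup/"

# Number of directories that would produce an allocation of this size.
num_dirs = failed_request // (ENTRY_SIZE * ENTRIES_PER_DIR)

# In-use inodes (files and directories together) per counted directory.
inodes_per_dir = inodes_in_use / num_dirs

print(f"implied directory count: {num_dirs}")
print(f"in-use inodes per directory: {inodes_per_dir:.2f}")
```

Since the filesystem is flagged as having errors, the per-group directory
counts this is based on may not be trustworthy.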

> Appears that fsck is trying to use more than 2GB memory to store inode
> table relationship. System has 4GB of physical RAM and 4GB of swap, is
> there anyway to limit the memory used by fsck or any solution to check this
> filesystem?

I don't know offhand how important the dblist structure is, so I'm not
sure if there is a way to reduce the memory usage for it.  I believe
that in low-memory situations it is possible to use tdb in newer versions
of e2fsck for the dblist, but I don't know much of the details.

> Running fsck with a 64bit LiveCD will solve the problem?

Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
for e2fsck and be able to check the filesystem.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

