
Re: RAID5 gets a bad rap



Bill Davidsen wrote:
Gordon Messmer wrote:
...
No. Even in the worst case it would read N-2 blocks (you are writing a new data block and calculating new parity), and two writes.

Let's just say that I've seen controllers behave in ways that I don't understand, and I agree that the cost should not be as great as I previously estimated.

It doesn't matter whether you're writing new files or modifying existing files, because all of this happens at the block level. It's especially bad on journalled filesystems, where writing to a file will update the file's blocks, plus the journal's blocks, and finally the filesystem's own blocks.

No again. You read the parity block and the old data block, XOR first the old then the new data with the parity block, and write the new data and parity.
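To make the read-modify-write update described above concrete, here is a minimal sketch in Python. It assumes plain byte-wise XOR parity over a three-block stripe; the function names and the sample data are hypothetical and only for illustration, not taken from any real RAID implementation. It checks that XORing the old data and then the new data into the old parity gives the same result as recomputing parity from the whole stripe.

```python
from functools import reduce
from operator import xor


def full_parity(blocks):
    """Parity computed from scratch: byte-wise XOR of every data block."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rmw_parity(old_parity, old_data, new_data):
    """Read-modify-write: XOR the old data out of the parity, then the new data in."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# Hypothetical stripe of three 4-byte data blocks (a real array uses far larger blocks).
stripe = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
parity = full_parity(stripe)

# Overwrite the middle block. Only that block and the parity block are read.
new_block = bytes([0xAA, 0xBB, 0xCC, 0xDD])
updated_parity = rmw_parity(parity, stripe[1], new_block)
stripe[1] = new_block

# The shortcut agrees with recomputing parity over the whole stripe.
assert updated_parity == full_parity(stripe)
```

The point of the shortcut is that the other data blocks in the stripe are never touched: two reads and two writes, however wide the array is.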

Yes, I understand what you're saying, but that in no way contradicts what I wrote there. Regardless of whether you create a new file or modify an existing file, changes will be made to the filesystem to reflect that. If you modify a file, the inode's mtime is updated. If you create a new file, a new inode is written and the directory entry is modified. In both cases, the blocks which hold the file's data are written, the journal is written before the filesystem is updated, the filesystem is updated with the changes in the journal, and then the journal is modified again to mark the transaction complete.

We can argue about how much overhead RAID5 has, but I don't think you can argue either that there is *no* overhead or that the filesystem is not a database. Any given write to the disk will involve updating the journal twice and the filesystem once, which more or less creates the "small random writes" that RAID5 is so poor at performing.
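As a rough back-of-the-envelope sketch of that sequence, the Python below counts disk operations under some loud assumptions: each of the four logical writes (file data, journal entry, filesystem update, journal completion) touches exactly one block, nothing is cached or coalesced, and every block write on RAID5 costs the two reads and two writes of the read-modify-write cycle quoted earlier in this message. Real filesystems and controllers batch and cache, so treat the numbers as an upper-bound illustration rather than a measurement. All names are hypothetical.

```python
# Logical block writes for one small file update, following the sequence
# described above. One block each is an assumption made for illustration.
LOGICAL_WRITES = {
    "file data": 1,
    "journal entry": 1,
    "filesystem update": 1,
    "journal marked complete": 1,
}

# Read-modify-write cost of one block write on RAID5:
# read old data + read old parity, write new data + write new parity.
RMW_READS = 2
RMW_WRITES = 2


def raid5_rmw_ios(logical_writes):
    """Disk operations on RAID5 if every block write pays the full RMW cost."""
    return logical_writes * (RMW_READS + RMW_WRITES)


total_logical = sum(LOGICAL_WRITES.values())
single_disk_ios = total_logical            # one disk write per logical write
raid5_ios = raid5_rmw_ios(total_logical)   # four disk operations per logical write

print(f"logical writes:       {total_logical}")
print(f"single-disk I/Os:     {single_disk_ios}")
print(f"RAID5 RMW I/Os (max): {raid5_ios}")
```

Under those assumptions the four logical writes become sixteen disk operations, which is the kind of amplification on small scattered writes that the argument is about.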

