
Re: [dm-devel] dm-cache: dm-3.14-fixes-4

Hi Mike,
Thank you very much for your very fast response. I'll backport commit e0d849fad7 to 3.11.10 and will let you know how dm-cache behaves on a large SSD cache. I will probably also check the code for other truncation issues.

On Mon, Mar 17, 2014 at 4:01 PM, Mike Snitzer <snitzer redhat com> wrote:
On Mon, Mar 17 2014 at  9:43am -0400,
George . <george ucdn com> wrote:

> Hi,
> In dm-3.14-fixes-4, the description says:
> - fix corruption with >2TB fast device due to truncation bug
> But looking at the diff I can't find anything related to such a bug.

Commit 8b9d96666529 ("dm cache: fix truncation bug when copying a block
to/from >2TB fast device") follows the same pattern as commit e0d849fad7
("dm cache: fix truncation bug when mapping I/O to >2TB fast device").
The pattern is that from_cblock() only returns a 32-bit value, so any
64-bit math operation on it must use a type that can accommodate 64 bits.
That is why an intermediate sector_t value is now used in both commits.

> I'm asking this because we are trying to use dm-cache on a machine with a
> 2.4 TB SSD cache, and after I took the following fix:
> dm-3.14-fixes-1
> dm cache: fix truncation bug when mapping I/O to >2TB fast device
> dm-3.14-fixes-1<http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/tag/?id=dm-3.14-fixes-1>
> our cached device got corrupted again.

Commit e0d849fad7 wouldn't have been the cause.  If you didn't also
apply 8b9d96666529 then you could have hit that one.

> My question is: is there another truncation bug discovered?

Yeah, both the above referenced commits (commit 8b9d96666529 being the
most recent).

> I've backported dm-3.14-fixes-1 to the 3.11.10 kernel because, when we tested
> v3.14-rc5<http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/tag/?id=v3.14-rc5>,
> the cached device was corrupted after ~15 minutes and seemed to be more
> unstable.

OK, well upstream dm-cache saw very little change for 3.14.  Just a
handful of bug fixes.  So you're likely hitting an outstanding bug that
we've yet to fix.  One possibility that is being actively pursued is
that discards could be contributing to corruption.  Heinz will have an
update on this line of investigation soon.
