[dm-devel] hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

Jon Nelson jnelson at jamponi.net
Wed Dec 8 03:37:20 UTC 2010


On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o <tytso at mit.edu> wrote:
> On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote:
>> > 1. create a database (from bash):
>> >
>> > createdb test
>> >
>> > 2. place the following contents in a file (I used 't.sql'):
>> >
>> > begin;
>> > create temporary table foo as select x as a, ARRAY[x] as b FROM
>> > generate_series(1, 10000000 ) AS x;
>> > create index foo_a_idx on foo (a);
>> > create index foo_b_idx on foo USING GIN (b);
>> > rollback;
>> >
>> > 3. execute that sql:
>> >
>> > psql -f t.sql --echo-all test
>> >
>> > With 2.6.34.7 I can re-run [3] all day long, as many times as I want,
>> > without issue.
>> >
>> > With 2.6.37-rc4-13 (the currently-installed KOTD kernel) it fails
>> > pretty frequently.
>
> So I just tried to reproduce this on an Ubuntu 10.04 system running
> 2.6.37-rc5 (completely stock except for a few apparmor patches that I
> needed to keep the apparmor userspace from complaining).  I'm using
> Postgres 8.4.5-0ubuntu10.04.
>
> Using the above procedure, I wasn't able to reproduce.  Then I
> realized this might have been because I was using an SSD root file
> system (which is secured using LUKS/dm-crypt, with LVM on top of
> dm-crypt).  So I mounted a file system on a 5400 rpm laptop disk, which
> is also protected using LUKS/dm-crypt with LVM on top.  I then
> executed the PostgreSQL commands:
>
> CREATE TABLESPACE test LOCATION '/kbuild/postgres';
> SET default_tablespace = test;
> COMMIT
> \quit
>
> I then re-ran the above procedure, and verified that all of the I/O
> was going to the 5400rpm laptop disk.
>
> I then ran the above procedure a half-dozen times, and I still haven't
> been able to reproduce any PostgreSQL errors or kernel errors.
>
> Jon, can you help me identify what might be different with your run
> and mine?  What version of Postgres are you using?

One difference is the location of the transaction logs (pg_xlog). In
my case, /var/lib/pgsql/data *is* the mount point for the test volume
(actually, it's a symlink to the mount point). In your case, that is
not so. Perhaps that makes a difference?  pgsql_tmp might also be on
two different volumes in your case (I can't be sure).
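
One way to compare the two setups is to print the mount point that
holds each directory. The following is only a rough sketch: the
mount_of helper is a made-up name, the PGDATA default is the path from
my setup, and the pg_xlog and base/pgsql_tmp locations assume the
stock data directory layout (a tablespace would have its own pgsql_tmp
elsewhere). It probably needs to run as root or the postgres user to
see inside the data directory.

```shell
# Print the mount point that holds a given path.
# Relies on POSIX "df -P" output, where column 6 of the second
# line is the mount point (mount points with spaces would break this).
mount_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Assumed default; override PGDATA for a different data directory.
PGDATA=${PGDATA:-/var/lib/pgsql/data}

# Candidate locations (assumptions about a stock layout).
for p in "$PGDATA" "$PGDATA/pg_xlog" "$PGDATA/base/pgsql_tmp"; do
    if [ -e "$p" ]; then
        printf '%s -> %s\n' "$p" "$(mount_of "$p")"
    else
        printf '%s (not present)\n' "$p"
    fi
done
```

If the three lines show different mount points on one machine and the
same mount point on the other, that would be a concrete difference
between the two runs.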


-- 
Jon



