[dm-devel] Shared snapshot tests

Mikulas Patocka mpatocka at redhat.com
Wed May 5 03:39:19 UTC 2010



On Wed, 21 Apr 2010, Daire Byrne wrote:

> Mikulas,
> 
> On Tue, Apr 20, 2010 at 7:58 AM, Mikulas Patocka <mpatocka at redhat.com> wrote:
> >>   (2) Similarly, why does the read performance change at all
> >> (214->127MB/s)? There is no COW overhead. This is the case for both
> >> the old snapshots and the new shared ones.
> >
> > I am thinking that it could be because I/Os (including reads) are split at
> > the chunk size boundary. But then the slowdown would depend on the chunk
> > size, and it doesn't.
> >
> > Try this:
> > Don't use snapshots and load plain origin target manually with dmsetup:
> > dmsetup create origin --table "0 `blockdev --getsize /dev/sda1` snapshot-origin /dev/sda1"
> > (replace /dev/sda1 with the real device)
> > Now, /dev/mapper/origin and /dev/sda1 contain identical data.
> > Can you see 214->127MB/s read performance drop in /dev/mapper/origin?
> 
> No I don't see the drop in read performance in this case.
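
For the record, the kind of comparison I had in mind is roughly the
following (just a sketch; it assumes /dev/sda1, a 1 GiB read with direct
I/O, and that /dev/mapper/origin was set up as described above):

# read straight from the underlying device
dd if=/dev/sda1 of=/dev/null bs=1M count=1024 iflag=direct
# read the same data through the plain snapshot-origin mapping
dd if=/dev/mapper/origin of=/dev/null bs=1M count=1024 iflag=direct

dd reports the throughput of each run, so the two numbers can be compared
directly.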

See the patch that I sent you. Does it improve performance for NON-shared 
snapshots? (eventually, I'll make the same change to shared snapshots too)

BTW, what disks and RAID array do you use? Is it hardware or software
RAID?

I have found that with Maxtor disks (IDE, SATA and SCSI), you have to 
limit the request size to 256kB, or you get unpredictable performance 
drops.
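
If you want to try that cap on your array, it is just a sysfs write (a
sketch; "sda" here stands for whichever device backs your RAID):

# current maximum request size, in kB
cat /sys/block/sda/queue/max_sectors_kb
# limit requests to 256 kB
echo 256 > /sys/block/sda/queue/max_sectors_kb

The value is per block device and does not persist across reboots.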

> > Compare /sys/block/dm-X/queue content for the device if no snapshot is
> > loaded and if some snapshot is loaded. Is there a difference? What if you
> > manually set the values to be the same? (i.e. tweak max_sectors_kb or
> > others)
> 
> It looks like max_sectors_kb goes from 512 to 4 when a snapshot is
> taken. But increasing this manually doesn't seem to have much effect;
> actually, increasing it seems to hurt read performance even more. There
> are no other changes in nr_requests, read_ahead etc.
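
For anyone else following along, the comparison I meant is simply reading
the queue attributes before and after the snapshot is loaded and, if they
differ, putting the old value back by hand (a sketch; dm-0 stands for
whatever minor the origin mapping actually gets):

# with no snapshot loaded, note the current value (512 in Daire's case)
cat /sys/block/dm-0/queue/max_sectors_kb
# after loading a snapshot, check it again and restore the old limit
cat /sys/block/dm-0/queue/max_sectors_kb   # reportedly drops to 4
echo 512 > /sys/block/dm-0/queue/max_sectors_kb
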
> 
> > Would it make sense to limit this write-holding? I think not, because it
> > wouldn't improve I/O latency; it would just make I/O latency less
> > variable. Can you think of an application where high I/O latency doesn't
> > matter but variable I/O latency does?
> 
> Well, the only thing I have experience with is LustreFS. I have used
> the "old" snapshots with that, and the server does tend to trigger lots
> of alerts if it can't commit by a certain time (e.g. when a big COW
> operation is happening). I just figure that with a fast RAID (or SSDs),
> writing the COW and the new data to the drive at the same time
> shouldn't incur such a massive seek hit, while making the performance
> more even/predictable.
> 
> >>   (4) Why is there a small (but appreciable) drop in writes as the
> >> number of snapshots increases? It should only have to do a single COW
> >> in all cases, no?
> >
> > Yes, it does just one COW and it uses ranges, so the data structures have
> > no overhead for multiple snapshots.
> >
> > Did you recreate the environment from scratch? (both the filesystem and
> > the whole snapshot shared store)
> >
> > The shared snapshot store writes continuously forward, so if you didn't
> > recreate it, it may just be increasing disk seek times as it moves
> > toward the end of the device.
> >
> > A filesystem may also be writing to different places, so you'd better
> > recreate it too.
> 
> Yes, I think something like this might have happened. But now, after
> coming back to retest after a reboot, I'm getting much better write
> performance (~90MB/s instead of ~38MB/s) with N snapshots. It seems
> like my previous write results were too low for some reason. So now my
> RAID does 308MB/s writes + 214MB/s reads without snapshots and 90MB/s
> writes + 127MB/s reads with shared snapshots - not too bad. I will
> poke around some more.
> 
> Out of interest, what are Red Hat's plans for this feature? Will it be
> in RHEL 6, or will it wait until it is accepted upstream and then make
> it into some later Fedora release? Could it be easily backported to
> RHEL 5, do you think?

Not in RHEL 6.0. Maybe in 6.1.

There are no plans to backport it to 5.x.

> Daire

Mikulas

