
[linux-lvm] RE: [ADMIN] [PERFORM] backup/restore - another area.



There's been a lot of discussion on the ADMIN list about postgresql backups
from LVM snapshots:
http://marc.theaimsgroup.com/?l=postgresql-admin&w=2&r=1&s=LVM+snapshot&q=b

Note that the existence of the snapshot slows the original filesystem down,
so you want to minimize the time for which the snapshot exists. A two-phase
rsync -- first off the "live" filesystem, second off the snapshot -- to a
backup filesystem accomplishes this: the first pass copies the bulk of the
data before the snapshot is created, so the second pass only has to transfer
what changed in the meantime. If you don't have the capacity to duplicate
your $PGDATA folder, consider doing incremental backups with xfsdump. XFS
and xfsdump have been around forever, so you can bet your data on them; of
course, xfsdump requires that $PGDATA be hosted on an XFS volume (XFS is a
fast filesystem with metadata journaling, giving fast crash recovery, so
this is a good idea anyway). For other filesystems, incremental backup and
restore are being developed in star: http://freshmeat.net/projects/star/.
star also claims to be faster than GNU tar, so you might use it even if you
aren't doing incremental backups.
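
As a rough illustration of the two-phase idea, here is a minimal sketch. It
is not the attached script. The volume names and snapshot size are borrowed
from Jeff's message quoted further down; the mount points, the backup
target, the DRY_RUN switch and the helper names are all hypothetical.
Depending on your kernel and LVM version you may also need to freeze the
filesystem (xfs_freeze) around lvcreate, which is not shown here.

```shell
#!/bin/sh
# Two-phase rsync sketch. Every path, volume name and helper here is an
# assumption for illustration; this is not the attached script.
LIVE_DIR=${LIVE_DIR:-/pgdata}                 # mount point of the live $PGDATA volume
ORIGIN_LV=${ORIGIN_LV:-/dev/postgres/pgdata}  # origin logical volume
SNAP_NAME=${SNAP_NAME:-pg_backup}             # snapshot LV name
SNAP_LV=${SNAP_LV:-/dev/postgres/$SNAP_NAME}  # snapshot device node
SNAP_DIR=${SNAP_DIR:-/pg_backup}              # where the snapshot gets mounted
BACKUP_DIR=${BACKUP_DIR:-/backup/pgdata}      # rsync target on the backup filesystem
SNAP_SIZE=${SNAP_SIZE:-4000M}                 # space reserved for changed blocks
DRY_RUN=${DRY_RUN:-1}                         # 1: only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

two_phase_backup() {
    # Phase 1: bulk copy from the live filesystem, before any snapshot exists.
    # The result is inconsistent on its own; rsync exit code 24 (source files
    # vanished during the transfer) is expected on a busy database.
    run rsync -a --delete "$LIVE_DIR/" "$BACKUP_DIR/"
    rc=$?
    if [ "$rc" -ne 0 ] && [ "$rc" -ne 24 ]; then
        return "$rc"
    fi

    # Phase 2: take the snapshot, then copy only what changed since phase 1.
    run lvcreate -L "$SNAP_SIZE" -s -n "$SNAP_NAME" "$ORIGIN_LV" || return 1
    # nouuid: the XFS snapshot carries the same UUID as the mounted origin.
    if run mount -o ro,nouuid "$SNAP_LV" "$SNAP_DIR"; then
        run rsync -a --delete "$SNAP_DIR/" "$BACKUP_DIR/"
        rc=$?
        run umount "$SNAP_DIR"
    else
        rc=1
    fi
    # Drop the snapshot as soon as possible, even if the copy failed.
    run lvremove -f "$SNAP_LV"
    return "$rc"
}

two_phase_backup
```

With the default DRY_RUN=1 the sketch only prints the commands it would
run; set DRY_RUN=0 and run it as root to execute them. The snapshot exists
only for the duration of the second, short rsync pass.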

The latest version of my backup script is attached. It demonstrates the
implementation of filesystem level backup of the postgresql data cluster
(the $PGDATA folder) using two-phase rsync. The $PGDATA folder is on an xfs
formatted LVM volume; various checks are done before and after backup, and
errors are e-mailed to a specified account. The script handles situations
where (i) the XFS filesystem containing $PGDATA has an external log and (ii)
the write-ahead log ($PGDATA/pg_xlog) is written to a filesystem different
from the one containing the $PGDATA folder. These configurations enhance
database performance, though an external XFS log is not a big win for
postgresql, which creates relatively few new files and deletes relatively
few files (relative to an e-mail server). It should be possible, using this
script, to keep backup times below 10 minutes even for very high loads -
just increase the frequency at which you do backups (it has been tested to
run hourly, and runs every three hours on the production server).
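
The check-and-notify pattern can be sketched roughly as follows. This is an
illustration under assumptions, not an excerpt from the attached script:
the MAILTO variable, the two example checks, the default paths and the
helper names are all hypothetical. Failures are collected rather than
aborting at the first one, so that a single message covers everything.

```shell
#!/bin/sh
# Check-and-notify sketch. MAILTO, the example checks, the default paths and
# the helper names are hypothetical; this is not the attached script.
MAILTO=${MAILTO:-}   # empty: print the report to stderr instead of mailing it
ERRORS=""

# Run a check command; record a line in ERRORS if it fails.
check() {
    desc=$1
    shift
    if ! "$@" >/dev/null 2>&1; then
        ERRORS="${ERRORS}FAILED: ${desc}
"
    fi
}

# Deliver the accumulated errors, if any; return 1 when something failed.
report() {
    if [ -z "$ERRORS" ]; then
        return 0
    fi
    if [ -n "$MAILTO" ] && command -v mail >/dev/null 2>&1; then
        printf '%s' "$ERRORS" | mail -s "pg backup errors on $(uname -n)" "$MAILTO"
    else
        printf '%s' "$ERRORS" >&2
    fi
    return 1
}

# Example pre-backup checks; the same helpers can be reused after the backup.
check "PGDATA directory exists"      test -d "${PGDATA:-/pgdata}"
check "backup directory is writable" test -w "${BACKUP_DIR:-/backup/pgdata}"
report || echo "backup checks failed" >&2
```

Setting MAILTO to an address makes report send the collected failures with
mail; otherwise they go to stderr, which is convenient when trying the
checks by hand.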

Cheers,
	Murthy

(Note: running this script frequently, I have experienced filesystem hangs
within two days to a week with XFS versions including XFS 1.3. The XFS CVS
kernel from Sep 30th 2003 seems not to have this problem - I have been
running this script hourly for two weeks without problems. So you might
either use a CVS kernel or wait for an XFS release based on linux 2.4.22 or
later; and you will be testing your own setup, right?!)






>-----Original Message-----
>From: Jeff [mailto:threshar torgo 978 org]
>Sent: Thursday, October 16, 2003 13:37
>To: Josh Berkus
>Cc: markw osdl org; pgsql-performance postgresql org;
>linux-lvm sistina com; pgsql-admin postgresql org
>Subject: Re: [ADMIN] [PERFORM] backup/restore - another area.
>
>
>On Thu, 16 Oct 2003 10:09:27 -0700
>Josh Berkus <josh agliodbs com> wrote:
>
>> Jeff,
>> 
>> > I left the DB up while doing this.
>> >
>> > Even had a program sitting around committing data to try and corrupt
>> > things. (Which is how I discovered I was doing the snapshot wrong)
>> 
>> Really?   I'm unclear on the method you're using to take the snapshot,
>> then; I seem to have missed a couple posts on this thread.   Want to
>> refresh me?
>> 
>
>I have a 2 disk stripe LVM on /dev/postgres/pgdata/
>
>lvcreate -L4000M -s -n pg_backup /dev/postgres/pgdata
>mount /dev/postgres/pg_backup /pg_backup 
>tar cf - /pg_backup | gzip -1 > /squeegit/mb.backup 
>umount /pg_backup;
>lvremove -f /dev/postgres/pg_backup;
>
>In a nutshell an LVM snapshot is an atomic operation that takes, well, a
>snapshot of the FS as it was at that instant.  It does not make a 2nd
>copy of the data.   This way you can simply tar up the pgdata directory
>and be happy as the snapshot will not be changing due to db activity.
>
>-- 
>Jeff Trout <jeff jefftrout com>
>http://www.jefftrout.com/
>http://www.stuarthamm.net/
>

Attachment: pgSnapBack3.generic
Description: Binary data

