[Spacewalk-list] API calls for new hosts
Jeremy Maes
jma at schaubroeck.be
Thu Apr 11 11:04:57 UTC 2013
Op 11/04/2013 10:25, Pierre Casenove schreef:
> Hello,
> I have a setup with postgresql 8.4 and 150 clients.
> My DB dump (using pg_dump, with -Fc option) is about 1.1 GB.
> I still haven't noticed slower operations.
> Should I planify VACUUM ANALYZE operations once a month? Could it
> lower the size of the dump?
> Would the script be like:
> spacewalk-service stop
> su --command "psql -c 'VACUUM VERBOSE ANALYZE;' -d spaceschema" postgres
> spacewalk-service start
> ?
>
> Thanks in advance for your help,
>
> Pierre
>
>
On CentOS 6 there's also a command /usr/bin/*vacuumdb *available with a
bunch of options that basically does that without you having the know
the exact SQL commands. (for 8.4 at least, don't know about 9+)
You can run the basic (lazy) vacuum while spacewalk is running, I run it
daily in my database dump script before the dump itself.
(/usr/bin/vacuumdb --analyze -h localhost -U postgres spaceschema, needs
a .pgpass file to work if you don't add the password to the command) My
gzipped database is about 300M now for ~50 clients and ~15.500 packages
in repos. Can't say if it has a big impact on size as I configured it
this way when I set up the database and I've no further postgres experience.
For the full vacuum you'd stop spacewalk and add --full. (Or add "FULL"
to the sql command you mentioned)
Regards,
Jeremy
>
> 2013/4/11 Anton Pritchard-Meaker
> <anton.pritchard-meaker at kit-digital.com
> <mailto:anton.pritchard-meaker at kit-digital.com>>
>
> Thanks I really appreciate this, I'll definitely look into these
> actions. Downtime is not an issue for my Spacewalk implementation.
>
> I'm pretty new to PostreSQL, so I was completely unaware of all of
> maintenance tools available which actually sound quite necessary.
>
> Anton Pritchard-Meaker | Unix Engineer
>
> ------------------------------------------------------------------------
> *From:* spacewalk-list-bounces at redhat.com
> <mailto:spacewalk-list-bounces at redhat.com>
> [spacewalk-list-bounces at redhat.com
> <mailto:spacewalk-list-bounces at redhat.com>] on behalf of Paul
> Robert Marino [prmarino1 at gmail.com <mailto:prmarino1 at gmail.com>]
> *Sent:* 10 April 2013 22:39
>
> *To:* spacewalk-list at redhat.com <mailto:spacewalk-list at redhat.com>
> *Subject:* Re: [Spacewalk-list] API calls for new hosts
>
> In PostgreSQL 8.x auto vacuuming was first being introduced and
> the default settings weren't Ideal. Further more it wasn't a
> complete implementation so standard vacuuming is still necessary
> in 8.x
>
> In PostgreSQL 9.x auto vacuuming matured quite a bit a and manual
> vacuuming is needed far less often, but still a good idea to do
> occasionally.
>
> There are two kinds of vacuuming a lazy vacuum and a full vacuum.
> There are also two other table maintenance task which need to be
> done periodically as well.
>
> A lazy vacuum does not require an exclusive table lock so in many
> cases may be executed while the database is actively in use;
> however there tend to be tables in spacewalk that constantly have
> lock which may hang the process so its best to schedule occasional
> downtime for this operation. The good new is if you do it on a
> regular basis a lazy vacuum is quick. In addition in PostgreSQL
> 9.x the auto vacuum process fairly effectively opportunistically
> tries to do this for you as needed with as little impact as possible.
>
> A full vacuum requires an exclusive table lock but does a few
> things a lazy vacuum can't. The first thing it does is it flattens
> the MVCC ( MVCC is version control for rows it provides rollback
> capabilities and allows long running queries to complete without
> the results being tainted by data added or deleted after the long
> running query was started). the MVCC needs to be occasionally
> flattened on high volume tables to prevent the version numbers
> from wrapping around (which can potentially cause a sort of data
> corruption); however this is rare and may databases run for years
> without having to worry about this. The major advantage is that a
> Full vacuum can reclaim all of the disk space being used by old
> row versions. the lazy vacuum can only mark the space into a pool
> for recycling (Oracle had the same thing literally called it the
> trash bin last time I worked with it) unless they are at the end
> of the last table file, also in PostgreSQL 8.x the developers
> realized the maximum size limit of recycle pool was too small for
> modern databases so it was increased significantly in 9.x.
>
>
> NOTE: a dump and load has the same effect as a full vacuum
>
> ANALYZE
>
> Analyzing updates your table statistics. the statistics are used
> by the query planner. what the query planner does is it takes the
> queries you run on the tables and re-optimizes them based on the
> table structure, the fragmentation level of the table, the types
> of sorts, filters the query has, the indexes available and how
> efficient they, are more. the statistics tell the planer how
> efficient different types of operations are based on a series of
> test queries it executed the last time they were updated.
> Analyzing is a non blocking operation however just like lazy
> vacuuming it can get hung up by other queries from spacewalk
> indefinitely, so its best to do it occasionally with spacewalk
> offline.
> Analyzing can be done as part of a vacuum or independently. If
> done independently you can control it to the level where you can
> even tell it just to analyze a specific column; however its
> usually best to do an analyze with a vacuum for most people, only
> very experienced DBAs should consider doing more advanced versions
> of the ANALYZE command .
>
> NOTE: a dump and load does not do an ANALYZE on the tables.
>
>
> REINDEX
>
> Vacuuming cleans up the table but not cleanup, defragment, or
> resort the indexes so it is important to at least once a year do a
> REINDEX on standard indexes to maintain performance, and more
> often for ordered indexes. A REINDEX can not be done as part of a
> vacuum it is an independent operation. A REINDEX is an exclusive
> locking operation and as such can not be done at the same time as
> any thing else is accessing the table, as such spacewalk should be
> offline during this operation. reindexing is the slowest
> maintenance operation and should only be done after a full vacuum.
> You should also do an ANALYZE after a REINDEX.
>
> NOTE: a dump and load has the same effect as a REINDEX.
>
>
> All of these operations are at the table level except the ANALYZE
> which may be done down to the column level. a REINDEX can also be
> done in the specific index level I think; however its usually most
> efficient to do the whole table at once unless you have an
> unusually large table.
>
> Finally there are command line tools for vaccum and reindex that
> can operate by sequentially cycling through the tables in the
> database; however if your disks ram and CPU can handle it you can
> run these operations in parallel on different tables to speed
> things up via multiple SQL connections.
>
**** DISCLAIMER ****
http://www.schaubroeck.be/maildisclaimer.htm
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/spacewalk-list/attachments/20130411/49111505/attachment.htm>
More information about the Spacewalk-list
mailing list