[Spacewalk-list] API calls for new hosts

Pierre Casenove pcasenove at gmail.com
Thu Apr 11 08:25:19 UTC 2013


Hello,
I have a setup with postgresql 8.4 and 150 clients.
My DB dump (using pg_dump, with -Fc option) is about 1.1 GB.
I still haven't noticed slower operations.
Should I planify VACUUM ANALYZE operations once a month? Could it lower the
size of the dump?
Would the script be like:
spacewalk-service stop
su --command "psql -c 'VACUUM VERBOSE ANALYZE;' -d spaceschema" postgres
spacewalk-service start
?

Thanks in advance for your help,

Pierre



2013/4/11 Anton Pritchard-Meaker <anton.pritchard-meaker at kit-digital.com>

>  Thanks I really appreciate this, I'll definitely look into these
> actions. Downtime is not an issue for my Spacewalk implementation.
>
>
>
> I'm pretty new to PostreSQL, so I was completely unaware of all of
> maintenance tools available which actually sound quite necessary.
>
>
>
>
>  Anton Pritchard-Meaker | Unix Engineer
>
>   ------------------------------
> *From:* spacewalk-list-bounces at redhat.com [
> spacewalk-list-bounces at redhat.com] on behalf of Paul Robert Marino [
> prmarino1 at gmail.com]
> *Sent:* 10 April 2013 22:39
>
> *To:* spacewalk-list at redhat.com
> *Subject:* Re: [Spacewalk-list] API calls for new hosts
>
>     In PostgreSQL 8.x auto vacuuming was first being introduced and the
> default settings weren't Ideal. Further more it wasn't a complete
> implementation so standard vacuuming is still necessary in 8.x
>
>  In PostgreSQL 9.x  auto vacuuming matured quite a bit a and manual
> vacuuming is needed far less often, but still a good idea to do
> occasionally.
>
>  There are two kinds of vacuuming a lazy vacuum and a full vacuum.
>  There are also two other table maintenance task which need to be done
> periodically as well.
>
> A lazy vacuum does not require an exclusive table lock so in many cases
> may be executed while the database is actively in use; however there tend
> to be tables in spacewalk that constantly have lock which may hang the
> process so its best to schedule occasional downtime for this operation. The
> good new is if you do it on a regular basis a lazy vacuum is quick. In
> addition in PostgreSQL 9.x the auto vacuum process fairly effectively
> opportunistically tries to do this for you as needed with as little impact
> as possible.
>
>  A full vacuum requires an exclusive table lock but does a few things a
> lazy vacuum can't. The first thing it does is it flattens the MVCC ( MVCC
> is version control for rows it provides rollback capabilities and allows
> long running queries to complete without the results being tainted by data
> added or deleted after the long running query was started). the MVCC needs
> to be occasionally flattened on high volume tables to prevent the version
> numbers from wrapping around (which can potentially cause a sort of data
> corruption); however this is rare and may databases run for years without
> having to worry about this. The major advantage is that a Full vacuum can
> reclaim all of the disk space being used by old row versions. the lazy
> vacuum can only mark the space into a pool for recycling (Oracle had the
> same thing literally called it the trash bin last time I worked with it)
> unless they are at the end of the last table file, also in PostgreSQL 8.x
> the developers realized the maximum size limit of recycle pool was too
> small for modern databases so it was increased significantly in 9.x.
>
>
>  NOTE: a dump and load has the same effect as a full vacuum
>
>  ANALYZE
>
>  Analyzing updates your table statistics. the statistics are used by the
> query planner. what the query planner does is it takes the queries you run
> on the tables and re-optimizes them based on the table structure, the
> fragmentation level of the table, the types of sorts, filters the query
> has, the indexes available and how efficient they, are more. the statistics
> tell the planer how efficient different types of operations are based on a
> series of test queries it executed the last time they were updated.
>  Analyzing is a non blocking operation however just like lazy vacuuming
> it can get hung up by other queries from spacewalk indefinitely, so its
> best to do it occasionally with spacewalk offline.
>  Analyzing can be done as part of a vacuum or independently. If done
> independently you can control it to the level where you can even tell it
> just to analyze a specific column; however its usually best to do an
> analyze with a vacuum for most people, only very experienced DBAs should
> consider doing more advanced versions of the ANALYZE command .
>
> NOTE: a dump and load does not do an ANALYZE on the tables.
>
>
>  REINDEX
>
>  Vacuuming cleans up the table but not cleanup, defragment, or resort the
> indexes so it is important to at least once a year do a REINDEX on standard
> indexes to maintain performance, and more often for ordered indexes. A
> REINDEX can not be done as part of a vacuum it is an independent operation.
> A REINDEX is an exclusive locking operation and as such can not be done at
> the same time as any thing else is accessing the table, as such spacewalk
> should be offline during this operation. reindexing is the slowest
> maintenance operation and should only be done after a full vacuum. You
> should also do an ANALYZE after a REINDEX.
>
>  NOTE: a dump and load has the same effect as a REINDEX.
>
>
>  All of these operations are at the table level except the ANALYZE which
> may be done down to the column level. a REINDEX can also be done in the
> specific index level I think; however its usually most efficient to do the
> whole table at once unless you have an unusually large table.
>
>  Finally there are command line tools for vaccum and reindex that can
> operate by sequentially cycling through the tables in the database; however
> if your disks ram and CPU can handle it you can run these operations in
> parallel on different tables to speed things up via multiple SQL
> connections.
>
>
>
>
>
>
>
>
>
>
>
> On Wed, Apr 10, 2013 at 3:20 PM, Jon Miller <jonebird at gmail.com> wrote:
>
>> Perhaps routine Postgres maintenance was missing? I'm not proclaiming to
>> be a Postgresql expert but do recall that periodic vacuuming[1] of the
>> database is required / recommended? Your act of dropping and recreating
>> sounds like side stepping what could have been accomplished via
>> maintenance.
>>
>>  [1]: http://wiki.postgresql.org/wiki/VACUUM_FULL
>>
>>  -- Jon Miller
>>
>>
>>  On Wed, Apr 10, 2013 at 8:45 AM, Anton Pritchard-Meaker <
>> anton.pritchard-meaker at kit-digital.com> wrote:
>>
>>>   I managed to fix this by exporting the database, dropping it in
>>> postgresql, re-creating it and then re-importing. Performance seems much
>>> better on the script too. Call times have halved!****
>>>
>>> ****
>>>
>>> *From:* spacewalk-list-bounces at redhat.com [mailto:
>>> spacewalk-list-bounces at redhat.com] *On Behalf Of *Anton Pritchard-Meaker
>>> *Sent:* 10 April 2013 15:17
>>>
>>> *To:* spacewalk-list at redhat.com
>>> *Subject:* Re: [Spacewalk-list] API calls for new hosts****
>>>
>>>   ****
>>>
>>> This script works nicely for existing hosts, the problem only occurs
>>> when I subscribe a new host and run the script/call. As an example, an
>>> existing host with 36 updates available takes 4.7 seconds to get a result
>>> from Spacewalk whereas the new host below is taking 406.****
>>>
>>> ****
>>>
>>> *From:* spacewalk-list-bounces at redhat.com [
>>> mailto:spacewalk-list-bounces at redhat.com<spacewalk-list-bounces at redhat.com>]
>>> *On Behalf Of *Paul Robert Marino
>>> *Sent:* 10 April 2013 15:03
>>> *To:* spacewalk-list at redhat.com
>>> *Subject:* Re: [Spacewalk-list] API calls for new hosts****
>>>
>>> ****
>>>
>>> It means your client timed out the connection.
>>> I assume this is Perl if so than you need to look at LWPs documentation
>>> to tune the time out parameter.
>>>
>>> ****
>>>
>>> -- Sent from my HP Pre3****
>>>
>>> ****
>>>  ------------------------------
>>>
>>> On Apr 10, 2013 9:52 AM, Anton Pritchard-Meaker <
>>> anton.pritchard-meaker at kit-digital.com> wrote: ****
>>>
>>> Not too sure why, but a fresh request has generated the following log
>>> entries, but still no output and a 500 timeout returned:****
>>>
>>> ****
>>>
>>> Api logs:****
>>>
>>> ****
>>>
>>> [2013-04-10 14:20:02,951] INFO  - REQUESTED FROM: 192.168.131.146 CALL:
>>> system.listLatestUpgradablePackages(4267x7b6a0781772f903417626f29664317c0,
>>> 1000010137) CALLER: (******) TIME: 406.117 seconds****
>>>
>>> ****
>>>
>>> Tomcat:****
>>>
>>> ****
>>>
>>> 10-Apr-2013 14:20:02 org.apache.jk.core.MsgContext action****
>>>
>>> WARNING: Error sending end packet****
>>>
>>> java.net.SocketException: Broken pipe****
>>>
>>>         at java.net.SocketOutputStream.socketWrite0(Native Method)****
>>>
>>>         at
>>> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)****
>>>
>>>         at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
>>> ****
>>>
>>>         at
>>> org.apache.jk.common.ChannelSocket.send(ChannelSocket.java:538)****
>>>
>>>         at
>>> org.apache.jk.common.JkInputStream.endMessage(JkInputStream.java:127)***
>>> *
>>>
>>>         at org.apache.jk.core.MsgContext.action(MsgContext.java:302)****
>>>
>>>         at org.apache.coyote.Response.action(Response.java:183)****
>>>
>>>         at org.apache.coyote.Response.finish(Response.java:305)****
>>>
>>>         at
>>> org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:205)***
>>> *
>>>
>>>         at
>>> org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)****
>>>
>>>         at
>>> org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:775)****
>>>
>>>         at
>>> org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:704)
>>> ****
>>>
>>>         at
>>> org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:897)
>>> ****
>>>
>>>         at
>>> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
>>> ****
>>>
>>>         at java.lang.Thread.run(Thread.java:679)****
>>>
>>> 10-Apr-2013 14:20:02 org.apache.jk.common.ChannelSocket processConnection
>>> ****
>>>
>>> WARNING: processCallbacks status 2****
>>>
>>> ****
>>>
>>> Are there any other logs I could have a look at?****
>>>
>>> ****
>>>
>>> ****
>>>
>>> *From:* Anton Pritchard-Meaker
>>> *Sent:* 09 April 2013 16:42
>>> *To:* spacewalk-list at redhat.com
>>> *Subject:* API calls for new hosts****
>>>
>>> ****
>>>
>>> Hi,****
>>>
>>> ****
>>>
>>> This one was happening in 1.8 and 1.9 on my RHEL5 based Spacewalk
>>> installation. I have a script to prepare update details for audit that
>>> works nicely with hosts already registered, but newly registered ones don’t
>>> work – returns a 500 timeout:****
>>>
>>> ****
>>>
>>> my $systems = $client->call('system.searchByName', $session, $search); *
>>> ***
>>>
>>> ****
>>>
>>> returns the name ok:****
>>>
>>> ****
>>>
>>> [2013-04-09 16:23:52,714] INFO  - REQUESTED FROM: 192.168.131.146 CALL:
>>> system.searchByName(4251x1ef1784c1d24abde4de1b183305f7458, *******) CALLER:
>>> (******) TIME: 0.351 seconds****
>>>
>>> ****
>>>
>>> my $kernel = $client->call('system.getRunningKernel', $session,
>>> $system->{'id'});****
>>>
>>> ****
>>>
>>> returns the kernel ok:****
>>>
>>> ****
>>>
>>> [2013-04-09 16:23:52,738] INFO  - REQUESTED FROM: 192.168.131.146 CALL:
>>> system.getRunningKernel(4251x1ef1784c1d24abde4de1b183305f7458, 1000010137)
>>> CALLER: (******) TIME: 0.014 seconds****
>>>
>>> ****
>>>
>>> The next call is ****
>>>
>>> ****
>>>
>>> my $packages = $client->call('system.listLatestUpgradablePackages',
>>> $session, $system->{'id'});****
>>>
>>> ****
>>>
>>> At this point I get a “500 read timeout”.****
>>>
>>> ****
>>>
>>> All details are returning fine via the GUI. The script is here -
>>> http://hastebin.com/tesovipoki.pl ****
>>>
>>> ****
>>>
>>> Any suggestions would be great – I don’t get anything in the tomcat logs
>>> unfortunately.****
>>>
>>> ****
>>>
>>> Cheers, ****
>>>
>>> ****
>>>
>>> *Anton Pritchard-Meaker* | Unix Engineer****
>>>
>>> *KIT digital** *| York | www.kitd.com  |  The Future of Television****
>>>
>>> ****
>>>
>>>  _______________________________________________
>>> Spacewalk-list mailing list
>>> Spacewalk-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/spacewalk-list
>>>
>>
>>
>> _______________________________________________
>> Spacewalk-list mailing list
>> Spacewalk-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/spacewalk-list
>>
>
>
> _______________________________________________
> Spacewalk-list mailing list
> Spacewalk-list at redhat.com
> https://www.redhat.com/mailman/listinfo/spacewalk-list
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/spacewalk-list/attachments/20130411/40cde9cb/attachment.htm>


More information about the Spacewalk-list mailing list