[Spacewalk-list] cleanup-data-bunch takes forever

Paul Robert Marino prmarino1 at gmail.com
Mon Jun 8 17:05:09 UTC 2015


looks like your database isn't tuned correctly or you desperately need
to run a vacuum analyze on it.
check your disks too.
This is most likely a problem with your database servers,
configuration, maintenance, or hardware.


On Mon, Jun 8, 2015 at 11:07 AM, Gerald Vogt <vogt at spamcop.net> wrote:
> Does anyone have an idea what I could do about this?
>
> Does anyone see something similar?
>
> Thanks,
>
> Gerald
> On 28.05.15 06:57, Gerald Vogt wrote:
>> Eventually cleanup-data-bunch finished after 8:40 h. Definitively not
>> suitable to run daily.
>>
>> I don't quite understand why the query runs with two sequential scans
>> and does not use the indexes.
>>
>> Does anyone have an idea how to optimize that? Is that always taking so
>> long? I am using CentOS 6.6, Spacewalk 2.3, PostgreSQL 8.4.20.
>>
>> For now, I'll schedule that task monthly...
>>
>> Thanks,
>>
>> Gerald
>>
>> On 27.05.15 13:39, Gerald Vogt wrote:
>>> Hi!
>>>
>>> It seems cleanup-data-bunch takes a very long time.
>>>
>>> I have noticed that it was in state "INTERRUPTED" and did not run for a
>>> couple of months. I have applied the changes as suggested in
>>>
>>> https://www.redhat.com/archives/spacewalk-list/2015-May/msg00091.html
>>>
>>> and now cleanup-data-bunch does run again. It's however very slow.
>>> Currently, it is already running for 2:15 h and still goes on.
>>>
>>> I can see with pg_stat_activity that is running this query:
>>>
>>> DELETE FROM rhnPackageChangeLogData
>>>          WHERE id NOT IN ( SELECT DISTINCT changelog_data_id FROM
>>> rhnPackageChangeLogRec )
>>>
>>> which seems to keep the database busy for a while:
>>>
>>>> spacewalk=# explain verbose DELETE FROM rhnPackageChangeLogData WHERE id NOT IN ( SELECT DISTINCT changelog_data_id FROM rhnPackageChangeLogRec );
>>>>                                                 QUERY PLAN
>>>> -----------------------------------------------------------------------------------------------------------
>>>>  Seq Scan on rhnpackagechangelogdata  (cost=803563.75..215621093.82 rows=201534 width=6)
>>>>    Output:
>>>>    Filter: (NOT (SubPlan 1))
>>>>    SubPlan 1
>>>>      ->  Materialize  (cost=803563.75..804467.22 rows=64947 width=8)
>>>>            Output: rhnpackagechangelogrec.changelog_data_id
>>>>            ->  Unique  (cost=781080.12..803244.80 rows=64947 width=8)
>>>>                  Output: rhnpackagechangelogrec.changelog_data_id
>>>>                  ->  Sort  (cost=781080.12..792162.46 rows=4432935 width=8)
>>>>                        Output: rhnpackagechangelogrec.changelog_data_id
>>>>                        Sort Key: rhnpackagechangelogrec.changelog_data_id
>>>>                        ->  Seq Scan on rhnpackagechangelogrec  (cost=0.00..109859.35 rows=4432935 width=8)
>>>>                              Output: rhnpackagechangelogrec.changelog_data_id
>>>> (13 rows)
>>>>
>>>> spacewalk=# select count(*) from rhnpackagechangelogrec;
>>>>   count
>>>> ---------
>>>>  5112795
>>>> (1 row)
>>>>
>>>> spacewalk=# select count(*) from rhnpackagechangelogdata;
>>>>  count
>>>> --------
>>>>  401000
>>>> (1 row)
>>>>
>>>
>>> If it is taking this long I guess there is no point scheduling it daily.
>>>
>>> Any thoughts?
>>>
>>> Thanks,
>>>
>>> Gerald
>>>
>>>
>>> _______________________________________________
>>> Spacewalk-list mailing list
>>> Spacewalk-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/spacewalk-list
>>>
>>
>> _______________________________________________
>> Spacewalk-list mailing list
>> Spacewalk-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/spacewalk-list
>>
>
> _______________________________________________
> Spacewalk-list mailing list
> Spacewalk-list at redhat.com
> https://www.redhat.com/mailman/listinfo/spacewalk-list




More information about the Spacewalk-list mailing list