[Spacewalk-list] cleanup-data-bunch takes forever

Gerald Vogt vogt at spamcop.net
Wed May 27 11:39:51 UTC 2015


Hi!

It seems cleanup-data-bunch takes a very long time.

I have noticed that it was in state "INTERRUPTED" and did not run for a
couple of months. I have applied the changes as suggested in

https://www.redhat.com/archives/spacewalk-list/2015-May/msg00091.html

and now cleanup-data-bunch does run again. It's however very slow.
Currently, it is already running for 2:15 h and still goes on.

I can see with pg_stat_activity that is running this query:

DELETE FROM rhnPackageChangeLogData
         WHERE id NOT IN ( SELECT DISTINCT changelog_data_id FROM
rhnPackageChangeLogRec )

which seems to keep the database busy for a while:

> spacewalk=# explain verbose DELETE FROM rhnPackageChangeLogData WHERE id NOT IN ( SELECT DISTINCT changelog_data_id FROM rhnPackageChangeLogRec );
>                                                 QUERY PLAN                                                 
> -----------------------------------------------------------------------------------------------------------
>  Seq Scan on rhnpackagechangelogdata  (cost=803563.75..215621093.82 rows=201534 width=6)
>    Output: 
>    Filter: (NOT (SubPlan 1))
>    SubPlan 1
>      ->  Materialize  (cost=803563.75..804467.22 rows=64947 width=8)
>            Output: rhnpackagechangelogrec.changelog_data_id
>            ->  Unique  (cost=781080.12..803244.80 rows=64947 width=8)
>                  Output: rhnpackagechangelogrec.changelog_data_id
>                  ->  Sort  (cost=781080.12..792162.46 rows=4432935 width=8)
>                        Output: rhnpackagechangelogrec.changelog_data_id
>                        Sort Key: rhnpackagechangelogrec.changelog_data_id
>                        ->  Seq Scan on rhnpackagechangelogrec  (cost=0.00..109859.35 rows=4432935 width=8)
>                              Output: rhnpackagechangelogrec.changelog_data_id
> (13 rows)
> 
> spacewalk=# select count(*) from rhnpackagechangelogrec;
>   count  
> ---------
>  5112795
> (1 row)
> 
> spacewalk=# select count(*) from rhnpackagechangelogdata;
>  count  
> --------
>  401000
> (1 row)
> 

If it is taking this long I guess there is no point scheduling it daily.

Any thoughts?

Thanks,

Gerald





More information about the Spacewalk-list mailing list