[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Spacewalk-devel] Spacewalk-devel Digest, Vol 45, Issue 4

The left join on rhnChecksumView is horrendously slow, exactly the same thing that was happening with doing config file diffs.  An export of a single channel with 3000 packages takes hours compared to 10 minutes with this patch.  I can post the 'explain analyze' output if you'd like.

I'll brush on my SQL and learn why this is so bad.  In the meantime, can someone explain why this is bad in this case?  Obviously it's not an ideal query, but PostgreSQL deciding to do a sequence scan on rhnChecksumView with 2,000,000+ rows just doesn't work.  Joining the subquery with the same table on a single column seems straightforward enough; I don't see how it's error prone as Michael explained in the previously referenced commit.


-----Original Message-----

Message: 4
Date: Fri, 3 Feb 2012 17:02:31 +0100
From: Jan Pazdziora <jpazdziora redhat com>
To: spacewalk-devel redhat com
Subject: Re: [Spacewalk-devel] Patch review for exportLib.py
Message-ID: <20120203160231 GB6087 redhat com>
Content-Type: text/plain; charset=us-ascii

On Fri, Feb 03, 2012 at 03:55:53PM +0000, Parsons, Aron wrote:
> Can someone verify that this is a sound change?  rhn-satellite-exporter is extremely slow on PostgreSQL when getting the file checksums.  This speeds it up dramatically.
> It's the same issue we ran into with PostgreSQL doing a left join on rhnChecksumView in 'query_client_get_file'.  PostgreSQL does a sequence scan on a table with millions of rows.  I patched it in 1e64dc4546d96c58dd84a88c3c634b2d2ce164a4 but Michael had to fix it in 79e57d3b215340a54e911a073408f0eb7c6afc1f.  I just want to be sure my SQL is proper on this one before committing this time.
> diff --git a/backend/satellite_tools/exporter/exportLib.py b/backend/satellite_t
> index 1646c58..1f265a2 100644
> --- a/backend/satellite_tools/exporter/exportLib.py
> +++ b/backend/satellite_tools/exporter/exportLib.py
> @@ -753,7 +753,12 @@ class _PackageDumper(BaseRowDumper):
>                  c.checksum_type as "checksum-type",
>                  c.checksum, pf.linkto, pf.flags, pf.verifyflags, pf.lang
>              from rhnPackageFile pf
> -            left join rhnChecksumView c
> +            left join
> +              (select c.*
> +               from rhnPackageFile pf
> +                 inner join rhnChecksumView c
> +                   on pf.checksum_id = c.id
> +               where pf.package_id = :package_id) c
>                on pf.checksum_id = c.id,
>                  rhnPackageCapability pc
>              where pf.capability_id = pc.id

I'm affraid you'd get the same

	having a table twice in select is mostly a bug

response on this one. What are you trying to achieve? This query
doesn't even have any lookup_* function.

Jan Pazdziora
Principal Software Engineer, Satellite Engineering, Red Hat


Spacewalk-devel mailing list
Spacewalk-devel redhat com

End of Spacewalk-devel Digest, Vol 45, Issue 4

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]