[Spacewalk-list] Postgres spacewalk

'jpazdziora@redhat.com' jpazdziora at redhat.com
Tue Nov 22 15:19:04 UTC 2011


On Mon, Nov 21, 2011 at 10:27:57PM +0000, Parsons, Aron wrote:
> I did some digging into this query because it is one that is very obviously slow when working with real systems.
> 
> It seems that the left join to rhnchecksum is the culprit of this slow query.  Here's the effect on execution time by changing it to an inner join:
> LEFT JOIN: 2973.886 ms
> INNER JOIN: 1.444 ms
> 
> Based on what this query is used for and the columns we're selecting, this seems safe to me; the inner join is just going to exclude rows that don't exist in both, but that should never happen under normal operation.  I have not tested this on Oracle, but have verified that client operations are still operating normally.  The real world effect is that this takes a ~50s 'rhncfg-client diff' on the client down to ~3s.  I tested by updating the query in /usr/share/rhn/server/handlers/config/rhn_config_management.py.
> 
> Jan,
> What's your take on this?

If you have a query

	select from table1
		outer join table 2
		outer join table 3

you cannot replace it with

	select from table1
		outer join table 2
		inner join table 3

-- that is not semantically equivalent.

While your are right that rhnConfigContent.checksum_id is defined as
not null, the same is not true for rhnConfigRevision.config_content_id
-- that value may be null. If that one is null,
rhnConfigContent.checksum_id will be null as well (in the select,
thanks to the outer join), and the select will return nulls from
rhnChecksum as well.

What you could do would be

	select from table1
		outer join (
		select table 2
			inner join table 3
		) as subselect

-- you'd need to check thou if it made some difference in the
execution plan.

-- 
Jan Pazdziora
Principal Software Engineer, Satellite Engineering, Red Hat




More information about the Spacewalk-list mailing list