Options

MERGEJOIN not used where expected

Admittedly, this is not the cleanest data model.  However, I'm struggling to understand why HASH join is used rather than MERGEJOIN here.

 

CREATE TABLE T (COUNTRY, LOCALE, ...)

ORDER BY COUNTRY, LOCALE_ID

SEGMENTED BY HASH(COUNTRY, LOCALE_ID);

 

CREATE TABLE US_ZIPS (ZIP PRIMARY KEY, STATE, ....)

ORDER BY ZIP

UNSEGMENTED ALL NODES;

 

SELECT T.*, R.STATE

FROM T

LEFT JOIN R ON T.COUNTRY = 'USA' AND T.LOCALE=R.ZIP;

 

This variation does give desired results using MERGEJOIN:

SELECT T.*, R.STATE

FROM T

LEFT JOIN R ON AND T.LOCALE=R.ZIP

WHERE T.COUNTRY = 'USA'

UNION ALL

SELECT T.*, NULL

FROM T

WHERE T.COUNTRY<>'USA';

 

 

I feel that understanding why the former doesn't use MERGEJOIN would be good to know for future reference (assuming that's by design). 

 

Comments

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file