Why JOIN implies in MERGE_CONVERTED_TO_UNION

Hi!

I have three tables, each with 175 mi of lines each one, two tables has two columns (col_id_1 and col_id_2) that together forms the PK and own has only one PK column .(col_id_1).


I want to create a table that is this three tables joined, all tables are sorted by the same KEY (col_id_1 and col_id_2), so I write my SQL as follows:


CREATE TABLE all_together AS

SELECT * FROM TABLE1 T1

JOIN TABLE2 T2 ON T1.col_id_1 = T2.col_id_1 AND T1.col_id_2 = T2.col_id_1

JOIN TABLE2 T3 ON T1.col_id_1 = T3.col_id_1


When I run an EXPLAIN in this query Vertica does not shows a SORT phase, instead of the information is that the inputs are presorted (due to the projections ordered by col_id_1/col_id_2 and col_id_1.


But the query execution gave me an execution event with this description: "Merge converted to union, followed by sort." and event_type "MERGE_CONVERTED_TO_UNION" meaning a SORT on T1


Why vertica are performing a SORT ?


Thanks in advance,

André

Comments

  • Abhishek_RanaAbhishek_Rana Vertica Employee Employee

    MERGE_CONVERTED_TO_UNION is an informational message found in query events. This operation is an internal decision of optimizer,whenever it finds a STORAGE MERGE operation after a MERGE or HASH join in a query plan execution. Every time there is a StorageMerge right on top of a Join (either MergeJoin or HashJoin) in a plan, we convert that StorageMerge into StorageUinon + Sort in LocalPlan. The reason we do that because StorageMerge and Join are co-routines and can case deadlock if their buffers are full and waiting for each other.
    StorageMerge operation makes sure that all columns data have been materialized & there will not be any further materialization of data is needed in execution. If we can force an early materialization of data in a qery plan execution ,we can avoid this StorageMerge operation after a Join & hence in-turn no occurrence of MERGE_CONVERTED_TO_UNION followed by a SORT.

  • ChuckBChuckB Vertica Employee Employee

    You can use early materialized joins to get away from this. Using early materialized joins is a good idea anyway, and you should consider making it the default for new deployments.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file