Optimizing a query with several joins

If you have a query that has 6 or 7 joins to a fact table, which join is best to optimize it? Is there a way to get a merge join for all the joins?

Optimizing Query Performance and Resource Pool Tuning
@rbankula @bat

Answers

  • ChuckBChuckB Vertica Employee Employee

    Unless the join is always on the same key, you probably can't get merge join (or even cosegmented join) for all.

    The principle here (which goes back to the Kimball book, if not further) is to bring the restrictions to the biggest (fact) table, then enrich the data against supporting dimensions.

    So, you didn't talk about your schema, and whether it is star, "barbell" (multiple large tables), snowflake, or something else. But cosegment / cosort your biggest tables, replicate the small ones so they join to that locally, and then the optimization question only pertains to really large dimensions.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file