Multiple projection Hash key precedence

sknair Vertica Customer

If a table has two projections whose segmentation hash keys are defined as hash_key1(col1) and hash_key2(col2, col1, col3), which one will be used for segmentation? Projection 1 is a superprojection and projection 2 is a key-constrained projection.

Best Answer

  • Vertica_Curtis Employee
    Answer ✓

    Each projection is segmented according to its own segmentation hash key; neither key takes precedence over the other. In this example, you'd have two projections of the same table segmented completely differently from one another. There are cases where that makes sense, but generally I think it would be unnecessary.
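
To make the "each projection uses its own key" point concrete, here is a minimal Python sketch. It is only an illustration: `node_for` is a toy stand-in built on CRC32, not Vertica's actual hash function, and the three-node cluster and sample rows are made up.

```python
import zlib

NODES = 3  # hypothetical cluster size

def node_for(*key_values):
    """Toy stand-in for a segmentation hash: map key values to a node index."""
    data = "|".join(str(v) for v in key_values).encode()
    return zlib.crc32(data) % NODES

# Made-up rows of the table from the question.
rows = [{"col1": i, "col2": i * 7 % 5, "col3": i % 2} for i in range(1, 9)]

# Projection 1 is segmented by col1; projection 2 by (col2, col1, col3).
# Each projection computes its own placement for every row.
placement_p1 = [node_for(r["col1"]) for r in rows]
placement_p2 = [node_for(r["col2"], r["col1"], r["col3"]) for r in rows]

for r, n1, n2 in zip(rows, placement_p1, placement_p2):
    print(r, "-> projection 1 node", n1, "| projection 2 node", n2)
```

The two placements are computed independently, so the same row can sit on one node in projection 1 and on another node in projection 2.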

    For example, let's say I have "sales" and "customers". I could segment customers by "customer_id" and sales by "sales_id". Those are probably good choices, since they are likely primary keys and should give even data distribution. I could also choose to have a second projection on the sales table segmented by "customer_id". Without it, joining the two tables forces a data resegmentation at query time, because the sales for a given customer are spread across all the nodes. A projection on sales segmented by "customer_id" places each customer's sales on the same node as the customer row, which makes the join to customers much cheaper.
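
A rough way to see the join argument is to count how many sales rows live on a different node than their customer row under each segmentation choice. This is a sketch under assumptions: the hash is a CRC32 stand-in rather than Vertica's own, and the node count and data are invented.

```python
import zlib

NODES = 3  # hypothetical cluster size

def node_for(value):
    """Toy stand-in for a segmentation hash on a single column."""
    return zlib.crc32(str(value).encode()) % NODES

# Made-up data: 5 customers, 20 sales spread across them.
customers = [{"customer_id": c} for c in range(1, 6)]
sales = [{"sales_id": s, "customer_id": 1 + s % 5} for s in range(100, 120)]

# customers is segmented by customer_id, so this is where each customer row lives.
customer_node = {c["customer_id"]: node_for(c["customer_id"]) for c in customers}

def rows_to_move(segment_column):
    """Count sales rows stored on a different node than their customer row."""
    return sum(
        node_for(sale[segment_column]) != customer_node[sale["customer_id"]]
        for sale in sales
    )

moved_by_sales_id = rows_to_move("sales_id")
moved_by_customer_id = rows_to_move("customer_id")
print("rows to move, sales segmented by sales_id:", moved_by_sales_id)
print("rows to move, sales segmented by customer_id:", moved_by_customer_id)
```

With sales segmented by "customer_id", the second count is always zero, because both tables hash the same value; segmented by "sales_id", some rows generally have to move before the join can happen.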
