Primary Keys, Segmentation and Query Performance
We are trying to understand whether defining keys on a table will improve the query performance.
We understand that vertica automatically segments table depending on source table, table structure or data
We have two tables (A and both having 70 M records. We are executing below query on that table.
select * from A join B
on A.ky = b.ky
Approach 1. Created tables without keys
1. Query completed in 53 seconds
Approach 2. Created tables with Primary Keys on both tables. (PK on table A /PK on table
1. Without Analyze_statistics, Query Completed in 3 min
2. After executing Analyze_statistics function, Query completed in 55 seconds
Approach 3. Created tables with Primary Key - Foreign Key Relationship(PK-FK on table A /PK on table
1. Without Analyze_statistics, Query Completed in 1 min 20 s
2. After executing Analyze_statistics function, Query completed in 54 seconds
We didn’t find any difference in query performance with all the above approaches. Can you please help us in understanding how segmentation helps in query performance and which is the best path to choose. Also is there any system table which tells how the projections are segmented.