Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!
Optimizing DISTANCE using projections
Hi folks,
I am currently using the DISTANCE function in vertica. I'm not getting great performance, so wanted to get your opinion on if this can be optimized. If anyone has optimized DISTANCE using projections before, your opinion would be greatly appreciated.
My Query:
SELECT Date_trunc('week', date), APPROXIMATE_COUNT_DISTINCT(userId) AS distinct_user_count FROM mySchema.userProjection WHERE date BETWEEN '2022-01-01' AND '2022-03-31' AND country = 'US' AND DISTANCE(40.71427, -74.00597, userLatitude, userLongitude) < 160 GROUP BY 1 ORDER BY 1
My Projection:
CREATE PROJECTION mySchema.userProjection ( country ENCODING RLE, date ENCODING ZSTD_FAST_COMP, userLatitude ENCODING ZSTD_FAST_COMP, userLongitude ENCODING ZSTD_FAST_COMP, userId ENCODING ZSTD_FAST_COMP ) AS SELECT myTable.country, myTable.date, myTable.userLatitude, myTable.userLongitude, myTable.userId, FROM mySchema.myTable ORDER BY myTable.country, myTable.userLatitude, myTable.userLongitude SEGMENTED BY hash(myTable.country, myTable.userId, myTable.modified_date) ALL NODES;
The table is partitioned by date. Query currently takes about 8s, I was really aiming for <5s.
My guess is DISTANCE
might not be using the projection correctly.
Appreciate any help!
Tagged:
0
Leave a Comment
Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.