Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

Optimizing DISTANCE using projections

Hi folks,

I am currently using the DISTANCE function in vertica. I'm not getting great performance, so wanted to get your opinion on if this can be optimized. If anyone has optimized DISTANCE using projections before, your opinion would be greatly appreciated.

My Query:

SELECT Date_trunc('week', date),
      APPROXIMATE_COUNT_DISTINCT(userId)  AS distinct_user_count
FROM   mySchema.userProjection
WHERE date BETWEEN '2022-01-01' AND '2022-03-31'
       AND country = 'US'
       AND DISTANCE(40.71427, -74.00597, userLatitude, userLongitude) < 160
GROUP BY 1 ORDER BY 1

My Projection:

CREATE PROJECTION mySchema.userProjection
    (
     country ENCODING RLE,
     date ENCODING ZSTD_FAST_COMP,
     userLatitude ENCODING ZSTD_FAST_COMP,
     userLongitude ENCODING ZSTD_FAST_COMP,
     userId ENCODING ZSTD_FAST_COMP
     )
    AS
        SELECT myTable.country,
               myTable.date,
               myTable.userLatitude,
               myTable.userLongitude,
               myTable.userId,
        FROM mySchema.myTable
        ORDER BY myTable.country,
                 myTable.userLatitude,
                 myTable.userLongitude
    SEGMENTED BY hash(myTable.country, myTable.userId, myTable.modified_date) ALL NODES;

The table is partitioned by date. Query currently takes about 8s, I was really aiming for <5s.

My guess is DISTANCE might not be using the projection correctly.

Appreciate any help!

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.