Best Answers

  • Bryan_HBryan_H Vertica Employee Administrator
    Answer ✓

    EXPLAIN query plan might show a difference in processing. However, I see two factors that generally slow down queries:
    Column is not compressed or encoded to optimize storage ("ZSTD_FAST_COMP" is recommended for varchar)
    Column is not ORDER BY to optimize predicate search ("ORDER BY lac_cell" will improve WHERE clause)
    EXPLAIN may show additional factors that affect run time.

  • SruthiASruthiA Vertica Employee Administrator
    Answer ✓

    @jiahe1224 yes. better to use RLE encoding if deal_date column is a low cardinality column

Answers

  • jiahe1224jiahe1224 Vertica Customer

    @Bryan_H said:
    EXPLAIN query plan might show a difference in processing. However, I see two factors that generally slow down queries:
    Column is not compressed or encoded to optimize storage ("ZSTD_FAST_COMP" is recommended for varchar)
    Column is not ORDER BY to optimize predicate search ("ORDER BY lac_cell" will improve WHERE clause)
    EXPLAIN may show additional factors that affect run time.

    I did not write order by lac_cell when the first table was created, and I wrote order by lac_cell for the next two tables, but the projection shows that the lac_cell of the three tables are all sorted. In addition, for encoding, all super projections all three columns are auto encoded, my deal_date is low cardinality, maybe I should use RLE encoding?
    Thanks again for your answer! This forum is great, I don't have any friends around me who can ask questions.

  • jiahe1224jiahe1224 Vertica Customer

    deal_date column is a low cardinality integer column

  • jiahe1224jiahe1224 Vertica Customer

    Thank you for your answer, I would like to ask again, what encoding is suitable for unsorted char fields with high cardinality? Actually the field is phone number

  • Bryan_HBryan_H Vertica Employee Administrator

    ZSTD_FAST_COMP is recommended for unsorted types.

  • jiahe1224jiahe1224 Vertica Customer

    I would like to ask you another question, when do I need to put the column in the order by? Suppose I have a query like this,
    SELECT col3
    FROM table_tmp
    WHERE col1 = value1
    AND col2 IN (....);
    col1 is the column used for partitioning, and table_tmp only reserves one partition(In this case, is col1=value1 unnecessary in the where statement? Because when I insert the new partition data, the original partition will be deleted immediately. These are automatically executed by bash, and the query request is unlikely to encounter the situation that there are two partitions in the table.). When I manually create a projection, should I put col1 and col2 or only col2 in the order by statement?

  • Bryan_HBryan_H Vertica Employee Administrator

    It might not be necessary in all cases, however I would include the partition column in the sort order because this will help the optimizer when planning operations across partitions, for example date ranges or multiple predicates, and will also help in the case where the partition column is used for JOIN.

  • jiahe1224jiahe1224 Vertica Customer
    edited November 2022

    @Bryan_H said:

    In my specific business scenario, if there are multiple partitions in the table, col1=value1 needs to be added, because only the data of the latest partition is required. Generally, it is better to add the partition column to the order by statement of projection, but only one partition is reserved in my table, is it still necessary?

    SELECT col3
    FROM table_tmp
    WHERE col1 = value1
    AND col2 IN (....);

    Sorting is not required in actual query, but there must be sorting in projection. As you said, the value of col1 does not need to be limited, only the value of col2 needs to be limited

  • VValdarVValdar Vertica Employee Employee

    Projection sorting also helps with other filters, join and group by operations.
    Considering your table is partitioned, did you take the chance to look at our range partition projection feature?
    It builds the projection only on the last XX partitions, where you define the XX, could be a relative time from now.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file