Optimizing Vertica Schemas for a Low Latency Application

mferrall · July 2022

What best practices are recommended to optimize Vertica for a UI dashboard? We use Vertica to populate a Reporting UI for our clients, it runs about 1 million queries per day on tables with rows in the hundreds of billions.

Currently we use 3 fact tables. A common use case is unioning a subset of columns from the 3 tables into a common view.

Ideally, we would like to have all of our UI based queries to take <500ms for a month of data (we store in an hourly format).

What approaches and schemas should I consider at a high level? Do flattened tables fit this type of use case? Are there other schema level optimizations I should consider? An obvious one is a new table eliminates the need for unioning. Any other approaches that folks here use, or is Vertica not even well suited to this type of use case? Thanks!

mosheg · July 2022

1) Flattened tables will run faster joins and will save the need to code joins, but not unions.

2) If the tables you need to run the union on, have partitions and have the same schema structure, you can do COPY_PARTITIONS_TO_TABLE from one table to another. This lightweight partition copy increases performance by initially sharing the same storage between two tables.

3) If possible prefer to run calculations once in load time by adding aggregations to partition base projections or LAPs to save the need to run repeated calculations or full scan in queries run time.

4) UNION ALL performs better than UNION since the database server does not have to eliminate duplicate rows which can be costly.

5) If without UNION your query works faster, if possible prefer using OR instead of UNION nor UNION ALL.

6) Queries with very large result set from multiple nodes can be slow.
Even when redirecting output to /dev/null it still route all data through one channel (query initiator).
A possible alternative to “leave” data in the nodes where it was originated is to use the following q_result table.
The scope of the “ksafe 0” at the end is to avoid buddy projection creation.
In addition, if your UNION queries run faster as separated individual queries, create a temp table as shown below to hold the result-set of the individual queries and then select from the temp table.

create local temporary table <q_result> on commit preserve rows as 
    <YOUR SELECT HERE>
ksafe 0 ;

7) Check if you have enough infrastructure to very your concurrent queries will not be queued.
If you have one million queries per 24h and the load is spread evenly between the hours, you need memory to satisfy 11.6 queries per Sec. plus the ETL concurrent activity.
To confirm the above, and see on average how many queries run per hour and what is the average elapsed time,
Do the following check in vsql:

with -- chart scale = 30
res1 as (select
    time_slice(start_timestamp, 1,'hour') as hour_slice,
    count(distinct STATEMENT_ID) as statements_count,
    avg(request_duration_ms) as avg_request_duration_ms
from
    query_requests
where
    request_type='QUERY' and
    success AND
      ( NOW() - start_timestamp) < '24 hour'
group by 1
order by 1),
norm as (select (max(avg_request_duration_ms)/30)::int as norm_factor from res1)
select hour_slice, statements_count, to_char(avg_request_duration_ms,'999,999,999,999') as avg_request_duration_ms,
       repeat('*',(avg_request_duration_ms / norm_factor +1)::INT) as avg_request_duration_ms_chart
from res1, norm
order by 1;

8) Then check how much memory your query need (per node) use something like the following:

select transaction_id,statement_id, max(memory_kb) 
from dc_resource_acquisitions 
where transaction_id=<xxxx> and 
      statement_id=<xxxx> and 
      node_name=’node1’ 
group by transaction_id,statement_id;

Request types: 
Reserve         = Query gets initially based on Planned Concurrency
AcquireAdditional   = Sums up previous value with additional memory requested (values are continuously increasing)
Acquire         = memory allotted for optimizer planning

mferrall · July 2022

Hi @mosheg, this is very helpful! I think I understand all of these, except for point 6. Are you saying that for some query, I can:

Select data into a temp table
That temp table will be local on every node
Instead of routing through the initiator, the result set will come from multiple nodes?

Do you know of any documentation about this approach, either from Vertica or other uses? Or am I misunderstanding what you mean?

mosheg · July 2022

Yes.
Another temp table performance advantage is that it requires less locks.
The approach in (6) is not mentioned in the documentation, I've learned it from Maurizio.

mferrall · July 2022

Ok, we have a training with him on Tuesday, I'll ask for more details then, thank you!

We're Moving!

Create My New Community Account Now

Optimizing Vertica Schemas for a Low Latency Application

Comments

Leave a Comment