ANALYZE_STATISTICS_PARTITION
I declre this partition on table:
ALTER TABLE public.T2 PARTITION BY date(ts at time zone 'UTC')::date GROUP BY ( CASE WHEN DATEDIFF('MONTH', date(ts at time zone 'UTC')::date, NOW()::TIMESTAMPTZ(6)) >= 12 THEN DATE_TRUNC('MONTH', date(ts at time zone 'UTC')::date) ELSE date(ts at time zone 'UTC')::date END);
During the day Vertica inserts to the table, data with time ts.
At the end of each business day I will call ANALYZE_STATISTICS_PARTITION and collect statistics on all data
what is the best query to run:SELECT ANALYZE_STATISTICS_PARTITION('public.T2', CURRENT_DATE::VARCHAR(10), CURRENT_DATE::VARCHAR(10));
or SELECT ANALYZE_STATISTICS_PARTITION('public.T2', minvalue, maxvalue);
when max and min are :
select projection_name, min(partition_key), max(partition_key) from partitions group by projection_name
Best Answers
-
Optimizer is not going to do partition pruning, if partition is on expression.
There is a performance benefits on adding date column and partitioning on column directly.
That, of course, come at price of increased license usage for data size.1 -
Vertica_Curtis Employee
I'm not sure Vertica cares which command you run. It's going to identify applicable partitions, and run stats on those partitions. So, I suspect they are functionally equivalent.
I'm not sure what you're argument is there, Sergey - the optimizer will do partition pruning if you're doing a filter on the partitioned expression. The optimizer even does partition pruning even if it's not. If I partition on "state", and do a query lookup by zip-code, the optimizer can determine that certain zip codes don't exist in certain states, and can prune those states from its target set. It uses statistics min/max values at the ROS level to know what to prune.
1