Select by date range and projections for best performance-
Hi,
I'm doing typical join of two long tables by composite primary keys and filtering a range by non-key date field.
CREATE TABLE operations (pk1 int, pk2 int, date date, somedata int, PRIMARY KEY(pk1, pk2))
CREATE TABLE additional (pk1 int, pk2 int, somedata int, PRIMARY KEY(pk1, pk2))
SELECT
op.pk1
, op.pk2
, op.date
, op.somedata
, ad.somedata
FROM operations AS op
JOIN additional AS ad
ON op.pk1 = ad.pk1
AND op.pk2 = ad.pk2
WHERE op.date BETWEEN '...' AND '...'
There are two points to optimize:
- join by primary key and
- filtering by date.
What projections I should create to achieve the best performance in this case?
0
Answers
I'd first run the query through database designer.
See:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/PhysicalSchema/DesigningAPhysicalSchema.htm
If the design produced by DBD isn't optimal, at-least you'll have write column encodings.
These blog posts can help you manually re-design the projections:
https://www.vertica.com/kb/Redesigning-Projections-for-Query-Optimization/Content/BestPractices/Redesigning-Projections-for-Query-Optimization.htm
https://www.vertica.com/blog/tech-support-series-optimizing-projectionsba-p223411/