Count(1) over() recordCount slows down performance. Is there any alternative for this?
I am using count(1) over() recordCount to get the total number of records returned by the query and this appends along with the result. My query is something like this
select count(1) over() recordCount ,a.column1 ,b.column2 from table_1 a left join table_2 b on a.column_1 = b.column_1
With this clause place in a query the execution time is ~31 secs
But when i remove this clause, the execution time is ~5 secs.
Is there any alternative way to get the count of number of records returned by a query.
Something like the ROWCOUNT function in Oracle.
Thanks,
pujesh
1
Answers
Analytic functions with an empty PARTITION BY clause will cause all records to be transferred to the initiator node. (Vertica documentation warns about this somewhere.)
You can try doing an EXPLAIN plan with the empty OVER() clause compared to one with a constant partition expression, such as OVER(PARTITION BY 1). You will notice in the first case, in the ANALYTIC path there will be a step "Group network transfer to initiator", and it will be absent in the second case. The cost should also appear lower in the second case. I tested this with a join similar to yours, with wildly varying execution times. (Actually it was a large INNER join which Vertica performed as a MERGEJOIN.) I'm guessing the unpredictable execution times -- even when running the same query -- is due to Vertica reusing previously cached results.