ROLLBACK 2291: Call to ColumnTypes.addAny() is not allowed in Aggregate functions

Sergey_Cherepan_1 · March 2022

Hi,
Apparently, I cannot use Any as a column type for an aggregate UDx:

Question: how can I implement polymorphism (i.e., have my aggregate function accept arguments of different types)?

Polymorphism is definitely possible for aggregate functions. For example, the built-in MIN aggregate function works for any data type: int, char, varchar, date, etc.

Could you look into how a single aggregate UDx function can handle different argument data types?

Thank you
Sergey

P.S.
Aggregate UDx does not support fenced mode; trying to use it produces a syntax error. That deserves a mention in the docs.

Sergey_Cherepan_1 · March 2022

BTW, addAny works fine for the output argument of an aggregate UDx.

SergeB · March 2022

@Sergey_Cherepan_1 Thanks for pointing out this restriction. I contacted engineering, and they'll check whether that restriction (introduced many, many years ago) still applies.

Sergey_Cherepan_1 · March 2022

After a second check, I found a mention in the docs that UDx aggregates do not support fenced mode. My bad.

Sergey_Cherepan_1 · March 2022

I did some logging and investigation on an aggregate UDx I wrote.
It appears that Vertica always calls aggregate() with 2048 rows. The method described in the docs does not change this number.
Each invocation of aggregate() produces an intermediate result that has to be saved and passed on to the next step. That creates enormous overhead, as there are far too many intermediate results.
My impression is that, in the current implementation, aggregate UDx is not usable because of this very high overhead.
Please pass this on to the developers.
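
To make the pattern described above concrete, here is a minimal Python sketch. It is not Vertica SDK code, only a simulation under stated assumptions: the names `batches`, `aggregate_batch`, `combine` and `run` are hypothetical, the aggregate is a plain sum, and the batch size of 2048 is the value seen in my logs rather than a documented constant.

```python
from typing import Iterable, Iterator, List, Tuple

# Batch size observed in the logs above; an assumption, not a documented constant.
BATCH_ROWS = 2048


def batches(rows: List[int], size: int = BATCH_ROWS) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def aggregate_batch(batch: List[int]) -> int:
    """Hypothetical stand-in for one aggregate() call: one partial sum per batch."""
    return sum(batch)


def combine(partials: Iterable[int]) -> int:
    """Hypothetical stand-in for the step that merges intermediate results."""
    return sum(partials)


def run(rows: List[int], size: int = BATCH_ROWS) -> Tuple[int, int]:
    """Return (final result, number of intermediate results produced)."""
    partials = [aggregate_batch(b) for b in batches(rows, size)]
    return combine(partials), len(partials)
```

For example, `run(list(range(10_000)))` returns `(49995000, 5)`: 10,000 rows are split into five batches, so five intermediate results have to be stored and merged before the final sum is available.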

Bryan_H · March 2022

Another way to increase the number of rows passed to the UDx is to raise the configuration parameter MaxDesiredEEBlockSize, which defaults to 8MB. It can be set at the session level when running the UDx. Higher values should fetch more rows, up to the block size, into each aggregate call.

Sergey_Cherepan_1 · March 2022

Thanks for the recommendation, but that does not work.
The initAggregate/aggregate sequence is still called for exactly 2048 rows at a time, regardless of column size, whether bool or a long varchar(256) string.
That creates enormous overhead: according to my logs, I am forced to save intermediate aggregates every 2048 rows. For a billion rows in the data source, the intermediate aggregates come to about 500,000 rows. That is far too much overhead for an aggregate UDx to be useful.
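
For reference, the count of intermediate results can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: `intermediate_count` is a hypothetical helper, and it assumes exactly one intermediate result per aggregate() call, which is what my logs suggest.

```python
def intermediate_count(total_rows: int, rows_per_call: int) -> int:
    """Number of aggregate() calls, assuming one intermediate result per call."""
    if rows_per_call <= 0:
        raise ValueError("rows_per_call must be positive")
    # Ceiling division using integers only, so large row counts stay exact.
    return -(-total_rows // rows_per_call)


TOTAL_ROWS = 1_000_000_000  # "a billion rows" from the post
ROWS_PER_CALL = 2048        # batch size seen in the logs (assumption)

print(intermediate_count(TOTAL_ROWS, ROWS_PER_CALL))  # 488282
```

That is the "about 500,000" figure quoted above. With the same helper, a batch of 1,000,000 rows per call would leave only 1,000 intermediate results, which is why the number of rows per call matters so much here.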
