[UDF] Custom intermediate values in User-Defined Aggregate Functions?
Hello,

Currently, the Vertica::IntermediateAggs class seems to be the only way to pass data between multiple instances of an aggregate function. This works fine when the data is reduced to single scalar values during aggregation. However, consider an aggregate function with complex state: for example, a matrix of arbitrary dimensions, or a non-trivial C++ data structure such as std::map. It looks like it is impossible to pass such state in an IntermediateAggs instance. Is there a way to write UDAFs with complex state / intermediate values that I am missing?

To provide some context on what I'm trying to do: I'm attempting to port a Linear Probabilistic Counter (concisely explained in this gist) to Vertica in the form of an aggregate function. This function's intermediate value is a bitset, which I can't store in an IntermediateAggs instance. Computing the element count from the bitset at the end of aggregate() and storing that count in IntermediateAggs is not an option: summing the per-run estimates counts every value seen by more than one sub-aggregation run again, which severely skews the result and, in the worst case, multiplies it by the number of sub-aggregation runs. The count must therefore be computed only once, in terminate().
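To make the requirement concrete, here is a minimal standalone sketch of the counter, written without any Vertica API. The class name LinearCounter, its methods, and the use of std::hash are assumptions made for illustration, not the code from the gist. It shows why the whole bitmap has to travel between aggregation stages: merging is a bitwise OR of the bitmaps, and the estimate is taken once from the merged bitmap.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Hypothetical linear probabilistic counter (illustration only).
class LinearCounter {
public:
    explicit LinearCounter(std::size_t numBits)
        : numBits_(numBits), bits_((numBits + 7) / 8, 0) {
        assert(numBits > 0);
    }

    // What aggregate() would do per row: hash the value and set one bit.
    void add(const std::string& value) {
        std::size_t pos = std::hash<std::string>{}(value) % numBits_;
        bits_[pos / 8] |= static_cast<std::uint8_t>(1u << (pos % 8));
    }

    // What combine() would do: OR the other partial bitmap into this one.
    void merge(const LinearCounter& other) {
        assert(numBits_ == other.numBits_);
        for (std::size_t i = 0; i < bits_.size(); ++i) {
            bits_[i] |= other.bits_[i];
        }
    }

    // What terminate() would do, once: n ~= -m * ln(zeroBits / m).
    double estimate() const {
        std::size_t zeros = 0;
        for (std::size_t pos = 0; pos < numBits_; ++pos) {
            if (((bits_[pos / 8] >> (pos % 8)) & 1u) == 0) {
                ++zeros;
            }
        }
        if (zeros == 0) {
            // Bitmap is saturated; the estimator is undefined here.
            return std::numeric_limits<double>::infinity();
        }
        double m = static_cast<double>(numBits_);
        return -m * std::log(static_cast<double>(zeros) / m);
    }

    // Raw bytes of the state: this is what would need to be carried
    // as the intermediate value between aggregation stages.
    const std::vector<std::uint8_t>& bytes() const { return bits_; }

private:
    std::size_t numBits_;
    std::vector<std::uint8_t> bits_;
};
```

If two sub-aggregation runs both see the same 100 values, merging their bitmaps yields exactly the bitmap of a single run over those values, while adding their two estimates gives twice the correct figure. That is the skew described above.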