C++ SDK Analytic Function inputReader.next() returns more actual rows than getNumRows() — Vertica Forum

I successfully created an analytic UDF using the C++ SDK, but ran into a case that I can't explain.
For some partitions, iterating with inputReader.next() yields more rows than inputReader.getNumRows() reports.

Which one gives the definitive count? From testing, it seems I need to write one output value for each row actually reached through inputReader.next(), not for the number of rows reported by inputReader.getNumRows().

https://docs.vertica.com/11.1.x/sdkdocs/CppSDK/class_vertica_1_1_vertica_block.htm#a3093f20e18aeae81eb2afb4be5dbe5de

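#include "Vertica.h"

using namespace Vertica;

class WeightedMedian : public AnalyticFunction {
public:
    virtual void processPartition(ServerInterface &srvInterface,
                                  AnalyticPartitionReader &inputReader,
                                  AnalyticPartitionWriter &outputWriter) {
        try {
            // Get what inputReader thinks is the row count
            const vint reportedRows = inputReader.getNumRows();

            // Count actual rows ourselves as we read
            vint rowsRead = 0;

            // The reader starts on the first row, so count before advancing
            do {
                rowsRead++;
                // do other stuff
            } while (inputReader.next());

            // reportedRows is 26421
            // rowsRead is 38345

            // Placeholder value: the weighted-median computation is omitted from this post
            vfloat median = 0.0;

            // Output one value for each row actually read (not getNumRows(), which seems buggy)
            for (vint i = 0; i < rowsRead; i++) {
                outputWriter.setFloat(0, median);
                outputWriter.next();
            }
        } catch (std::exception &e) {
            // Report the failure to the server instead of letting the exception escape
            vt_report_error(0, "Exception while processing partition: [%s]", e.what());
        }
    }
};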

Answers

  • edited September 4

    Never mind, I see it now. The key word in the documentation is "current": getNumRows() returns the number of rows in the current block being processed, not the total number of rows in the entire partition. The two can differ because an analytic function may be called multiple times for the same partition with different blocks of data.
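
    For anyone who lands here later, the following is a minimal sketch of why the two counts can disagree. It does not use the Vertica SDK at all: MockBlockReader, RowCounts, countRows and demoTwoBlocks are hypothetical names invented for this illustration, and the block sizes (3 and 2 rows) are made up. The only assumption carried over from the thread is that getNumRows() reports the size of the current block while next() keeps advancing until the whole partition is consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a block-based partition reader.
// This is NOT the Vertica SDK class; it only mimics the behaviour
// described in the answer above.
class MockBlockReader {
public:
    explicit MockBlockReader(std::vector<std::vector<double>> blocks)
        : blocks_(std::move(blocks)) {
        // Assumption for this sketch: at least one block, and no empty blocks.
        assert(!blocks_.empty());
        for (const auto &b : blocks_) assert(!b.empty());
    }

    // Number of rows in the CURRENT block only, not the whole partition.
    std::size_t getNumRows() const { return blocks_[block_].size(); }

    // Advance one row, rolling over to the next block when the current
    // one is exhausted. Returns false once the partition is consumed.
    bool next() {
        if (row_ + 1 < blocks_[block_].size()) { ++row_; return true; }
        if (block_ + 1 < blocks_.size()) { ++block_; row_ = 0; return true; }
        return false;
    }

private:
    std::vector<std::vector<double>> blocks_;
    std::size_t block_ = 0;
    std::size_t row_ = 0;
};

struct RowCounts {
    std::size_t reported;  // getNumRows() sampled before reading
    std::size_t read;      // rows actually visited through next()
};

// Same counting pattern as in the question: the reader starts on the
// first row, so count first and advance afterwards.
RowCounts countRows(MockBlockReader &reader) {
    RowCounts counts{reader.getNumRows(), 0};
    do {
        ++counts.read;
    } while (reader.next());
    return counts;
}

// One partition split into two blocks of 3 and 2 rows (made-up sizes).
RowCounts demoTwoBlocks() {
    std::vector<std::vector<double>> blocks = {{1.0, 2.0, 3.0}, {4.0, 5.0}};
    MockBlockReader reader(blocks);
    return countRows(reader);
}
```

    With these made-up blocks, demoTwoBlocks() reports 3 rows up front but reads 5. That is the same shape of mismatch as in the question, and it is why counting the rows actually read is the safer way to decide how many output values to write.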
