C++ SDK Analytic Function inputReader.next() returns more actual rows than getNumRows()
I successfully created an analytic UDF using the C++ SDK but ran into a case that I can't understand.
For some partitions, inputReader.next() returns more rows than I would expect from inputReader.getNumRows().
Which one gives the definitive count? From testing, it seems I need to write one output value for each row actually encountered via inputReader.next(), not for the number of rows reported by inputReader.getNumRows().
#include "Vertica.h"
using namespace Vertica;

class WeightedMedian : public AnalyticFunction {
public:
    virtual void processPartition(ServerInterface &srvInterface,
                                  AnalyticPartitionReader &inputReader,
                                  AnalyticPartitionWriter &outputWriter) {
        try {
            // Get what inputReader thinks is the row count
            const vint reportedRows = inputReader.getNumRows();
            // Count actual rows ourselves as we read
            vint rowsRead = 0;
            // Placeholder result; the actual median computation is omitted here
            vfloat median = 0.0;
            // Read input and count rows
            do {
                rowsRead++;
                // do other stuff
            } while (inputReader.next());
            // Observed for one partition:
            //   reportedRows is 26421
            //   rowsRead is 38345
            // Output one value per row actually read (not per getNumRows(), which seems buggy)
            for (vint i = 0; i < rowsRead; i++) {
                outputWriter.setFloat(0, median);
                outputWriter.next();
            }
        } catch (std::exception &e) {
            vt_report_error(0, "Exception while processing partition: [%s]", e.what());
        }
    }
};
Answers
Never mind, I see it now: the key word is "current". getNumRows() returns the number of rows in the current block being processed, not the total number of rows in the entire partition. The two can differ because an analytic function may be called multiple times for the same partition with different blocks of data. So the rows actually delivered by inputReader.next() are the count to rely on when writing output.
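To make the block-versus-partition distinction concrete, here is a minimal, self-contained sketch. It does not use the Vertica SDK: MockBlockReader, its getFloat() accessor, and readPartition() are hypothetical stand-ins invented for illustration, and the way the mock crosses from one block to the next is an assumption modeled on the behavior described above, not a statement about the SDK's internals.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a block-based reader. NOT the Vertica SDK class.
class MockBlockReader {
public:
    // Assumes at least one block and that every block is non-empty.
    explicit MockBlockReader(std::vector<std::vector<double>> blocks)
        : blocks_(std::move(blocks)) {}

    // Rows in the block the cursor is currently in, not in the whole partition.
    std::size_t getNumRows() const { return blocks_[block_].size(); }

    // Value at the current cursor position.
    double getFloat() const { return blocks_[block_][row_]; }

    // Advance one row, moving into the next block when the current one ends.
    // Returns false once every block has been consumed.
    bool next() {
        if (row_ + 1 < blocks_[block_].size()) {
            ++row_;
            return true;
        }
        if (block_ + 1 < blocks_.size()) {
            ++block_;
            row_ = 0;
            return true;
        }
        return false;
    }

private:
    std::vector<std::vector<double>> blocks_;
    std::size_t block_ = 0;
    std::size_t row_ = 0;
};

// Buffer every row that next() delivers, so the row count comes from the
// data actually read rather than from getNumRows().
std::vector<double> readPartition(MockBlockReader &reader) {
    std::vector<double> values;
    do {
        values.push_back(reader.getFloat());
    } while (reader.next());
    return values;
}
```

With two blocks of 3 and 2 rows, getNumRows() reports 3 before reading, while readPartition() returns 5 values. Sizing the output loop from the buffered values (or a running counter, as in the question) avoids depending on the per-block count.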