UDx C++ API creates a new ArrayReader instance for each processed row

Hi,
The Vertica UDx C++ API has a quite efficient implementation. For example, the API generally avoids creating new objects for each row.

I am a little surprised by the implementation of Array::ArrayReader and Array::ArrayWriter: a new instance is created for each processed row.

For example, see /opt/vertica/sdk/examples/ScalarFunctions/ArraySlice.cpp:

class ArraySlice : public ScalarFunction {
    void processBlock(ServerInterface &srvInterface,
            BlockReader &argReader,
            BlockWriter &resWriter) override
    {
        do {
            if (argReader.isNull(0) || argReader.isNull(1) || argReader.isNull(2)) {
                resWriter.setNull();
            } else {
                Array::ArrayReader argArray  = argReader.getArrayRef(0);
                const vint slicebegin = argReader.getIntRef(1);
                const vint sliceend   = argReader.getIntRef(2);

                Array::ArrayWriter outArray = resWriter.getArrayRef(0);
                if (slicebegin < sliceend) {
                    for (int i = 0; i < slicebegin && argArray->hasData(); i++) {
                        argArray->next();
                    }
                    for (int i = slicebegin; i < sliceend && argArray->hasData(); i++) {
                        outArray->copyFromInput(*argArray);
                        outArray->next();
                        argArray->next();
                    }
                }
                outArray.commit();  /* finalize the written array elements */
            }
            resWriter.next();
        } while (argReader.next());
    }
};

argReader.getArrayRef returns a new instance of the class:

Array::ArrayReader BlockReader::getArrayRef(size_t idx) const {
    BlockReader &elementsBlock = *static_cast<BlockReader *>(ctWrappers.at(idx).get());
    return Array::ArrayReader(elementsBlock, *this, idx);
}

That is definitely a suboptimal implementation.

Can you ask the developers whether it is possible to instantiate ArrayReader and ArrayWriter once per call to processBlock and re-use the same instance for every row? The goal would be to avoid creating new ArrayReader/ArrayWriter instances for each row.

The method name getArrayRef is also confusing, as it does not return a reference.

Maybe Vertica could add a getArray method that allows the returned object to be re-initialised for each row, re-using the class instance and avoiding the creation of new objects per row. getArrayRef could then be deprecated.
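
To illustrate the pattern I have in mind, here is a minimal standalone sketch. It is not SDK code: ElementCursor, its reset() method and sumBlockReusingCursor are hypothetical names that I made up for this example, and a plain std::vector stands in for a row's array. The only point is that a single reader object is created per block and re-pointed at each row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an array reader; NOT part of the Vertica SDK.
// It walks the elements of one row and counts how often it is constructed.
class ElementCursor {
public:
    static int constructed;  // number of ElementCursor objects created so far

    ElementCursor() { ++constructed; }

    // Re-point the existing cursor at another row instead of building a new one.
    void reset(const std::vector<long> &row) {
        row_ = &row;
        pos_ = 0;
    }

    bool hasData() const { return row_ != nullptr && pos_ < row_->size(); }

    long value() const {
        assert(hasData());  // reading past the end would be a caller bug
        return (*row_)[pos_];
    }

    void next() { ++pos_; }

private:
    const std::vector<long> *row_ = nullptr;  // row currently being read
    std::size_t pos_ = 0;                     // position inside that row
};

int ElementCursor::constructed = 0;

// Sums all elements of all rows using one cursor for the whole block.
long sumBlockReusingCursor(const std::vector<std::vector<long>> &block) {
    ElementCursor cursor;  // created once per block
    long total = 0;
    for (const auto &row : block) {
        cursor.reset(row);  // re-initialised per row, no new object
        for (; cursor.hasData(); cursor.next()) {
            total += cursor.value();
        }
    }
    return total;
}
```

With this shape, the number of constructed cursors depends on the number of blocks, not on the number of rows. Whether the real ArrayReader/ArrayWriter could be reset this way depends on their internal state, which I have not checked.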

Thank you
Sergey
