Data type conversion inside VErtica C++ ScalarFunction Extension. casting to string

hi im trying to write sclar function which basicly concatenates ll given columns and return concetanated string.
its working  when i give all values as string but i dont know how to convert integer float double or date  vaues to string and concat to the string.
following is the point i need to append integer to a string. but Im not sure how to correctly cast this "vint" variable to a string.

std::string myConcat;
// needd to cast the arg_reader.getIntRef(i) inside append but how ??
myConcat.append(arg_reader.getIntRef(i));

basicly how can i convert any c++ vertica variable to string (integer, double, timestamp, date)

second question is
can i call MD5 sql function inside UDF ?  i cant see md5 class inthe c++ sdk

full code is

/* Copyright (c)2015 memiiso
 *
 * UDF function concatenates all given variables and returns concatenated string
 *
 */
#include "Vertica.h"
using namespace Vertica;
using namespace std;

class ConcatAll : public Vertica::ScalarFunction
{
public:
    virtual void processBlock(Vertica::ServerInterface &srvInterface,
                              Vertica::BlockReader &arg_reader,
                              Vertica::BlockWriter &res_writer)
    {
        // Always catch exceptions to prevent causing the side process or
        // Vertica itself from crashing.
        try 
        {
            // Find the number of arguments sent.
            size_t numCols = arg_reader.getNumCols();
            // Make sure at least 1 arguments were supplied
            if ((int)numCols < 0)
                vt_report_error(0, "Function expects at least 1 parameter");
               
            SizedColumnTypes arg_types = arg_reader.getTypeMetaData();
            do
            { // covert to string and concetaneta all into myConcat.
                std::string myConcat;
                // Loop over all params
                for (int i=0; i<(int)numCols; i++) {
                    const VerticaType &vt = arg_types.getColumnType(i);
                    if(arg_reader.isNull(i)){                   
                        myConcat.append("");
                    } else if (vt.isStringType() || vt.isLongVarchar() || vt.isChar())  {                   
                        myConcat.append(arg_reader.getStringRef(i).str());                   
                        // myConcat.append(arg_reader.getStringPtr(i).str());
                    } else if (vt.isInt()){
                        myConcat.append(arg_reader.getIntRef(i).str());
                    }
                    myConcat.append("^^");
                }
                res_writer.getStringRef().copy(myConcat);
                res_writer.next();
            } while (arg_reader.next());
           
           
        }  catch(exception& e) {
           // Standard exception. Quit.
           vt_report_error(0, "Exception while processing partition: [%s]",
            e.what());
        }
    }
};

// Defines the AddMany function.
class ConcatAllFactory : public Vertica::ScalarFunctionFactory
{
    // Return the function object to process the data.
    virtual Vertica::ScalarFunction *createScalarFunction(
        Vertica::ServerInterface &srvInterface)
    { return vt_createFuncObj(srvInterface.allocator, ConcatAll); }

    // Define the number and types of arguments that this function accepts
    virtual void getPrototype(Vertica::ServerInterface &srvInterface,
                              Vertica::ColumnTypes &argTypes,
                              Vertica::ColumnTypes &returnType)
    {  
        argTypes.addAny(); // Must be only argument type.
        returnType.addVarchar();
    }
   
     virtual void getReturnType(ServerInterface &srvfloaterface,
                               const SizedColumnTypes &inputTypes,
                               SizedColumnTypes &outputTypes)
    {
     outputTypes.addVarchar(65000);
    }
};

RegisterFactory(ConcatAllFactory);


Thank you






















Comments

  • Hi!

    I will answer for first question. Improve a second one (I hate to read unformatted code, pastebin.com and many others services exits, why do not post code there?)

    For my opinion a best way will be to CAST columns values to string inside Vertica. Vertica UDF is prototyped, so to concatenate different types you will require to define a different prototypes.

    Example for prototype:
        virtual void getPrototype(ServerInterface &interface,
    ColumnTypes &argTypes,
    ColumnTypes &returnType)
    {
    argTypes.addInt();
    argTypes.addInt();
    returnType.addVarchar();
    }

    This prototype will accept 2 integers only and Vertica will not accept DATE or any other data type.

    ****************************************

    Say you wrote an UDF that concatenates  2 values to one value:
    concat(varchar, varchar) => return varchar
    So to concatenate 2 columns, no matter what type of columns you can write:
    select conact(col1::varchar, col2::varchar) from table.
    And no more worries about columns types, no worries to write different prototypes for each data type, its too much.

    (int, int)
    (int, date)
    (int, varchar)
    ...
    (date, varchar)
    ...
    (time, int)
    ...
    and all permutations

    ****************************************

    Regards
  • BTW: Vertica provides GROUP_CONCAT and aggregate CONACT UDF function. You can use it as base.

    GROUP_CONCAT
    /opt/vertica/sdk/examples/AggregateFunctions/Concatenate.cpp
  • Hi,

    Thank you for the answer.

    i just wanted to eliminate code repetition by putting logic into funtion without worrying about column type. but now i will go with your suggestion thats best option so far.
    it still would be good to know how to do(<code>col2::varchar</code>) this correctly by c++ or vertica library, to be able to write advanced ScalarFunstions

    sorry didn't know how to format the code
    adding it below
    /* Copyright (c)2015 memiiso   *   * UDF function concatenates all given variables and returns concatenated string   *   */  #include "Vertica.h"  using namespace Vertica;  using namespace std;     class ConcatAll : public Vertica::ScalarFunction  {  public:      virtual void processBlock(Vertica::ServerInterface &srvInterface,                                Vertica::BlockReader &arg_reader,                                Vertica::BlockWriter &res_writer)      {          // Always catch exceptions to prevent causing the side process or          // Vertica itself from crashing.          try          {              // Find the number of arguments sent.              size_t numCols = arg_reader.getNumCols();              // Make sure at least 1 arguments were supplied              if ((int)numCols < 0)                  vt_report_error(0, "Function expects at least 1 parameter");                               SizedColumnTypes arg_types = arg_reader.getTypeMetaData();              do              { // covert to string and concetaneta all into myConcat.                  std::string myConcat;                  // Loop over all params                  for (int i=0; i<(int)numCols; i++) {                      const VerticaType &vt = arg_types.getColumnType(i);                      if(arg_reader.isNull(i)){                                            myConcat.append("");                      } else if (vt.isStringType() || vt.isLongVarchar() || vt.isChar())  {                                            myConcat.append(arg_reader.getStringRef(i).str());                                            // myConcat.append(arg_reader.getStringPtr(i).str());                      } else if (vt.isInt()){                          myConcat.append(arg_reader.getIntRef(i).str());                      }                      myConcat.append("^^");                  }                  res_writer.getStringRef().copy(myConcat);                  res_writer.next();              } while (arg_reader.next());                                    }  catch(exception& e) {             // Standard exception. Quit.             vt_report_error(0, "Exception while processing partition: [%s]",              e.what());          }      }  };     // Defines the AddMany function.  class ConcatAllFactory : public Vertica::ScalarFunctionFactory  {      // Return the function object to process the data.      virtual Vertica::ScalarFunction *createScalarFunction(          Vertica::ServerInterface &srvInterface)      { return vt_createFuncObj(srvInterface.allocator, ConcatAll); }         // Define the number and types of arguments that this function accepts      virtual void getPrototype(Vertica::ServerInterface &srvInterface,                                Vertica::ColumnTypes &argTypes,                                Vertica::ColumnTypes &returnType)      {            argTypes.addAny(); // Must be only argument type.          returnType.addVarchar();      }            virtual void getReturnType(ServerInterface &srvfloaterface,                                 const SizedColumnTypes &inputTypes,                                 SizedColumnTypes &outputTypes)      {       outputTypes.addVarchar(65000);      }  };     RegisterFactory(ConcatAllFactory);  
  • Hi!
    it still would be good to know how to do
    Ok, I will provide an example.
    sorry didn't know how to format the code
    Hm... now Im sorry, didn't get you due my  English. One more time - Im sorry.

    ******************************
    can i call MD5 sql function inside UDF ?
    No, you cant call SQL function, there are no API for this, but you definitely can call any(or almost any) C/C++ library and all UNIX'es/Linux'es has bulit-in MD5 library.
    May be I will provide an example, but you can take a look on:
    • /opt/vertica/sdk/examples/ScalarFunctions/TextConvertUni.cpp
    • /opt/vertica/sdk/examples/SourceFunctions
    a first source uses in iconv a second uses in curl.

    I will update a branch.

    Regards.

  • Hi!

    Here is an example how to cast INT to STRING - http://pastebin.com/N8Zzh6W0

    File:  Cast.cpp

    Compile:
    g++ -D HAVE_LONG_INT_64 -I /opt/vertica/sdk/include -Wall -shared -Wno-unused-value -fPIC -o Cast.so Cast.cpp /opt/vertica/sdk/include/Vertica.cpp
    Deploy (file path - /tmp):
    CREATE LIBRARY CastLib AS '/tmp/Cast.so';
    CREATE FUNCTION int2str AS LANGUAGE 'C++' NAME 'CastFactory' LIBRARY CastLib;
    Examples:
    daniel=> select int2str(123);
    int2str
    --------------------
    123 -*- append -*-
    (1 row)

    daniel=> select int2str(999);
    int2str
    --------------------
    999 -*- append -*-
    (1 row)
    PS
    Example with MD5 I will provide later, little busy now.

    Regards.
  • Hi Ismail!

    Check  please source code - https://github.com/sKwa/vertica/blob/master/UDFCPP/ConcatRow.cpp
    There are a lot of assumptions(read bugs) like line do not exceeds a max length, UDF still do not accepts all data types, I will improve latter.

    Demo:
    daniel=> CREATE LIBRARY ConcatRowLib AS '/tmp/ConcatRow.so';
    CREATE LIBRARY


    daniel=> CREATE FUNCTION concat_row AS LANGUAGE 'C++' NAME 'ConcatRowFactory' LIBRARY ConcatRowLib;
    CREATE FUNCTION


    daniel=> SELECT concat_row('f'::CHAR, 0, 0, 'bar'::VARCHAR(3), 't'::BOOLEAN USING PARAMETERS sep='-') FROM DUAL;
    concat_row
    ----------------
    f-0-0-bar-TRUE
    (1 row)


    daniel=> SELECT concat_row('f'::CHAR, 0, 0, 'bar'::VARCHAR(3), 't'::BOOLEAN USING PARAMETERS sep='^') FROM DUAL;
    concat_row
    ----------------
    f^0^0^bar^TRUE
    (1 row)


    daniel=> SELECT concat_row('f'::CHAR, 0, 0, 'bar'::VARCHAR(3), 'f'::BOOLEAN USING PARAMETERS sep='^^') FROM DUAL;
    concat_row
    --------------------
    f^^0^^0^^bar^^FALSE
    (1 row)


    max length of separator = 16
    daniel=> SELECT concat_row('f', 0, 0, 'bar' USING PARAMETERS sep='') FROM DUAL;
    concat_row
    ------------
    f00bar
    (1 row)

    daniel=> SELECT concat_row('f', 0, 0, 'bar' USING PARAMETERS sep='|') FROM DUAL;
    concat_row
    ------------
    f|0|0|bar
    (1 row)

    Regards.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file