Data type conversion inside VErtica C++ ScalarFunction Extension. casting to string
hi im trying to write sclar function which basicly concatenates ll given columns and return concetanated string.
its working when i give all values as string but i dont know how to convert integer float double or date vaues to string and concat to the string.
following is the point i need to append integer to a string. but Im not sure how to correctly cast this "vint" variable to a string.
std::string myConcat;
// needd to cast the arg_reader.getIntRef(i) inside append but how ??
myConcat.append(arg_reader.getIntRef(i));
basicly how can i convert any c++ vertica variable to string (integer, double, timestamp, date)
second question is
can i call MD5 sql function inside UDF ? i cant see md5 class inthe c++ sdk
full code is
/* Copyright (c)2015 memiiso
*
* UDF function concatenates all given variables and returns concatenated string
*
*/
#include "Vertica.h"
using namespace Vertica;
using namespace std;
class ConcatAll : public Vertica::ScalarFunction
{
public:
virtual void processBlock(Vertica::ServerInterface &srvInterface,
Vertica::BlockReader &arg_reader,
Vertica::BlockWriter &res_writer)
{
// Always catch exceptions to prevent causing the side process or
// Vertica itself from crashing.
try
{
// Find the number of arguments sent.
size_t numCols = arg_reader.getNumCols();
// Make sure at least 1 arguments were supplied
if ((int)numCols < 0)
vt_report_error(0, "Function expects at least 1 parameter");
SizedColumnTypes arg_types = arg_reader.getTypeMetaData();
do
{ // covert to string and concetaneta all into myConcat.
std::string myConcat;
// Loop over all params
for (int i=0; i<(int)numCols; i++) {
const VerticaType &vt = arg_types.getColumnType(i);
if(arg_reader.isNull(i)){
myConcat.append("");
} else if (vt.isStringType() || vt.isLongVarchar() || vt.isChar()) {
myConcat.append(arg_reader.getStringRef(i).str());
// myConcat.append(arg_reader.getStringPtr(i).str());
} else if (vt.isInt()){
myConcat.append(arg_reader.getIntRef(i).str());
}
myConcat.append("^^");
}
res_writer.getStringRef().copy(myConcat);
res_writer.next();
} while (arg_reader.next());
} catch(exception& e) {
// Standard exception. Quit.
vt_report_error(0, "Exception while processing partition: [%s]",
e.what());
}
}
};
// Defines the AddMany function.
class ConcatAllFactory : public Vertica::ScalarFunctionFactory
{
// Return the function object to process the data.
virtual Vertica::ScalarFunction *createScalarFunction(
Vertica::ServerInterface &srvInterface)
{ return vt_createFuncObj(srvInterface.allocator, ConcatAll); }
// Define the number and types of arguments that this function accepts
virtual void getPrototype(Vertica::ServerInterface &srvInterface,
Vertica::ColumnTypes &argTypes,
Vertica::ColumnTypes &returnType)
{
argTypes.addAny(); // Must be only argument type.
returnType.addVarchar();
}
virtual void getReturnType(ServerInterface &srvfloaterface,
const SizedColumnTypes &inputTypes,
SizedColumnTypes &outputTypes)
{
outputTypes.addVarchar(65000);
}
};
RegisterFactory(ConcatAllFactory);
Thank you
its working when i give all values as string but i dont know how to convert integer float double or date vaues to string and concat to the string.
following is the point i need to append integer to a string. but Im not sure how to correctly cast this "vint" variable to a string.
std::string myConcat;
// needd to cast the arg_reader.getIntRef(i) inside append but how ??
myConcat.append(arg_reader.getIntRef(i));
basicly how can i convert any c++ vertica variable to string (integer, double, timestamp, date)
second question is
can i call MD5 sql function inside UDF ? i cant see md5 class inthe c++ sdk
full code is
/* Copyright (c)2015 memiiso
*
* UDF function concatenates all given variables and returns concatenated string
*
*/
#include "Vertica.h"
using namespace Vertica;
using namespace std;
class ConcatAll : public Vertica::ScalarFunction
{
public:
virtual void processBlock(Vertica::ServerInterface &srvInterface,
Vertica::BlockReader &arg_reader,
Vertica::BlockWriter &res_writer)
{
// Always catch exceptions to prevent causing the side process or
// Vertica itself from crashing.
try
{
// Find the number of arguments sent.
size_t numCols = arg_reader.getNumCols();
// Make sure at least 1 arguments were supplied
if ((int)numCols < 0)
vt_report_error(0, "Function expects at least 1 parameter");
SizedColumnTypes arg_types = arg_reader.getTypeMetaData();
do
{ // covert to string and concetaneta all into myConcat.
std::string myConcat;
// Loop over all params
for (int i=0; i<(int)numCols; i++) {
const VerticaType &vt = arg_types.getColumnType(i);
if(arg_reader.isNull(i)){
myConcat.append("");
} else if (vt.isStringType() || vt.isLongVarchar() || vt.isChar()) {
myConcat.append(arg_reader.getStringRef(i).str());
// myConcat.append(arg_reader.getStringPtr(i).str());
} else if (vt.isInt()){
myConcat.append(arg_reader.getIntRef(i).str());
}
myConcat.append("^^");
}
res_writer.getStringRef().copy(myConcat);
res_writer.next();
} while (arg_reader.next());
} catch(exception& e) {
// Standard exception. Quit.
vt_report_error(0, "Exception while processing partition: [%s]",
e.what());
}
}
};
// Defines the AddMany function.
class ConcatAllFactory : public Vertica::ScalarFunctionFactory
{
// Return the function object to process the data.
virtual Vertica::ScalarFunction *createScalarFunction(
Vertica::ServerInterface &srvInterface)
{ return vt_createFuncObj(srvInterface.allocator, ConcatAll); }
// Define the number and types of arguments that this function accepts
virtual void getPrototype(Vertica::ServerInterface &srvInterface,
Vertica::ColumnTypes &argTypes,
Vertica::ColumnTypes &returnType)
{
argTypes.addAny(); // Must be only argument type.
returnType.addVarchar();
}
virtual void getReturnType(ServerInterface &srvfloaterface,
const SizedColumnTypes &inputTypes,
SizedColumnTypes &outputTypes)
{
outputTypes.addVarchar(65000);
}
};
RegisterFactory(ConcatAllFactory);
Thank you
0
Comments
I will answer for first question. Improve a second one (I hate to read unformatted code, pastebin.com and many others services exits, why do not post code there?)
For my opinion a best way will be to CAST columns values to string inside Vertica. Vertica UDF is prototyped, so to concatenate different types you will require to define a different prototypes.
Example for prototype: This prototype will accept 2 integers only and Vertica will not accept DATE or any other data type.
****************************************
Say you wrote an UDF that concatenates 2 values to one value: So to concatenate 2 columns, no matter what type of columns you can write: And no more worries about columns types, no worries to write different prototypes for each data type, its too much.
(int, int)
(int, date)
(int, varchar)
...
(date, varchar)
...
(time, int)
...
and all permutations
****************************************
Regards
GROUP_CONCAT
Thank you for the answer.
i just wanted to eliminate code repetition by putting logic into funtion without worrying about column type. but now i will go with your suggestion thats best option so far.
it still would be good to know how to do(<code>col2::varchar</code>) this correctly by c++ or vertica library, to be able to write advanced ScalarFunstions
sorry didn't know how to format the code
adding it below
****************************** No, you cant call SQL function, there are no API for this, but you definitely can call any(or almost any) C/C++ library and all UNIX'es/Linux'es has bulit-in MD5 library.
May be I will provide an example, but you can take a look on:
- /opt/vertica/sdk/examples/ScalarFunctions/TextConvertUni.cpp
- /opt/vertica/sdk/examples/SourceFunctions
a first source uses in iconv a second uses in curl.I will update a branch.
Regards.
Here is an example how to cast INT to STRING - http://pastebin.com/N8Zzh6W0
File: Cast.cpp
Compile: Deploy (file path - /tmp): Examples: PS
Example with MD5 I will provide later, little busy now.
Regards.
Check please source code - https://github.com/sKwa/vertica/blob/master/UDFCPP/ConcatRow.cpp
There are a lot of assumptions(read bugs) like line do not exceeds a max length, UDF still do not accepts all data types, I will improve latter.
Demo: max length of separator = 16 Regards.