Vertica R UDF - variable output
I have a requirement as follows:
1. Input data table of any size
2. Perform K-Means clustering on it
3. Output the data table along with one additional column named Cluster
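To make the goal concrete, here is a minimal sketch of the result I am after, written in plain R outside Vertica. The helper name add_cluster_column, the sample data, and k = 2 are made up for illustration; the only real call is kmeans() from base R's stats package.

```r
# Hypothetical stand-alone version of the clustering step (not the UDF itself).
add_cluster_column <- function(df, k, seed = 42) {
  set.seed(seed)  # make the cluster assignment reproducible
  # Cluster on the numeric columns only; kmeans() cannot handle other types.
  numeric_cols <- df[, sapply(df, is.numeric), drop = FALSE]
  fit <- kmeans(numeric_cols, centers = k)
  df$Cluster <- fit$cluster  # one integer label per input row
  df
}

# Made-up sample input with two well-separated groups.
input <- data.frame(x = c(1, 1.1, 0.9, 10, 10.2, 9.8),
                    y = c(2, 2.1, 1.9, 20, 20.1, 19.9))
result <- add_cluster_column(input, k = 2)
print(result)
```

The output has the same rows and columns as the input plus Cluster, so the number and types of output columns depend on the input table.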
Issue: I am able to create the required data.frame that needs to be returned.
However, I am unable to specify the output types dynamically, since they depend on the size of the input data.
Structure of the R UDF:
sd_kmeans <- function(data, y)
{
  .....
}

sd_kmeans_factory <- function()
{
  list(name = sd_kmeans,
       intype = c("any"),
       outtype = c("any"),
       udxtype = "transform",
       parametertypecallback = sd_kmeans_parameters,
       outtypecallback = sd_kmeans_returnType)
}

sd_kmeans_parameters <- function()
{
  ..... returns parameter types
}

sd_kmeans_returnType <- function(data, y)
{
  ..... returns returntypes for main function
}
In sd_kmeans_returnType, one of the input parameters is data; however, it is not the actual data set.
Can you please help me find out what the format of this "data" argument is inside this function?
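One way I could inspect it, as a rough sketch, is to dump the structure of the argument from inside the callback. describe_arg and the log path below are hypothetical names and not part of the Vertica API. The helper uses only base R's str() and capture.output(), and I am assuming the callback is allowed to write to a local file.

```r
# Hypothetical helper: returns a text description of any R object.
describe_arg <- function(x) {
  capture.output(str(x))
}

# Hypothetical use inside sd_kmeans_returnType (the path is a placeholder):
# writeLines(describe_arg(data), "/tmp/sd_kmeans_returnType.log")

# Stand-alone example with an ordinary data frame:
lines <- describe_arg(data.frame(a = 1:2, b = c("x", "y")))
print(lines)
```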