Vertica R UDF - variable output

I have a requirement as follows:

  1. Input data table of any size

  2. Perform K-Means clustering on it

  3. Output the data table along with one additional column, Cluster
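
For reference, the three steps above can be sketched in plain R outside Vertica. This is only an illustration: add_cluster_column, k and the toy data frame are hypothetical names, not the actual body of sd_kmeans, and it assumes all input columns are numeric.

```r
# Hypothetical helper illustrating the requirement; not the real sd_kmeans body.
add_cluster_column <- function(data, k)
{
  # stats::kmeans works on numeric columns only (assumed here).
  fit <- kmeans(data, centers = k)
  # fit$cluster holds one integer cluster id per input row.
  data$Cluster <- fit$cluster
  data
}

# Toy input with two numeric columns and four rows.
set.seed(1)
toy <- data.frame(x1 = c(1, 1.1, 5, 5.2), x2 = c(2, 2.1, 8, 8.1))
res <- add_cluster_column(toy, 2)
print(res)
```

The result has the same rows and columns as the input plus one Cluster column, so the number of output columns is always the number of input columns plus one.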

 

Issue: I am able to create the required data.frame that needs to be returned.

          However, I am unable to specify the output types dynamically, depending on the size of the input data (the number of input columns).

 

Structure of R-UDF:

 

sd_kmeans <- function(data,y)
{

  .....

}

 

sd_kmeans_factory <- function()
{
  list(name=sd_kmeans,
       intype=c("any"),
       outtype=c("any"),
       udxtype="transform",
       parametertypecallback=sd_kmeans_parameters,
       outtypecallback=sd_kmeans_returnType)
}

 

sd_kmeans_parameters <- function()

{

  ..... returns parameter types

}

 

sd_kmeans_returnType<- function(data,y)

{

  ..... returns returntypes for main function

}

 

In sd_kmeans_returnType, one of the input parameters is data, which, however, is not the actual data set.

Can you please help me find out what the format of this "data" argument is?
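
To make the question concrete, here is a minimal sketch of what I would expect the callback to look like. It rests on an assumption I have not confirmed: that "data" is a data.frame describing the input columns, with one row per column and the fields datatype, length, scale and name, and that the callback must return a data.frame of the same shape. The mock data frame and the column names x1 and x2 are made up for a local test.

```r
# Sketch only. Assumption: `data` describes the input columns (one row per
# column, fields datatype/length/scale/name); it is not the table's rows.
sd_kmeans_returnType <- function(data, y)
{
  # Pass every input column through unchanged.
  ret <- data.frame(datatype = as.character(data$datatype),
                    length   = data$length,
                    scale    = data$scale,
                    name     = as.character(data$name),
                    stringsAsFactors = FALSE)
  # Append one extra row describing the added Cluster column.
  cluster <- data.frame(datatype = "int", length = NA, scale = NA,
                        name = "Cluster", stringsAsFactors = FALSE)
  rbind(ret, cluster)
}

# Local check with a mocked-up description of two float input columns.
mock <- data.frame(datatype = c("float", "float"),
                   length   = c(NA, NA),
                   scale    = c(NA, NA),
                   name     = c("x1", "x2"),
                   stringsAsFactors = FALSE)
out <- sd_kmeans_returnType(mock, NULL)
print(out)
```

If that assumption holds, the output description would always have one more row than the input description, whatever the width of the input table.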
