Vertica R UDF - variable output

I have a requirement as follows:

  1. Input data table of any size

  2. Perform K-Means clustering on it

  3. Output the data table along with one additional column named Cluster
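Steps 2 and 3 can be sketched in plain R, outside Vertica. This is only an illustration under stated assumptions: every input column is numeric, the number of clusters `k` is supplied by the caller, and the helper name `add_cluster_column` is hypothetical (it is not part of the UDF above).

```r
# Hypothetical helper: cluster the rows of a numeric data.frame and
# append the assignment as an extra column named "Cluster".
add_cluster_column <- function(data, k) {
  fit <- kmeans(data, centers = k)   # stats::kmeans, shipped with base R
  data$Cluster <- fit$cluster        # one integer cluster id per input row
  data
}

set.seed(1)                          # kmeans picks random starting centers
input  <- data.frame(a = c(1, 1.1, 9, 9.2), b = c(2, 2.1, 8, 8.3))
output <- add_cluster_column(input, k = 2)
```

The result has the same rows and columns as the input plus one more column, which is why the output type has to grow with the number of input columns.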


Issue: I am able to create the required data.frame that needs to be returned.

However, I am unable to specify the output types dynamically, depending on the size of the input data.


Structure of R-UDF:


sd_kmeans <- function(data, y)

sd_kmeans_parameters <- function()
  ..... returns parameter types

sd_kmeans_returnType <- function(data, y)
  ..... returns return types for the main function


In sd_kmeans_returnType, one of the input parameters is data; however, it is not the actual data set.

Can you please help me find out what the format of "data" is inside this function?
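A hedged sketch of one possible shape for the return-type function follows. It rests on an assumption that should be checked against the Vertica R SDK documentation for the version in use: that `data` here is a data.frame describing the input columns, one row per column, with the fields `datatype`, `length`, `scale` and `name`, rather than the table rows themselves. The type string `"int"` for the new column and the mock descriptor are likewise assumptions made for illustration.

```r
# Assumption: `data` describes the input columns, one row per column,
# with the fields datatype, length, scale and name (not the table rows).
# `y` is assumed to carry the UDF parameters and is unused here.
sd_kmeans_returnType <- function(data, y) {
  cluster <- data.frame(datatype = "int",      # assumed type name
                        length   = NA,
                        scale    = NA,
                        name     = "Cluster",
                        stringsAsFactors = FALSE)
  # Echo every input column description, then append the Cluster column.
  rbind(data[, c("datatype", "length", "scale", "name")], cluster)
}

# Mock descriptor for a two-column input, to try the function outside Vertica.
mock <- data.frame(datatype = c("float", "float"),
                   length   = c(8, 8),
                   scale    = c(NA, NA),
                   name     = c("a", "b"),
                   stringsAsFactors = FALSE)
ret <- sd_kmeans_returnType(mock, NULL)
```

If the assumption holds, the number of output columns follows the number of rows in `data`, so the same function would cover input tables of any width.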
