R-function kmeans

Dear all,

We've implemented the R function kmeans (pretty much following the example given in http://www.vertica.com/2012/10/02/how-to-implement-r-in-vertica/), and we have a table containing 1,000,000 records. However, when we use the function in a query against the table, the query seems to stall or hang.

Here is the function (stored in '/home/nl/dietdata/kmeansR.R'):

    kmeansClu <- function(x) {
      cl <- kmeans(x[,2], 3, 15)
      res <- data.frame(x[,1:2], cl$cluster)
      res
    }

    kmeansCluFactory <- function() {
      list(name=kmeansClu, udxtype=c("transform"), intype=c("int","float"),
           outtype=c("int","float","int"), outnames=c("x","y","cluster"))
    }

The function as implemented in the Vertica DB:

    CREATE LIBRARY kmeansLib AS '/home/nl/dietdata/kmeansR.R' LANGUAGE 'R';
    CREATE TRANSFORM FUNCTION kmeansClu AS LANGUAGE 'R' NAME 'kmeansCluFactory' LIBRARY kmeansLib;
    GRANT EXECUTE ON TRANSFORM FUNCTION public.kmeansClu(int,float) TO nl;

Table definition:

    CREATE TABLE "nl_schema"."DIET"
    (
      subjectid integer,
      dietname varchar(100),
      initial_weight integer,
      completion varchar(100),
      weight_at_12months integer,
      adherence_level integer,
      weight_loss integer
    );

Query that apparently never completes:

    create table nl_schema.bla as
      select kmeansClu(subjectid, initial_weight) over () from nl_schema.diet;

Has anyone encountered this problem?

Comments

  • There must have been a problem with copy/paste. The R function is:

        kmeansClu <- function(x) {
          cl <- kmeans(x[,2], 3, 15)
          res <- data.frame(x[,1:2], cl$cluster)
          res
        }

        kmeansCluFactory <- function() {
          list(name=kmeansClu, udxtype=c("transform"), intype=c("int","float"),
               outtype=c("int","float","int"), outnames=c("x","y","cluster"))
        }

    Implemented in the DB:

        CREATE LIBRARY kmeansLib AS '/home/nl/dietdata/kmeansR.R' LANGUAGE 'R';
        CREATE TRANSFORM FUNCTION kmeansClu AS LANGUAGE 'R' NAME 'kmeansCluFactory' LIBRARY kmeansLib;
        GRANT EXECUTE ON TRANSFORM FUNCTION public.kmeansClu(int,float) TO nl;

    Table definition of nl_schema.diet:

        (
          subjectid integer,
          dietname varchar(100),
          initial_weight integer,
          completion varchar(100),
          weight_at_12months integer,
          adherence_level integer,
          weight_loss integer
        )

    Query that apparently never completes:

        create table nl_schema.bla as
          select kmeansClu(subjectid, initial_weight) over () from nl_schema.diet;
  • Hi Karin, thanks for sharing your function with us, we'll look into why your query is not completing.
  • 6.1 Service Pack 2 will fix this issue.
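
One way to narrow down a hang like the one described in this thread is to run the same R function outside the database and see how long it takes on data of the same shape. The following is only a minimal sketch, assuming a plain R session: the `diet` data frame, its values, and the row count `n` are invented for illustration and are not taken from the original table. Raise `n` toward 1,000,000 to approximate the size mentioned in the question.

```r
# Same transform function as posted in the question, unchanged.
kmeansClu <- function(x) {
  cl <- kmeans(x[, 2], 3, 15)   # 3 clusters, at most 15 iterations
  res <- data.frame(x[, 1:2], cl$cluster)
  res
}

# Synthetic stand-in for the two columns passed to the function
# (hypothetical values; not the real DIET data).
set.seed(42)
n <- 10000
diet <- data.frame(
  subjectid      = seq_len(n),
  initial_weight = as.numeric(sample(50:150, n, replace = TRUE))
)

# Time a single call on the whole data frame, i.e. one call over all rows.
elapsed <- system.time(out <- kmeansClu(diet))[["elapsed"]]

print(elapsed)          # wall-clock seconds for the R side alone
print(dim(out))         # rows and columns returned by the function
print(table(out[[3]]))  # number of rows assigned to each cluster
```

If this finishes quickly in plain R, the delay is more likely to be on the database side than in the k-means step itself, which would be consistent with the reply above that a service pack addresses the issue.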
