
R-function kmeans — Vertica Forum


Dear all,

We've implemented the R function kmeans (pretty much following the example given in http://www.vertica.com/2012/10/02/how-to-implement-r-in-vertica/), and we have a table containing 1,000,000 records. However, when we use the function in a query against that table, the query seems to stall or hang.

Here is the function (stored in '/home/nl/dietdata/kmeansR.R'):

    kmeansClu <- function(x) {
      cl <- kmeans(x[,2], 3, 15)
      res <- data.frame(x[,1:2], cl$cluster)
      res
    }

    kmeansCluFactory <- function() {
      list(name=kmeansClu, udxtype=c("transform"), intype=c("int","float"),
           outtype=c("int","float","int"), outnames=c("x","y","cluster"))
    }

The function as implemented in the Vertica DB:

    CREATE LIBRARY kmeansLib AS '/home/nl/dietdata/kmeansR.R' LANGUAGE 'R';
    CREATE TRANSFORM FUNCTION kmeansClu AS LANGUAGE 'R' NAME 'kmeansCluFactory' LIBRARY kmeansLib;
    GRANT EXECUTE ON TRANSFORM FUNCTION public.kmeansClu(int,float) TO nl;

Table definition:

    CREATE TABLE "nl_schema"."DIET" (
      subjectid integer,
      dietname varchar(100),
      initial_weight integer,
      completion varchar(100),
      weight_at_12months integer,
      adherence_level integer,
      weight_loss integer
    );

Query that apparently never completes:

    create table nl_schema.bla as
    select kmeansClu(subjectid, initial_weight) over () from nl_schema.diet;

Has anyone encountered this problem?

Comments

  • There must have been a problem with copy/paste. The R function is:

        kmeansClu <- function(x) {
          cl <- kmeans(x[,2], 3, 15)
          res <- data.frame(x[,1:2], cl$cluster)
          res
        }

        kmeansCluFactory <- function() {
          list(name=kmeansClu, udxtype=c("transform"), intype=c("int","float"),
               outtype=c("int","float","int"), outnames=c("x","y","cluster"))
        }

    Implemented in the DB:

        CREATE LIBRARY kmeansLib AS '/home/nl/dietdata/kmeansR.R' LANGUAGE 'R';
        CREATE TRANSFORM FUNCTION kmeansClu AS LANGUAGE 'R' NAME 'kmeansCluFactory' LIBRARY kmeansLib;
        GRANT EXECUTE ON TRANSFORM FUNCTION public.kmeansClu(int,float) TO nl;

    Table definition of nl_schema.diet:

        (
          subjectid integer,
          dietname varchar(100),
          initial_weight integer,
          completion varchar(100),
          weight_at_12months integer,
          adherence_level integer,
          weight_loss integer
        )

    Query that apparently never completes:

        create table nl_schema.bla as
        select kmeansClu(subjectid, initial_weight) over () from nl_schema.diet;
  • Hi Karin, thanks for sharing your function with us, we'll look into why your query is not completing.
  • Vertica 6.1 Service Pack 2 will fix this issue.
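
For anyone hitting a similar hang, one way to narrow it down might be to run the same R function outside Vertica on synthetic data of the same size, to see whether the R code on its own completes. The following is only a sketch under stated assumptions: kmeansClu is copied from the post above, while the data frame diet, its random contents, the seed, and the timing call are made up for illustration and do not come from the thread.

```r
# Sketch: run the thread's kmeansClu in plain R, outside Vertica.
# Only kmeansClu comes from the post; the input data below is synthetic.
kmeansClu <- function(x) {
  cl <- kmeans(x[, 2], 3, 15)            # 3 centers, at most 15 iterations
  res <- data.frame(x[, 1:2], cl$cluster)
  res
}

set.seed(42)                              # assumption: fixed seed for repeatability
n <- 1000000                              # same row count as mentioned in the post

# Hypothetical stand-in for (subjectid, initial_weight) from nl_schema.diet.
diet <- data.frame(
  subjectid      = seq_len(n),
  initial_weight = rnorm(n, mean = 90, sd = 15)
)

# Time the call; the number of seconds depends entirely on the machine.
elapsed <- system.time(out <- kmeansClu(diet))["elapsed"]

cat("rows returned:", nrow(out), "\n")
cat("distinct clusters:", length(unique(out[[3]])), "\n")
cat("elapsed seconds:", elapsed, "\n")
```

If this finishes promptly in a plain R session, the R function itself is probably not what stalls, which would be consistent with the reply above pointing at a fix on the Vertica side.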
