Why does an R UDF work on a single node but not on a cluster?
vsql:RFunctions_test.sql:31: ERROR 3399: Failure in UDx RPC call InvokeProcessPartition(): Error calling processPartition() in User Defined Object [cons] at [/scratch_a/release/vbuild/vertica/UDxFence/RInterface.cpp:1236], error code: 0, message: Exception in processPartitionForR: [cannot open the connection]
This R function was tested on a single node, where it worked fine.
Comments
UDxs written in R do work on multi-node cluster setups. Please make sure that the Vertica R language pack is installed on all nodes. If the problem persists, please file a support case.
The library and the transform function are created successfully, but when I invoke the UDF with the SQL query below, the error message above ("cannot open the connection") appears, and "ps -ef" shows that the UDx processes have died:
dbadmin 31367 31361 0 Mar04 ? 00:00:00 [vertica-udx-R] <defunct>
dbadmin 31384 31361 0 Mar04 ? 00:00:00 [vertica-udx-R] <defunct>
dbadmin 31385 31361 0 Mar04 ? 00:00:00 [vertica-udx-R] <defunct>
dbadmin 31386 31361 0 Mar04 ? 00:00:00 [vertica-udx-R] <defunct>
dbadmin 31387 31361 0 Mar04 ? 00:00:00 [vertica-udx-R] <defunct>
# Collapse a partition to its first row, filling each NA in that row
# with the first non-NA value found in the same column.
cons <- function(x)
{
  df <- data.frame(x)
  for (j in which(is.na(df[1, ]))) {
    df[1, j] <- df[min(which(!is.na(df[, j]))), j]
  }
  outdf <- df[1, ]
  outdf
}

# Factory function: registers cons as a transform function
consFactory <- function()
{
  list(name=cons, udxtype=c("transform"), intype=c("any"), outtype=c("any"),
       outtypecallback=outtype, strictness=c("CALLED_ON_NULL_INPUT"))
}

# outtypecallback function: declares every output column as varchar
outtype <- function(x)
{
  params <- data.frame(datatype=rep(NA, 1), length=rep(NA, 1), scale=rep(NA, 1), name=rep(NA, 1))
  for (i in 1:nrow(x))
  {
    params[i, 1] <- "varchar"
  }
  params
}
DROP TABLE T;
DROP LIBRARY rlib CASCADE;
-- Step 1: Create LIBRARY
\set libfile '\''`pwd`'/RFunctions/RFunctions_test.R\''
CREATE LIBRARY rlib AS :libfile LANGUAGE 'R';
-- Step 2: Create Function Factories
CREATE TRANSFORM FUNCTION cons
AS LANGUAGE 'R' NAME 'consFactory' LIBRARY rlib;
/*** Example 1: Multiplication ***/
CREATE TABLE T(qualifier varchar(20) not null,priority int,value1 varchar(20),value2 varchar(20));
COPY T FROM STDIN DELIMITER ',';
qua1,1,,o
qua1,2,b,
qua2,1,,
qua2,2,d,l
\.
-- Invoke the UDF
SELECT cons(qualifier,priority,value1,value2) OVER(partition by qualifier order by qualifier,priority) FROM T;
This is a known issue and we are working on a fix. In the meantime, there are two workarounds: exit the session, or kill the vertica-udx-R process. Either one makes the defunct processes disappear.
Thanks
Pratibha
We are using connection pooling, so we can't just close the connection, and the kill option is not clean enough. I found that adding a dummy call to the R function without the PARTITION BY clause fixes it.
Thanks anyway
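For readers trying the same workaround, here is a minimal sketch of what such a dummy call might look like. The exact statement is not given in the thread, so the empty OVER () clause and the ordering of the two statements are assumptions. The sketch reuses the table T and the cons function from the script above.

```sql
-- Assumed form of the dummy call: same function, empty OVER () clause,
-- so no PARTITION BY is involved. Its result is discarded.
SELECT cons(qualifier, priority, value1, value2) OVER () FROM T;

-- The original partitioned call, issued afterwards in the same session.
SELECT cons(qualifier, priority, value1, value2)
       OVER (PARTITION BY qualifier ORDER BY qualifier, priority)
FROM T;
```

With connection pooling, the dummy call would presumably need to run on each pooled connection before the first partitioned call. That too is an assumption, not something confirmed in the thread.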