Undefined Number of Columns in R Polymorphic Function
The Vertica 7.0 Programmer's Guide has the following example of a polymorphic function:
RFactory <- function() {
list(name=RFunction, udxtype=c("transform"), intype=c("any"),
outtype=c("any"), outtypecallback=ReturnType)
}
Does outtype = "any" mean the same R UDF can return a different number of columns depending on the inputs provided? I have tried using it, but it fails every time.
Some clarification on this would be helpful.
Thank you.
Ravi
Comments
Setting outtype to "any" does indeed mean that the R UDF can return a variable number of columns, depending on your function.
You must provide an outtypecallback function. It must take the same parameters as your main function: (x, y), where x is the input column(s) from Vertica and y is any parameters. For example:
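# Determine the return types based on the input types and sizes.
# Returns every row of x except the first.
polyTopKReturnType <- function(x, y)
{
  ret <- NULL
  # seq_len(nrow(x))[-1] is 2..nrow(x), and is empty when x has a single row
  for (i in seq_len(nrow(x))[-1])
  {
    ret <- rbind(ret, x[i, ])
  }
  ret
}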
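To see what this callback does without a Vertica server, it can be called on a hand-built data frame. The sketch below assumes x holds one row per input column; the input_types data frame, its column names, and its values are hypothetical and only serve to illustrate the row handling.

```r
# Same logic as the callback above: keep every row of x except the first.
polyTopKReturnType <- function(x, y)
{
  ret <- NULL
  for (i in seq_len(nrow(x))[-1])
  {
    ret <- rbind(ret, x[i, ])
  }
  ret
}

# Hypothetical description of three input columns, one row per column.
# Column names and values are assumptions made for this illustration.
input_types <- data.frame(
  datatype = c("int", "varchar", "float"),
  length   = c(8L, 80L, 8L),
  stringsAsFactors = FALSE
)

# y (the parameters) is not used by this callback, so NULL is passed here.
out_types <- polyTopKReturnType(input_types, NULL)
print(out_types)
```

With three rows in, the result describes two output columns, which is how the number of returned columns can follow the number of inputs.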
You cannot add new factors to the data.frame that way.
Change those factors to character first; then you can use rbind.
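A minimal sketch of that conversion in base R, assuming a small data frame with one factor column. The types data frame, its column names, and the new row are hypothetical and not taken from the Vertica documentation.

```r
# Hypothetical data frame with a factor column (names are assumptions).
types <- data.frame(
  datatype = factor(c("int", "varchar")),
  length   = c(8L, 80L)
)

# Convert every factor column to character; other columns are left untouched.
is_fac <- vapply(types, is.factor, logical(1))
types[is_fac] <- lapply(types[is_fac], as.character)

# Append a row whose datatype value was not among the original factor levels.
new_row <- data.frame(datatype = "float", length = 8L, stringsAsFactors = FALSE)
types <- rbind(types, new_row)
```

After the conversion, datatype is a plain character column, so new values no longer have to match an existing set of factor levels.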
Have you looked at the most recent documentation? The section on R was recently updated.
I think that this page:
http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/ExtendingHPVertica/UDx/UDxR/ROuttypeCallba...
provides the details that you are looking for.
I agree that more examples and debugging tips would be useful. I can put in a request to add debugging tips. As far as examples go, can you provide any suggestions or use cases that we might use as an example?
Also, you may be interested in Distributed R: http://www.vertica.com/hp-vertica-products/hp-vertica-distributed-r/ It is a new open source offering from HP Vertica (paid support is also available). Distributed R provides a native R environment that easily interfaces with Vertica and scales to accommodate huge data sets.
Thanks,
Chris