R UDF - varchar issue
I am trying to write an R UDF that takes three columns as input (a varchar column, a numeric column and an integer column) and returns a data frame with a number of columns.
I have provided the skeleton of the R function and the factory function below:
rFunction <- function(x)
{
first <- ifelse(is.na(x[,1]),"Not Available",x[,1])
second <- as.numeric(gsub(",","",x[,2]))
third <- x[,3]
# R calculations
...
dataFrame
# dataFrame with 13 columns
}
rFactoryFunction <- function()
{
inlist <- c("varchar","float", "int")
outlist <- c("varchar","varchar", "int","int","varchar","int","numeric","numeric","numeric","numeric","numeric","numeric","numeric","varchar")
list(name=rFunction, udxtype=c("transform"), intype=inlist, outtype=outlist)
}
The first column is expected to contain unique values (varchar type) from the input varchar column.
However, the first column comes back as 1, 2, 3, etc. instead of the varchar values. All other columns are returned with the proper values.
Could someone provide help on what could be causing this issue?
Thank you.
Ravi
Comments
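You ran into one of R's quirks. The following operation returns the integer level code of each value in the first column rather than its text, presumably because the varchar column reaches the function as a factor and ifelse() does not keep the factor's labels.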
first <- ifelse(is.na(x[,1]),"Not Available",x[,1])
Use the following instead to maintain the character value.
first <- ifelse(is.na(x[,1]),"Not Available",as.character(x[,1]))
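The difference can be reproduced outside the UDF with a small stand-alone sketch. The data frame below is a hypothetical stand-in for the input, and it assumes the varchar column arrives as a factor; the names x, codes and labels are for illustration only.

```r
# Hypothetical stand-in for the UDF input: one factor column with a missing value.
# Assumption: the varchar column is passed to the R function as a factor.
x <- data.frame(name = factor(c("apple", NA, "cherry")))

# Original expression: the factor's labels are lost and level codes come back.
codes <- ifelse(is.na(x[, 1]), "Not Available", x[, 1])

# Suggested expression: convert to character first so the labels are kept.
labels <- ifelse(is.na(x[, 1]), "Not Available", as.character(x[, 1]))

print(codes)
print(labels)
```

Checking is.factor(x[,1]) inside the real function is a quick way to confirm whether this assumption holds for your input.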
Pratibha