Data can be loaded from Vertica through an RODBC connector. We provide a fast ODBC connector called vRODBC. The HPDGLM package has a dataLoader function that makes concurrent ODBC connections to load data from Vertica in parallel.
If you plan to use vRODBC, make sure it is installed; see the vRODBC installation document.
After installing vRODBC, check that a vRODBC connection works before loading any data.
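One way to check the connection is to run a trivial query over it. The sketch below is not part of vRODBC or HPDGLM: check_vertica_dsn is a hypothetical helper, and it assumes that vRODBC exposes the same odbcConnect, sqlQuery and odbcClose functions as RODBC and that the data source name you pass is defined in your ODBC configuration.

```r
# Hypothetical helper (not part of vRODBC or HPDGLM): returns TRUE if a
# trivial query succeeds over the given ODBC data source name, else FALSE.
check_vertica_dsn <- function(dsn) {
  if (!requireNamespace("vRODBC", quietly = TRUE)) {
    message("vRODBC is not installed")
    return(FALSE)
  }
  tryCatch({
    # Assumption: vRODBC mirrors the RODBC function names used below.
    ch <- suppressWarnings(vRODBC::odbcConnect(dsn))
    # As in RODBC, a failed connection is reported as -1 rather than an error.
    if (identical(as.integer(ch)[1], -1L)) {
      FALSE
    } else {
      res <- vRODBC::sqlQuery(ch, "SELECT 1 AS ok")
      vRODBC::odbcClose(ch)
      # sqlQuery returns a data frame on success and error text otherwise.
      is.data.frame(res) && nrow(res) == 1
    }
  }, error = function(e) FALSE)
}
```

For instance, check_vertica_dsn("Test") would exercise the data source used in the example, assuming the conf value "Test" names an ODBC data source.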
Once vRODBC is working, you can use the db2darrays function to load data from Vertica into Distributed R.
db2darrays is a simple function that loads a labeled dataset (a set of labeled samples) from a table in a Vertica database into a pair of darrays, which correspond to the responses and predictors of a predictive model. It assumes that the samples (both responses and predictors) are stored in a single table that contains a column called 'rowid'. It also assumes that 'rowid' starts from 0 and that no 'rowid' value is missing.
Usage
db2darrays(tableName, resp = list(...), pred = list(...), conf, nSplits)
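The two assumptions about 'rowid' can be checked before loading. The following is a minimal sketch in base R, assuming the 'rowid' values have already been fetched into a numeric vector; rowid_is_contiguous is a hypothetical helper name, not a function of HPDGLM.

```r
# Hypothetical helper: TRUE when ids are exactly the integers 0, 1, ..., n-1
# in any order (starts from 0, no missing value, no duplicate).
rowid_is_contiguous <- function(ids) {
  n <- length(ids)
  n > 0 &&
    !anyNA(ids) &&
    all(ids == round(ids)) &&   # whole numbers only
    anyDuplicated(ids) == 0 &&  # no repeated rowid
    min(ids) == 0 &&            # starts from 0
    max(ids) == n - 1           # no gap, given n distinct whole numbers
}

rowid_is_contiguous(0:4)        # rowids of the five sample rows in the example
rowid_is_contiguous(c(0, 1, 3)) # rowid 2 is missing
```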
Example
create table mortgage(rowid int, def int, mltvspline1 float, mltvspline2 float, agespline1 float, agespline2 float, hpichgspline float, ficospline float);
insert into mortgage values(0, 1, 0.760777, 0.006632, 0.948052, 0.906403, 0.058021, 0.960328);
insert into mortgage values(1, 0, 0.135741, 0.205449, 0.516031, 0.013455, 0.827438, 0.659125);
insert into mortgage values(2, 0, 0.021796, 0.138996, 0.862165, 0.034211, 0.150524, 0.345917);
insert into mortgage values(3, 1, 0.271257, 0.543280, 0.940978, 0.891880, 0.993050, 0.000160);
insert into mortgage values(4, 1, 0.986207, 0.053896, 0.119611, 0.646744, 0.819753, 0.663289);
# db2darray (singular) is given only the feature columns
loadedSamples <- db2darray("mortgage", list("mltvspline1", "mltvspline2", "agespline1", "agespline2", "hpichgspline", "ficospline"), conf = "Test")
# db2darrays is given the response column (def) and the predictor columns
loadedData <- db2darrays("mortgage", list("def"), list("mltvspline1", "mltvspline2", "agespline1", "agespline2", "hpichgspline", "ficospline"), conf = "Test")
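To make the response/predictor split concrete without a running cluster, the same five rows can be split in base R. This is only an illustration: ordinary matrices stand in for darrays, and the names Y and X are chosen here for the sketch rather than taken from the db2darrays return value.

```r
# The five sample rows from the SQL above, as a local data frame.
mortgage <- data.frame(
  rowid        = 0:4,
  def          = c(1, 0, 0, 1, 1),
  mltvspline1  = c(0.760777, 0.135741, 0.021796, 0.271257, 0.986207),
  mltvspline2  = c(0.006632, 0.205449, 0.138996, 0.543280, 0.053896),
  agespline1   = c(0.948052, 0.516031, 0.862165, 0.940978, 0.119611),
  agespline2   = c(0.906403, 0.013455, 0.034211, 0.891880, 0.646744),
  hpichgspline = c(0.058021, 0.827438, 0.150524, 0.993050, 0.819753),
  ficospline   = c(0.960328, 0.659125, 0.345917, 0.000160, 0.663289)
)

resp <- list("def")
pred <- list("mltvspline1", "mltvspline2", "agespline1",
             "agespline2", "hpichgspline", "ficospline")

# Order by rowid so that row i of Y and row i of X describe the same sample.
mortgage <- mortgage[order(mortgage$rowid), ]
Y <- as.matrix(mortgage[, unlist(resp), drop = FALSE])  # responses
X <- as.matrix(mortgage[, unlist(pred), drop = FALSE])  # predictors

dim(Y)
dim(X)
```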