Pig connector - query for parameters
Hi All,
I'm trying to use the Pig connector as per the docs, but it doesn't seem to work if you specify a query to get the parameters. It works fine with a static list of parameters, but not with a query.
- This works:
t = LOAD 'sql://{SELECT c1, c2, c3, c4 FROM tbl WHERE key = ? };{1,2,3,4,5}' USING com.vertica.pig.VerticaLoader(...);
- This doesn't work (doesn't return anything because parameters are wrong).
t = LOAD 'sql://{SELECT c1, c2, c3, c4 FROM tbl WHERE key = ? };sql://{ SELECT DISTINCT key FROM tbl }' USING com.vertica.pig.VerticaLoader(...);
* the query sent to Vertica (as per query_profiles) looks like:
SELECT c1, c2, c3 FROM tbl WHERE key = 'SELECT DISTINCT key FROM tbl'
huh ?!
Looking in git (I guess that's the code used to build the latest pig-vertica.jar), I see this method:
public void setLocation(String location, Job job) throws IOException {
    ...
    if (params != null && !params.isEmpty()) {
        if (params.startsWith("sql://")) {
            params = params.substring("sql://{".length()).replace('}', ' ');
            setParameters(params);
        } else {
            params = params.replaceAll("^\\s*\\{","");
            params = params.replaceAll("\\}\\s*$","");
            setParameters(params);
        }
    }
When params is a query, this apparently does nothing more than extract the query string (throwing away the sql:// and the {}) and set it as the parameter! Shouldn't it execute the query, get the result and build a CSV list of the values (to match the else branch)?
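To make the expectation concrete, here is a minimal sketch of the behaviour I would have expected, not the connector's actual code. ParamResolver, resolve, stripBraces and the runQuery callback are all names I made up for illustration; runQuery stands in for whatever would really run the query against Vertica (e.g. over JDBC) and return the first column as strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of the Vertica connector.
class ParamResolver {
    private static final String SQL_PREFIX = "sql://";

    // Drops one leading '{' and one trailing '}' plus surrounding whitespace.
    static String stripBraces(String s) {
        return s.replaceAll("^\\s*\\{", "").replaceAll("\\}\\s*$", "").trim();
    }

    // Returns a comma-separated parameter list.
    // A static list is passed through; a sql:// section is executed through
    // runQuery (an assumed callback) and the returned values are joined.
    static String resolve(String params, Function<String, List<String>> runQuery) {
        String trimmed = params.trim();
        if (trimmed.startsWith(SQL_PREFIX)) {
            String query = stripBraces(trimmed.substring(SQL_PREFIX.length()));
            return String.join(",", runQuery.apply(query));
        }
        return stripBraces(trimmed);
    }

    public static void main(String[] args) {
        // Stand-in for a real database call; always returns three keys.
        Function<String, List<String>> fakeDb = q -> Arrays.asList("1", "2", "3");

        System.out.println(resolve("{1,2,3,4,5}", fakeDb));                            // 1,2,3,4,5
        System.out.println(resolve("sql://{ SELECT DISTINCT key FROM tbl }", fakeDb)); // 1,2,3
    }
}
```

With something like this, both forms of the params section would end up as the same kind of CSV string before being handed to setParameters.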
https://github.com/vertica/Vertica-Hadoop-Connector/blob/master/pig-connector/com/vertica/pig/Vertic...
Am I missing something ? Anybody using it ? Is this implemented (as stated in the docs) or not ?
Thanks,
TP
Comments
@Override
public InputFormat getInputFormat() throws IOException {
    return new VerticaInputFormat(getQuery(), getParameters());
}
I'll take a closer look at the code, but if anybody uses this, please let me know if I'm missing something.
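One thing that can be checked without the connector at all is what string the sql:// branch of setLocation leaves behind for my LOAD statement. The snippet below only replays the two String calls quoted from setLocation; SqlParamTrace and sqlBranch are names I made up, and whether the connector normalizes the string somewhere else is something I have not verified.

```java
// Hypothetical stand-alone check; replays only the String handling from setLocation.
class SqlParamTrace {
    // Same expression as the sql:// branch quoted in the original post.
    static String sqlBranch(String params) {
        return params.substring("sql://{".length()).replace('}', ' ');
    }

    public static void main(String[] args) {
        String result = sqlBranch("sql://{ SELECT DISTINCT key FROM tbl }");

        // Brackets make the leading and trailing spaces visible.
        System.out.println("[" + result + "]");

        // A case-sensitive prefix test on the raw string does not match...
        System.out.println(result.startsWith("select"));                      // false

        // ...while a trimmed, lower-cased copy does.
        System.out.println(result.trim().toLowerCase().startsWith("select")); // true
    }
}
```

So the stored parameter string starts with a space and an upper-case SELECT. If a later step relies on a check such as the params.startsWith("select") in getSplits quoted below, that check would not fire for a query written the way mine is, which would be consistent with the query text being bound as a literal value.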
public List<InputSplit> getSplits(JobContext context) throws IOException {
if (params != null && params.startsWith("select")) {