Pig connector - query for parameters

Hi All,
I'm trying to use the Pig connector as described in the docs, but it doesn't seem to work when you specify a query to get the parameters. It works fine with a static list of parameters, but not with a query.

- This works:
t = LOAD 'sql://{SELECT c1, c2, c3, c4 FROM tbl WHERE key = ? };{1,2,3,4,5}' USING com.vertica.pig.VerticaLoader(...);

- This doesn't work (doesn't return anything because parameters are wrong).
t = LOAD 'sql://{SELECT c1, c2, c3, c4 FROM tbl WHERE key = ? };sql://{ SELECT DISTINCT key FROM tbl }' USING com.vertica.pig.VerticaLoader(...);
* the query sent to Vertica (as per query_profiles) looks like:
SELECT c1, c2, c3 FROM tbl WHERE key = 'SELECT DISTINCT key FROM tbl'

huh ?!

Looking in git (I guess that's the code used to build latest pig-vertica.jar) I see this method:
public void setLocation(String location, Job job) throws IOException {
    ...
    if (params != null && !params.isEmpty()) {
        if (params.startsWith("sql://")) {
            params = params.substring("sql://{".length()).replace('}', ' ');
            setParameters(params);
        } else {
            params = params.replaceAll("^\\s*\\{", "");
            params = params.replaceAll("\\}\\s*$", "");
            setParameters(params);
        }
    }
    ...
}

Apparently, when params is a query, this does nothing more than strip the sql:// prefix and the braces and set the remaining query string as the parameter! Shouldn't it execute the query, fetch the result and build a CSV list of the values (to match the else branch)?
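
To make the behaviour described above concrete, here is a minimal standalone sketch that replays only the string handling from the quoted setLocation excerpt. The class name ParamParsingSketch and the method normalizeParams are hypothetical names used for illustration; the real connector passes the result on to setParameters, which is not reproduced here.

```java
// Hypothetical standalone replay of the string handling quoted above.
// It is not the connector's code, only the same String operations.
final class ParamParsingSketch {

    // Returns what the quoted excerpt would hand to setParameters(...).
    static String normalizeParams(String params) {
        if (params.startsWith("sql://")) {
            // Drops the leading "sql://{" and turns every '}' into a space.
            // The query text is kept as a string; nothing is executed here.
            return params.substring("sql://{".length()).replace('}', ' ');
        }
        // Static list: strip the surrounding braces only.
        params = params.replaceAll("^\\s*\\{", "");
        params = params.replaceAll("\\}\\s*$", "");
        return params;
    }

    public static void main(String[] args) {
        // Brackets make leading/trailing whitespace visible in the output.
        System.out.println("[" + normalizeParams("{1,2,3,4,5}") + "]");
        System.out.println("[" + normalizeParams("sql://{ SELECT DISTINCT key FROM tbl }") + "]");
    }
}
```

In this sketch the static list comes out as a plain comma-separated string, while the sql:// form comes out as the query text itself, which matches what shows up in query_profiles.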

https://github.com/vertica/Vertica-Hadoop-Connector/blob/master/pig-connector/com/vertica/pig/Vertic...

Am I missing something? Is anybody using it? Is this implemented (as stated in the docs) or not?
Thanks,
TP

Comments

  • Or set the proper VerticaInputFormat and let that extract the proper parameter values based on the query...

    @Override
    public InputFormat getInputFormat() throws IOException {
        return new VerticaInputFormat(getQuery(), getParameters());
    }

    Will take a better look at the code but if anybody uses this pls let me know if I'm missing something.
  • OK... after digging a bit in the code I found the issue: it is supported, but the parameter query has to start with a lowercase select. SELECT in capitals isn't recognized :))

    public List<InputSplit> getSplits(JobContext context) throws IOException {
        if (params != null && params.startsWith("select")) {
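
The prefix test quoted above is case-sensitive, which would explain why an uppercase SELECT is treated as a literal parameter value. A small sketch of the difference follows. The method strict mirrors the quoted startsWith("select") check; lenient is a hypothetical case-insensitive alternative built on String.regionMatches, shown only for comparison and not something the connector is known to do.

```java
// Hypothetical comparison of a case-sensitive and a case-insensitive prefix test.
final class SelectPrefixSketch {

    // Mirrors the check quoted from getSplits: lowercase "select" only.
    static boolean strict(String params) {
        return params != null && params.startsWith("select");
    }

    // Assumed alternative: ignore case and surrounding whitespace.
    static boolean lenient(String params) {
        if (params == null) {
            return false;
        }
        String trimmed = params.trim();
        return trimmed.regionMatches(true, 0, "select", 0, "select".length());
    }

    public static void main(String[] args) {
        String upper = "SELECT DISTINCT key FROM tbl";
        String lower = "select distinct key from tbl";
        System.out.println("strict(upper)  = " + strict(upper));
        System.out.println("strict(lower)  = " + strict(lower));
        System.out.println("lenient(upper) = " + lenient(upper));
    }
}
```

Without a change to the connector, the workaround suggested by the comment above is to write the parameter query with a lowercase select. Whether leading whitespace inside the braces also matters depends on code not quoted in this thread.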

