Hadoop-Vertica Connector: Can I write to Vertica in Map procedure? — Vertica Forum

Hi all, I am using the Hadoop-Vertica Connector to import a large file into Vertica, and I was trying to do it with Hadoop alone, without a Reducer. However, the Vertica output table does not seem to initialize during the map phase, and I get errors every time.

When I checked the documentation, it did not say whether we can write to Vertica during the map phase, so I was wondering: is that possible?

Thank you!

EDIT:

Error:
java.io.IOException: Cannot set record by name if names not initialized
    at com.vertica.hadoop.VerticaRecord.set(VerticaRecord.java:270)
    at com.vertica.hadoop.VerticaWordCount$TokenizerMapper.map(VerticaWordCount.java:92)
    at com.vertica.hadoop.VerticaWordCount$TokenizerMapper.map(VerticaWordCount.java:60)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doA

Checking the source code of VerticaWordCount.java, I found that the column-name list of the output table was never initialized.

Here is my configuration in run():

  Job job = new Job(conf, "vertica hadoop");
  conf = job.getConfiguration();
  conf.set("mapreduce.job.tracker", "local");

  //job.setInputFormatClass(VerticaInputFormat.class);

  //You have to set the MapOutputKeyClass and MapOutputValueClass,
  //since by default it will be the same as the class of Reducer's
  //Output Key and Value
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(VerticaRecord.class);

  /*************Settings for Vertica output************************/
  //Set the output format of Reduce class.
  //I will output VerticaRecords that will be stored in the database
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(VerticaRecord.class);

  //Tell Hadoop to send its output to Vertica
  job.setOutputFormatClass(VerticaOutputFormat.class);
  /****************************************************************/

  job.setJarByClass(VerticaWordCount.class);
  job.setMapperClass(TokenizerMapper.class);
  FileInputFormat.addInputPath(job, new Path("/user/tmp/input"));

  /******************************************************************/
  //Defining the output table
  //VerticaOutputFormat.setOutput(jobObject, tableName, [truncate, ["columnName1 dataType1" [,"columnNamen dataTypen" ...]] );
  VerticaOutputFormat.setOutput(job, "target", true, "a int", "b varchar", "c varchar");
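For reference, a map-only variant of the configuration above would look something like the sketch below. It reuses the class names, table name, and path from my job, and only adds the standard Hadoop call setNumReduceTasks(0) to disable the reduce phase. Whether VerticaOutputFormat actually accepts records coming straight from mappers is exactly the open question here, so treat this as an untested configuration sketch, not a working answer:

```java
// Sketch of a map-only job configuration (untested assumption: the
// Vertica connector's OutputFormat can consume mapper output directly).
Job job = new Job(conf, "vertica hadoop");

// Map-only: with zero reducers, mapper output goes straight to the
// configured OutputFormat, so the mapper's output classes must match
// the job's output classes below.
job.setNumReduceTasks(0);

job.setJarByClass(VerticaWordCount.class);
job.setMapperClass(TokenizerMapper.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(VerticaRecord.class);
job.setOutputFormatClass(VerticaOutputFormat.class);

// Define the output table so the connector knows the column names --
// the "names not initialized" error suggests this definition never
// reached the VerticaRecord built in the mapper.
VerticaOutputFormat.setOutput(job, "target", true, "a int", "b varchar", "c varchar");

FileInputFormat.addInputPath(job, new Path("/user/tmp/input"));
```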

Comments

  • Hi,

    May I ask which version of the connector you are using?


    Regards
    Bhawana
