Hadoop-Vertica Connector: Can I write to Vertica in the Map phase?
Hi all, I am using the Hadoop-Vertica Connector to import a large file into Vertica, and I am trying to do it with Hadoop without a Reducer. However, the Vertica output table does not seem to get initialized during the Map phase, and I get errors every time.
When I checked the documentation, it did not say that we can write to Vertica during the Map phase, so I was wondering whether that is possible.
Thank you!
EDIT:
Error:
java.io.IOException: Cannot set record by name if names not initialized
    at com.vertica.hadoop.VerticaRecord.set(VerticaRecord.java:270)
    at com.vertica.hadoop.VerticaWordCount$TokenizerMapper.map(VerticaWordCount.java:92)
    at com.vertica.hadoop.VerticaWordCount$TokenizerMapper.map(VerticaWordCount.java:60)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doA
Checking the source code of VerticaWordCount.java, I found that the name list of the output table was not initialized at all.
Here is my configuration in run():
Job job = new Job(conf, "vertica hadoop");
conf = job.getConfiguration();
conf.set("mapreduce.job.tracker", "local");
//job.setInputFormatClass(VerticaInputFormat.class);
//You have to set the MapOutputKeyClass and MapOutputValueClass,
//since by default it will be the same as the class of Reducer's
//Output Key and Value
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(VerticaRecord.class);
/*************Settings for Vertica output************************/
//Set the output format of Reduce class.
//I will output VerticaRecords that will be stored in the database
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(VerticaRecord.class);
//Tell Hadoop to send its output to the Vertica
job.setOutputFormatClass(VerticaOutputFormat.class);
/****************************************************************/
job.setJarByClass(VerticaWordCount.class);
job.setMapperClass(TokenizerMapper.class);
FileInputFormat.addInputPath(job, new Path("/user/tmp/input"));
/******************************************************************/
//Defining the output table
//VerticaOutputFormat.setOutput(jobObject, tableName, [truncate, ["columnName1 dataType1" [,"columnNamen dataTypen" ...]] );
VerticaOutputFormat.setOutput(job, "target", true, "a int", "b varchar", "c varchar")
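To make the symptom in the stack trace easier to discuss, here is a minimal stand-alone sketch of the failure mode the message seems to describe: a record that accepts values by position but refuses to set them by column name when its name list was never loaded. Everything in it (RecordNameSketch, SketchRecord, trySetByName) is hypothetical and written only for illustration. It is not the connector's VerticaRecord and makes no claim about how the connector actually loads column names.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class RecordNameSketch {

    /** Hypothetical stand-in for a database record; NOT the connector's VerticaRecord. */
    static class SketchRecord {
        private final List<String> names; // null when column names were never loaded
        private final Object[] values;

        SketchRecord(int columnCount, List<String> names) {
            this.names = names;
            this.values = new Object[columnCount];
        }

        /** Setting by position needs no column names. */
        void set(int index, Object value) {
            values[index] = value;
        }

        /** Setting by name fails when the name list is missing. */
        void set(String name, Object value) throws IOException {
            if (names == null) {
                throw new IOException("Cannot set record by name if names not initialized");
            }
            int index = names.indexOf(name);
            if (index < 0) {
                throw new IOException("Unknown column: " + name);
            }
            values[index] = value;
        }

        Object get(int index) {
            return values[index];
        }
    }

    /** Returns "ok" on success, otherwise the exception message. */
    static String trySetByName(SketchRecord record, String name, Object value) {
        try {
            record.set(name, value);
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        SketchRecord withoutNames = new SketchRecord(3, null);
        SketchRecord withNames = new SketchRecord(3, Arrays.asList("a", "b", "c"));

        System.out.println(trySetByName(withoutNames, "a", 1)); // prints the error message
        System.out.println(trySetByName(withNames, "a", 1));    // prints "ok"

        withoutNames.set(1, "hello");                           // by position still works here
        System.out.println(withoutNames.get(1));                // prints "hello"
    }
}
```

In this sketch the record without names reproduces the same message as the trace, while the one built with the column names a, b, c accepts the value. Whether the real connector behaves the same way, and at which point in a map-only job it would populate the names, is something the connector version and its documentation would have to confirm.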
Comments
May I please know which version of the connector you are using?
Regards
Bhawana