Hadoop Vertica connector: does it support tab and other control characters as delimiters?

I have tab-delimited data in HDFS and I am using the following command to load it. The data does get loaded into the table, but every row ends up in a single column. I have also tried passing \t as mapred.vertica.input.delimiter, but still no luck. The same load works fine when the delimiter is a comma or a pipe. Has anyone seen this behaviour before?

hadoop jar hadoop-streaming-*.jar \
-Dmapred.reduce.tasks=0 \
-Dmapred.vertica.output.table.name=a.table1 \
-Dmapred.job.queue.name="queue name" \
-Dmapred.vertica.hostnames="machine name" \
-Dmapred.vertica.port=5433 \
-Dmapred.vertica.username=vertica \
-Dmapred.vertica.password=****** \
-Dmapred.vertica.database="db name" \
-Dmapred.vertica.input.delimiter=0x9 \
-Dmapred.vertica.output.delimiter=0x9 \
-Dnum_mappers=2 \
-input /tmp/input/ \
-output /tmp/output \
-outputformat com.vertica.hadoop.deprecated.VerticaStreamingOutput \
-inputformat com.intuit.bio.common.utils.MultiFileInputFormat \
-mapper /bin/cat
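
One thing that may be worth ruling out is what the shell actually hands to Hadoop for each spelling of "tab". The sketch below only inspects the bytes of the argument. It does not confirm what the connector accepts, and hex_of is a hypothetical helper introduced just for this illustration.

```shell
#!/bin/sh
# hex_of (hypothetical helper): print the bytes of its argument as hex.
hex_of() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

# Build a variable holding one real tab character (POSIX-portable).
# In bash, $'\t' produces the same single byte.
tab=$(printf '\t')

escaped=$(hex_of '\t')    # the two characters backslash and t
hex_text=$(hex_of 0x9)    # the three characters 0, x and 9
literal=$(hex_of "$tab")  # a single tab byte

printf '%s -> %s\n' 'backslash-t' "$escaped"
printf '%s -> %s\n' '0x9' "$hex_text"
printf '%s -> %s\n' 'real tab' "$literal"
```

If the connector expects the delimiter character itself rather than an escape sequence (an assumption, not something verified here), passing -Dmapred.vertica.input.delimiter="$tab" would be the variant to try, since \t and 0x9 both reach the job as plain multi-character strings.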

