The Vertica Forum recently got a makeover! Let us know what you think by filling out this short, anonymous survey.
Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

Copying Hive partioned table data to VErtica using Beeline

edited July 2018 in General Discussion

Hi ,
I have a use case where in i have to copy the data from a specific Hive partition . For example : -

I want to copy the data from the employee hive table which is partitioned by latest_record_ind to one of my vertica table . I need to achieve it without using COPY command .
I believe this can be done using beeline . but i am not sure how do i write my select statement in this case . for example - the synatx that should work is as below --->

beeline -u $BEELINE_SERVER_PROD -n $HIVE_USER --outputformat=tsv2 -e "select `(${HDFS_PARTN_COL})?+.+` FROM ${HDFS_DB}.${HDFS_TABLE_NAME};" | dd bs=1M | /home/ec2-user/vertica/opt/vertica/bin/vsql -h -p 5433 -U id -w 8hyg6d -c 'COPY vertica_schema.Vertica table name FROM LOCAL STDIN DELIMITER E'\''\t'\'' NULL '\''NULL'\'' DIRECT'

In the above example if we want to copy the data from the hive table ( NOT PARTITIONED ) to vertica , it works perfectly fine where as i need to know how do i write my select statement above to copy the data from partitioned hive table as per my req. above ?

FYI - My Hive table is stored in a text format in AWS s3 .

Please help.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file