Options

Copying Hive partioned table data to VErtica using Beeline

edited July 2018 in General Discussion

Hi ,
I have a use case where in i have to copy the data from a specific Hive partition . For example : -
/employee/latest_record_ind=Y/*"

I want to copy the data from the employee hive table which is partitioned by latest_record_ind to one of my vertica table . I need to achieve it without using COPY command .
I believe this can be done using beeline . but i am not sure how do i write my select statement in this case . for example - the synatx that should work is as below --->

beeline -u $BEELINE_SERVER_PROD -n $HIVE_USER --outputformat=tsv2 -e "select `(${HDFS_PARTN_COL})?+.+` FROM ${HDFS_DB}.${HDFS_TABLE_NAME};" | dd bs=1M | /home/ec2-user/vertica/opt/vertica/bin/vsql -h hostname.com -p 5433 -U id -w 8hyg6d -c 'COPY vertica_schema.Vertica table name FROM LOCAL STDIN DELIMITER E'\''\t'\'' NULL '\''NULL'\'' DIRECT'

In the above example if we want to copy the data from the hive table ( NOT PARTITIONED ) to vertica , it works perfectly fine where as i need to know how do i write my select statement above to copy the data from partitioned hive table as per my req. above ?

FYI - My Hive table is stored in a text format in AWS s3 .

Please help.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file