We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Copying Hive partioned table data to VErtica using Beeline — Vertica Forum

Copying Hive partioned table data to VErtica using Beeline

edited July 2018 in General Discussion

Hi ,
I have a use case where in i have to copy the data from a specific Hive partition . For example : -
/employee/latest_record_ind=Y/*"

I want to copy the data from the employee hive table which is partitioned by latest_record_ind to one of my vertica table . I need to achieve it without using COPY command .
I believe this can be done using beeline . but i am not sure how do i write my select statement in this case . for example - the synatx that should work is as below --->

beeline -u $BEELINE_SERVER_PROD -n $HIVE_USER --outputformat=tsv2 -e "select `(${HDFS_PARTN_COL})?+.+` FROM ${HDFS_DB}.${HDFS_TABLE_NAME};" | dd bs=1M | /home/ec2-user/vertica/opt/vertica/bin/vsql -h hostname.com -p 5433 -U id -w 8hyg6d -c 'COPY vertica_schema.Vertica table name FROM LOCAL STDIN DELIMITER E'\''\t'\'' NULL '\''NULL'\'' DIRECT'

In the above example if we want to copy the data from the hive table ( NOT PARTITIONED ) to vertica , it works perfectly fine where as i need to know how do i write my select statement above to copy the data from partitioned hive table as per my req. above ?

FYI - My Hive table is stored in a text format in AWS s3 .

Please help.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file