How to back up Vertica to HDFS

We want to back up a 45-terabyte Vertica table to HDFS4 using the SequenceFile format. What is the best or fastest way to do so? Thank you.


  •
    Hi Dan,

    The only integrated method of doing this that I could find is covered in the Programmer's Guide, in Using the Hadoop Connector, under Using Hadoop Streaming with HP Vertica's Hadoop Connector. The section "Reading Data from HP Vertica in a Streaming Hadoop Job" covers using the connector to connect to Vertica, retrieve the contents of a table, and write them to a comma-delimited key/value pair file on the HDFS file system on each node in the Hadoop cluster.

    We don't have any benchmark information on this, so you would have to test in your environment to see whether it is faster than issuing SELECT statements in Vertica that write to a delimited file and then copying that file over to your Hadoop system.

    There isn't anything specific to HDFS4 that I could find. The Supported Platforms document for 6.1 gives the connector requirements and doesn't discuss HDFS at all. There is an HDFS Connector as well, but it looks to be Hadoop-to-Vertica only: it just provides a way for Vertica to directly access files in Hadoop via WebHDFS, for things like external tables.

    Thanks,
    Bhawana
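The "select to a delimited file, then copy to Hadoop" alternative mentioned above can be sketched as follows. This is only an illustration, not a tested or benchmarked procedure: the host, database, user, table, column, and HDFS path are all hypothetical, and it assumes `vsql` and the `hdfs` client are both installed on the machine running it. It exports one date slice at a time, so a 45 TB table never has to land on local disk in one piece, and it only builds the shell pipeline so you can inspect it before running anything. Note that this writes pipe-delimited text, not a SequenceFile.

```python
import shlex
from datetime import date


def build_export_pipeline(table, date_column, start, end, hdfs_dir,
                          host="vertica-host", database="mydb", user="dbadmin"):
    """Return a shell pipeline that streams one date slice of a table to HDFS.

    All default values are placeholders, not real endpoints.
    """
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= '{start.isoformat()}' "
        f"AND {date_column} < '{end.isoformat()}'"
    )
    # -A: unaligned output, -t: rows only (no header/footer), -F: field separator
    vsql_cmd = ["vsql", "-h", host, "-d", database, "-U", user,
                "-A", "-t", "-F", "|", "-c", query]
    # "-put -" makes the HDFS client read the file contents from stdin
    target = f"{hdfs_dir}/{table}_{start.isoformat()}.psv"
    put_cmd = ["hdfs", "dfs", "-put", "-", target]
    # shlex.join quotes each argument so the query survives the shell intact
    return " | ".join(shlex.join(cmd) for cmd in (vsql_cmd, put_cmd))


if __name__ == "__main__":
    print(build_export_pipeline("sales", "sale_date",
                                date(2013, 1, 1), date(2013, 2, 1),
                                "/backup/sales"))
```

Running several slices in parallel is the obvious way to speed this up, but how many the cluster tolerates is something you would have to measure.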
  •
    You might want to give Sqoop a try. There are many other options, but the right one depends on exactly what your requirements are beyond loading the data into Hadoop.
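For reference, a Sqoop import of a Vertica table might look roughly like the sketch below. It is an assumption-laden illustration: the host, port, database, user, table, and split column are hypothetical, and it presumes Sqoop 1 with the Vertica JDBC driver jar already on Sqoop's classpath. The `--as-sequencefile` flag is what addresses the SequenceFile requirement from the original question. The function only assembles the argument list; nothing is executed.

```python
def build_sqoop_import(table, target_dir, split_by, mappers=8,
                       host="vertica-host", port=5433,
                       database="mydb", user="dbadmin"):
    """Return the argv for a Sqoop 1 import of one Vertica table.

    All default values are placeholders, not real endpoints.
    """
    return [
        "sqoop", "import",
        "--connect", f"jdbc:vertica://{host}:{port}/{database}",
        "--driver", "com.vertica.jdbc.Driver",
        "--username", user,
        "-P",                            # prompt for the password on the console
        "--table", table,
        "--split-by", split_by,          # column used to divide work among mappers
        "--num-mappers", str(mappers),   # number of parallel map tasks
        "--target-dir", target_dir,      # HDFS output directory
        "--as-sequencefile",             # write SequenceFiles instead of text
    ]


if __name__ == "__main__":
    print(" ".join(build_sqoop_import("sales", "/backup/sales", "sale_id")))
```

The mapper count and the choice of split column are the main tuning knobs here, and the values shown are arbitrary.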
  •
    Hi Bhuvana,

    I have a requirement like this: I need to transfer and store the last 3 months of data in Hadoop and keep the current month's data in Vertica.

    What steps need to be performed on the Vertica side and on the Hadoop side to do this?

    Is a third-party tool like Sqoop required to transfer data from Vertica to Hadoop?

    How can this activity be automated?

    Could you please explain?
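Whatever tool ends up moving the data, an automated job first has to work out which date ranges belong where. The sketch below is one possible reading of the requirement above, not a confirmed design: it assumes "last 3 months" means the three full calendar months before the current one, and the function names are made up for illustration. A scheduler such as cron could call it once a month and pass the resulting dates to the export step.

```python
from datetime import date


def months_back(d, n):
    """Return the first day of the month n months before d's month."""
    idx = d.year * 12 + (d.month - 1) - n   # months counted from year 0
    return date(idx // 12, idx % 12 + 1, 1)


def retention_windows(today):
    """Split the timeline into a Hadoop archive range and a Vertica range.

    The Hadoop range is half-open: start inclusive, end exclusive.
    """
    current_start = today.replace(day=1)
    archive_start = months_back(today, 3)
    return {
        "hadoop": (archive_start, current_start),
        "vertica_from": current_start,
    }


if __name__ == "__main__":
    print(retention_windows(date.today()))
```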
