Hi Dan,

The only integrated method of doing this that I could find is covered in the Programmer's Guide, under Using the Hadoop Connector, in the section Using Hadoop Streaming with HP Vertica's Hadoop Connector. The subsection "Reading Data from HP Vertica in a Streaming Hadoop Job" covers using the connector to connect to Vertica, retrieve the contents of a table, and write it out as a comma-delimited key/value pair file on the HDFS file system on each node in the Hadoop cluster. We don't have any benchmark information on this, so you'd have to test in your environment to see whether it is faster than issuing SELECTs in Vertica that are output to a delimited file and then copied over to your Hadoop system.

There isn't anything specific to HDFS4 that I could find. The Supported Platforms document for 6.1 gives the connector requirements and doesn't discuss HDFS at all. There is an HDFS Connector as well, but it looks to be Hadoop-to-Vertica only and just provides a way for Vertica to directly access files in Hadoop via WebHDFS, for things like external tables.

Thanks,
Bhawana
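For the second path mentioned above (a SELECT in Vertica written to a delimited file and then copied to Hadoop), here is a minimal sketch of how that could be scripted. It assumes vsql and the hdfs command-line client are available on the machine running the script; the host, credentials, table name, and paths are placeholders you would replace for your environment.

    # Export a Vertica table to a delimited file with vsql, then copy it into HDFS.
    # All connection details, table names, and paths below are placeholders.
    import subprocess

    VSQL = [
        "vsql",
        "-h", "vertica-host",          # Vertica node to connect to
        "-U", "dbadmin",               # database user
        "-w", "secret",                # password
        "-A", "-t",                    # unaligned output, tuples only (no header/footer)
        "-F", ",",                     # comma as the field separator
        "-c", "SELECT * FROM public.my_table;",
    ]

    # Write the query result to a local delimited file.
    with open("/tmp/my_table.csv", "w") as out:
        subprocess.run(VSQL, stdout=out, check=True)

    # Copy the file into HDFS (overwriting any existing copy).
    subprocess.run(
        ["hdfs", "dfs", "-put", "-f", "/tmp/my_table.csv", "/user/dan/my_table.csv"],
        check=True,
    )

Whether this is faster than the streaming connector approach depends on your data volume and network, which is why testing in your own environment is recommended.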
Comments
I have a requirement like this:
I need to transfer/store the last 3 months of data in Hadoop and keep the current month's data in Vertica.
What are all the steps that need to be performed at the Vertica end and the Hadoop end to do this?
Is a third-party tool like Sqoop required to transfer data from Vertica to Hadoop?
How can this activity be automated?
Could you please explain?