COPY Files from NFS Mount
This is the use case I'm trying to solve: we have files coming from HDFS that we want to load into a set of Vertica tables. I think we're going to opt for a solution that lands the data in a flex table, which we then parse into a set of persisted tables.
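For concreteness, a minimal sketch of that flex-table pattern, assuming JSON input; the table and column names here are hypothetical:

```sql
-- Land the raw JSON in a flex table (schema-on-read).
CREATE FLEX TABLE staging_events();

-- Load the JSON files with Vertica's built-in JSON parser.
COPY staging_events FROM '/mnt/nfs/events/*.json' PARSER fjsonparser();

-- Persist the keys we care about into a regular columnar table.
-- Flex tables expose JSON keys as queryable virtual columns.
CREATE TABLE events AS
SELECT event_id::INT, event_time::TIMESTAMP, payload::VARCHAR(1000)
FROM staging_events;
```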
Here is my question: when we run the COPY command against the JSON files coming from Hadoop, can I put the files on an NFS share that is mounted on all the nodes in our Vertica cluster and do a COPY ... ON ANY NODE? Will that achieve parallelism across the entire cluster? What is the fastest way to load data as described above?
Cheers,
Eric
Comments
Hi Eric,
Yes. What you've described will work. Also, use DIRECT to bypass the WOS.
https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/AdministratorsGuide/BulkLoadCOPY/UsingParallelLoadStreams.htm
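To illustrate the NFS approach: assuming the share is mounted at the same path on every node, a load along these lines (paths and table name are illustrative) lets the cluster split the file list:

```sql
-- The NFS mount must be visible at the same path on all nodes.
-- ON ANY NODE lets Vertica distribute the matched files across the
-- cluster; DIRECT writes straight to ROS, bypassing the WOS.
COPY staging_events
FROM '/mnt/nfs/events/*.json' ON ANY NODE
PARSER fjsonparser()
DIRECT;
```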
Regards
Gayatri
Thanks for your response. Is this a design pattern you've used before? Just wondering if there is anything else I should be considering. The reason I'm pushing for this solution is so I don't have to copy the files to all nodes (i.e., to reduce network traffic).
Yes. A lot of our customers use this approach and have been successful.
Thanks
Gayatri