Copy Command Question
One of my customers asked me a question, and I am hoping someone can provide an answer!
Here is the text I got:
This is the COPY command that we currently use to load the data directly from HDFS. As you can see, the load does not skip any columns; the data is loaded from the file as is.
COPY testdate SOURCE Hdfs(url='http://isr-r0-zip-nam-1.lab.il.nice.com:50070/webhdfs/v1/tmp/date.txt',username='ceazip') DELIMITER '|' NULL '' DIRECT ABORT ON ERROR;
As part of a new requirement, we would like to skip some columns during the data load. I looked at the following document: https://thisdataguy.com/2013/12/19/the-vertica-copy-statement/
I tried to implement column skipping, with success. However, it seems that in order to use this option, the data must be loaded from a local path or stdin (not from HDFS), and the command must include a FROM clause.
The Question:
Does the data have to be loaded from a local path or stdin, and must the statement include a FROM clause?
Reading the documentation, I would say yes, but it would be great if someone could corroborate that!
Comments
Hi!
You can skip cols by implementing your own filter:
PS
COPY <table> SOURCE Hdfs(url=...)
- it's just an implementation of a UDL Source
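To illustrate the filter idea, here is a minimal sketch in plain Python of the per-record logic a column-skipping filter would need. It is not written against the Vertica UDL SDK; the function names `drop_columns` and `filter_stream` and the sample rows are hypothetical, and the sketch assumes simple pipe-delimited records with no escaped or quoted delimiters.

```python
def drop_columns(line, skip, delimiter="|"):
    """Return `line` with the zero-based field positions in `skip` removed."""
    fields = line.rstrip("\n").split(delimiter)
    kept = [field for i, field in enumerate(fields) if i not in skip]
    return delimiter.join(kept)


def filter_stream(lines, skip, delimiter="|"):
    """Lazily apply drop_columns to every record in an iterable of lines."""
    for line in lines:
        yield drop_columns(line, skip, delimiter)


# Hypothetical sample rows; the real layout of date.txt is not known.
sample = ["2013-12-19|ignored|42\n", "2013-12-20||7\n"]
result = list(filter_stream(sample, skip={1}))
print(result)
```

A real filter would have to do the same work on the byte stream handed over by the source (HDFS here) before the parser sees it, so the remaining fields line up with the target table's columns.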