Problem with HDFS-Vertica connector
I installed the HDFS-Vertica connector on all hosts and spun up an EMR job with webHDFS enabled on port 9101.

1) To test webHDFS, I tried the following:

----------------------------------------------------------------------------------------------
/root# curl -i -L "http://hadoop-namenode:9101/webhdfs/tmp/test.txt?op=OPEN&user.name=hadoop"
----------------------------------------------------------------------------------------------
HTTP/1.1 307 TEMPORARY_REDIRECT
....
HTTP/1.1 200 OK
.........
A|1|2|3
B|4|5|6

2) Now, after verifying webHDFS, I tried the native COPY from Vertica to copy files from HDFS:

----------------------------------------------------------------------------------------------
-> COPY qa.testtable SOURCE Hdfs(url='http://hadoop-namenode:9101/webhdfs/tmp/test.txt', username='hadoop');
----------------------------------------------------------------------------------------------
ERROR 0: Error calling plan() in User Function HdfsFactory at [src/Hdfs.cpp:198], error code: 0, message: No files match [http://hadoop-namenode:9101/webhdfs/tmp/test.txt]

Does anyone have an idea why the COPY fails, even though the file is present on HDFS? Please let me know if you see any mistake here.
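One thing worth double-checking (an observation about the standard WebHDFS REST API, not something stated in the post): WebHDFS URLs normally include a `/webhdfs/v1/` prefix before the HDFS path, and the URLs above use `/webhdfs/` without the `v1` component. The same would apply to the `url` parameter of the `Hdfs()` source in the COPY statement. A minimal sketch of how the URL would usually be formed, reusing the host, port, and path from the post:

```shell
# Sketch: the standard WebHDFS URL layout is
#   http://<namenode>:<port>/webhdfs/v1/<hdfs-path>?op=...
# Host, port, and file path are taken from the post above; the "v1"
# component is the part that appears to be missing from the failing URLs.
NAMENODE="hadoop-namenode:9101"
HDFS_PATH="tmp/test.txt"
URL="http://${NAMENODE}/webhdfs/v1/${HDFS_PATH}?op=OPEN&user.name=hadoop"
echo "$URL"
# To actually fetch the file:
#   curl -i -L "$URL"
```

If the curl with the `v1` prefix also returns the file, the same corrected URL can be tried in the COPY statement.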
Comments
Hi Ravi,
The default port for webHDFS on my distribution was 9101.
The way to enable webHDFS is to add a bootstrap action to the EMR job, as shown below.
----------------------------------------------------------------------------------------------
--bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop --args "-s,dfs.webhdfs.enabled=true"
----------------------------------------------------------------------------------------------

The full launch command I used was:

----------------------------------------------------------------------------------------------
ruby elastic-mapreduce --create --name "Demo Instance" --alive --num-instances 1 --instance-type m1.small --ami-version latest --log-uri s3://<myawsbucket>/logs --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-daemons --args "-h,dfs.webhdfs.enabled=true"
----------------------------------------------------------------------------------------------
But it is not working for me. Any help is appreciated.
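One hedged suggestion, based on the documented conventions of the old EMR bootstrap actions rather than anything in the post: `configure-hadoop` takes `-h,key=value` pairs to merge settings into `hdfs-site.xml`, whereas `configure-daemons` adjusts daemon JVM settings rather than site configuration. Since `dfs.webhdfs.enabled` is an `hdfs-site.xml` property, a variant worth trying might be:

```shell
# Hypothetical variant (not from the post): pass the hdfs-site.xml setting
# to configure-hadoop via its -h flag, which targets hdfs-site.xml.
ruby elastic-mapreduce --create --name "Demo Instance" --alive \
  --num-instances 1 --instance-type m1.small --ami-version latest \
  --log-uri s3://<myawsbucket>/logs \
  --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop \
  --args "-h,dfs.webhdfs.enabled=true"
```

After the cluster comes up, whether the setting took effect can be verified by checking `hdfs-site.xml` on the namenode or by hitting the webHDFS port with curl.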