Export Parquet files to HDFS
I want to export a table to HDFS, but I get the error below. What am I doing wrong?
shr2_vrt_itg_001=> select * from pa_itg.test;
a | b | c | d
----+----+----+-------
1 | 3 | 5 | odds
11 | 13 | 19 | odds
10 | 14 | 8 | evens
(3 rows)
The HDFS directory has permissions 766:
shr2_vrt_itg_001=> ! hdfs dfs -ls -d /MAPR/DATA
drwxrw-rw- - srvc_haven_hitg ldap_haven_hitg 0 2019-02-12 23:26 /MAPR/DATA
shr2_vrt_itg_001=> EXPORT TO PARQUET(directory = 'hdfs:///MAPR/DATA') AS SELECT * FROM pa_itg.test;
WARNING 2005: Directory [/etc/hadoop/conf] does not exist
ERROR 8198: Unable to verify if directory [hdfs:///MAPR/DATA/] exists due to 'Error listing directory [hdfs:///MAPR/DATA] Could not connect to [hdfs://]'
The Vertica account I ran this as (e.g. shr2_vrt_itg_001) is not a DB admin account.
Appreciate your help on this!
Liz
Comments
Is it because I ran this export from a local server rather than from the Vertica server?
Yes. When you use an hdfs:// URL, Vertica tries to get the Hadoop cluster information from the XML configuration files in /etc/hadoop/conf, which is the default location it checks when the HadoopConfDir parameter is not set. The WARNING 2005 in your output shows that this directory does not exist, so Vertica cannot resolve hdfs:/// to a cluster. See the following URL for more information:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/HadoopIntegrationGuide/libhdfs/ConfiguringAccessToHDFS.htm
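As a rough sketch of what the linked page describes, the configuration can be pointed at and checked from vsql roughly as follows. This assumes the Hadoop client files (core-site.xml, hdfs-site.xml) have already been copied to the same path on every Vertica node; the database name mydb and the path /etc/hadoop/conf are placeholders, and setting the parameter needs a superuser, so a non-admin account would have to ask the DBA to run the first statement.

```sql
-- Assumption: mydb is a placeholder database name, and /etc/hadoop/conf
-- holds core-site.xml and hdfs-site.xml on every Vertica node.
-- Requires superuser privileges.
ALTER DATABASE mydb SET HadoopConfDir = '/etc/hadoop/conf';

-- Ask Vertica to validate the Hadoop configuration on all nodes.
SELECT VERIFY_HADOOP_CONF_DIR();
```

If the verification reports no problems, the original EXPORT TO PARQUET statement can be retried.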