Upgrade to 9.0.1 - difficulties transitioning from "SOURCE HDFS(...)" to direct HDFS
With support for the HDFS Connector dropped in 9.0.1, I need to convert many HDFS COPY scripts to the direct HDFS syntax. I have done this successfully (and tested it), but it only works if I configure a single Hadoop cluster.
That is, I have an hdfs-site.xml and a core-site.xml file in the directory /etc/hadoop/conf/ENV1 (distributed to all nodes), and set:
ALTER DATABASE srvvertica SET HadoopConfDir = '/etc/hadoop/conf/ENV1';
There is no problem: SELECT VERIFY_HADOOP_CONF_DIR(); returns no errors, and COPY table FROM 'hdfs:///file.dat' works perfectly.
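For context, the property the later error complains about lives in each directory's core-site.xml. A minimal sketch of what ENV1's file declares (the nameservice name "env1-ns" is a placeholder, not from my actual config):

```xml
<!-- Sketch of /etc/hadoop/conf/ENV1/core-site.xml.
     "env1-ns" is a placeholder nameservice name. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://env1-ns</value>
  </property>
</configuration>
```

ENV2's core-site.xml declares its own fs.defaultFS in the same way, pointing at that cluster's nameservice.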
Our database pulls from multiple Hadoop clusters. This appears to be supported, and is documented here: https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/HadoopIntegrationGuide/libhdfs/ConfiguringAccessToHDFS.htm?TocPath=Integrating%20with%20Apache%20Hadoop|Reading%20Directly%20from%20HDFS|_____1
under the sub-section "Using More Than One Hadoop Cluster".
However, when I try to configure both clusters by setting:
ALTER DATABASE srvvertica SET HadoopConfDir = '/etc/hadoop/conf/ENV1:/etc/hadoop/conf/ENV2';
I get the following validation error for each node:
SELECT VERIFY_HADOOP_CONF_DIR ();
v_node0001: Configuration at [/etc/hadoop/conf/ENV2] declares defaultFS but it was already declared in the configuration at [/etc/hadoop/conf/ENV1]
If I remove fs.defaultFS from ENV2 (which doesn't make sense, but was worth a shot), I get the opposite error:
v_node0001: No fs.defaultFS parameter found in config files in [/etc/hadoop/conf/ENV2]
I should note that the problem is not specific to ENV2: if I switch back to a single-cluster configuration pointing at ENV2, that also works.
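For completeness, this is roughly what I am ultimately trying to run once both clusters are configured, using fully qualified URLs to disambiguate the source cluster (a sketch only; "env1-ns", "env2-ns", and the paths are placeholders):

```sql
-- Sketch: placeholder nameservice names and file paths.
ALTER DATABASE srvvertica
  SET HadoopConfDir = '/etc/hadoop/conf/ENV1:/etc/hadoop/conf/ENV2';

-- Load from the first cluster, naming its nameservice explicitly.
COPY table1 FROM 'hdfs://env1-ns/file.dat';

-- Load from the second cluster in the same database.
COPY table2 FROM 'hdfs://env2-ns/file.dat';
```

Has anyone gotten VERIFY_HADOOP_CONF_DIR() to pass with two conf directories, each declaring its own fs.defaultFS?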