Kerberos Support in the HDFS connector

Our Hadoop environment is completely Kerberized. We are running CDH4. We have followed all of the directions for setting up the HDFS connector on a Vertica 7.0.1 environment. I am finding very little documentation on how to then use Kerberos when writing the query.

Per the directions, we can test with curl and that works fine, but anything we run in vsql fails with various errors.

My assumption of how it works, please correct me if I am wrong:
1) Become the user that Vertica runs as on a Vertica node
2) kinit -k -t <PATH TO KEYTAB> <KERB PRINCIPAL NAME>
3) Run the following test query "COPY public.testTable SOURCE Hdfs (url='http://<HADOOP NAME NODE>:50070/webhdfs/v1/tmp/test.txt', username='<KERB PRINCIPAL NAME>');"
4) It copies this file into the test table.

We get the following error (sensitive data redacted):
ERROR 3399: Failure in UDx RPC call InvokePlanUDL(): Error calling planUDL() in User Defined Object [Hdfs] at [src/Hdfs.cpp:307], error code: 0, message: [The requested URL returned error: 401. URL: http://<HADOOP NAME NODE>:50070/webhdfs/v1/tmp/test.txt?user.name=<KERB PRINCIPAL NAME>&op=GETFILESTATUS]

I am wondering what, if anything, we are doing wrong, and how one can pass a Kerberos principal as the username in the statement in step 3.
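For anyone trying to reproduce this outside vsql, here is a rough sketch that compares the two request forms with curl. It assumes a valid ticket from the kinit in step 2 and a curl build with GSS-API/SPNEGO support. The host name, the principal, and the `status_url` helper and `RUN_CURL` switch are placeholders invented for illustration; the port and path come from the query above. The URL in the error carries `user.name=`, which is the form WebHDFS accepts when security is off, whereas a Kerberized WebHDFS normally expects SPNEGO (`--negotiate -u :` in curl), so a 401 on the first request and a 200 on the second would be consistent with the error we see.

```shell
#!/bin/sh
# Placeholder values (assumptions, not from our real environment).
NAMENODE="${NAMENODE:-namenode.example.com}"
HDFS_PATH="${HDFS_PATH:-/tmp/test.txt}"
PRINCIPAL="${PRINCIPAL:-someuser@EXAMPLE.COM}"

# Build the GETFILESTATUS URL; the optional first argument is an extra
# query parameter such as "user.name=someuser".
status_url() {
    url="http://${NAMENODE}:50070/webhdfs/v1${HDFS_PATH}?op=GETFILESTATUS"
    if [ -n "${1:-}" ]; then
        url="${url}&$1"
    fi
    printf '%s\n' "$url"
}

# Requests are only sent when RUN_CURL=1, so the script is safe to source.
if [ "${RUN_CURL:-0}" = "1" ]; then
    # Form 1: user.name parameter only, as in the URL from the error message.
    curl -s -o /dev/null -w 'user.name only: %{http_code}\n' \
        "$(status_url "user.name=${PRINCIPAL}")"

    # Form 2: SPNEGO, using the ticket cache populated by kinit.
    curl -s -o /dev/null -w 'SPNEGO: %{http_code}\n' \
        --negotiate -u : "$(status_url)"
fi
```

Run it as the Vertica user after kinit, for example `RUN_CURL=1 NAMENODE=<HADOOP NAME NODE> sh check_webhdfs.sh` (the script name is arbitrary).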
