Integration of Vertica with Hive LLAP

Hi,

Vertica's documentation states that "Vertica [HCatalog Connector] supports authorization services and Hive LLAP."
Vertica's documentation also states that "When the query executes, all nodes in the Vertica cluster directly retrieve the data necessary for completing the query from HDFS".

What does this mean in terms of Vertica's interaction with Hive LLAP? Does Vertica fetch data from LLAP daemons instead of HDFS?

Thanks in advance.

Comments

  • kguan (Employee)

    Vertica doesn't talk to the LLAP daemons directly. The Vertica HCatalog Connector contacts HiveServer2, and HiveServer2 can use LLAP to process the Hive query. This integration lets Vertica use the HiveServer2 Interactive JDBC URL to access Hive external tables whose data may be stored on HDFS.

    To do that, you need to specify the hive2 directories when running hcatUtil, so that it copies the Hive configuration and JAR files into the Vertica hcatLibPath. Please refer to:
    https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/HadoopIntegrationGuide/HCatalogConnector/ConfiguringVerticaForHCatalog.htm
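For readers who want something concrete, here is a minimal sketch of how such an hcatUtil call might be assembled. It is not taken from this thread: the flag names (--copyJars, --hadoopHiveHome, --hadoopHiveConfPath, --hcatLibPath) are the ones described on the documentation page linked above, and every directory path is a hypothetical HDP-style location that you would need to replace with the hive2 directories on your own cluster. The helper name build_hcatutil_command is made up for illustration.

```python
import shlex


def build_hcatutil_command(hive_home_dirs, hive_conf_dirs, hcat_lib_path):
    """Return the argument list for an `hcatUtil --copyJars` invocation.

    hcatUtil takes several directories per option as a single
    semicolon-separated string, so each list is joined with ';'.
    """
    return [
        "hcatUtil",
        "--copyJars",
        "--hadoopHiveHome=" + ";".join(hive_home_dirs),
        "--hadoopHiveConfPath=" + ";".join(hive_conf_dirs),
        "--hcatLibPath=" + hcat_lib_path,
    ]


# ASSUMPTION: all paths below are hypothetical examples, not values from
# this thread. For LLAP, point at the hive2 library and conf directories.
command = build_hcatutil_command(
    hive_home_dirs=[
        "/usr/hdp/current/hadoop-client/lib",
        "/usr/hdp/current/hive-server2-hive2/lib",
    ],
    hive_conf_dirs=["/etc/hadoop/conf", "/etc/hive2/conf"],
    hcat_lib_path="/opt/vertica/packages/hcat/lib",
)

# Print a shell-quoted command line for review; nothing is executed here.
print(shlex.join(command))
```

The script only prints the command so it can be checked before being run on a Vertica node. The semicolon-separated values have to stay quoted in a shell, which is why shlex.join is used instead of a plain string join.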

  • edited March 2019

    Thank you kguan for your response.

    My understanding of the documentation ( https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/HadoopIntegrationGuide/HCatalogConnector/HowTheHCatalogConnectorWorks.htm ) is that Vertica contacts HiveServer2 only to retrieve the Hive table's metadata. Query parsing and planning are performed entirely by Vertica, not Hadoop. The Vertica engine then fetches the Hive table's data directly by reading the HDFS files.

    Is this correct? Does this mean that Vertica's compatibility with Hive LLAP amounts only to compatibility with a "HiveServer2 Interactive" endpoint for metadata retrieval, and that Vertica does not use the performance optimizations that LLAP provides?

  • kguan (Employee)

    Hi OlivierA, your understanding is correct. Vertica accesses the data on HDFS directly after retrieving the Hive table's metadata.

    For metadata retrieval, however, Vertica still gets some performance benefit from LLAP by using the HiveServer2 Interactive JDBC URL.
