Spark/Scala connection to Vertica DB over JDBC is failing

msuzuki Registered User

I am trying to connect from Spark v2.3.1 (Scala 2.11.8) to a Vertica DB over JDBC.

On your webpage:

It says that I can download the Spark Connector at the following location:

It takes me to a login page. After logging in, a message in a yellow box says that I do not have permission to view that download.

I tried to download from your drivers page. Under the Linux package there are three jar files (vertica-javadoc, vertica-jdbc, vertica-jdbc-8.0.1-0), but I cannot find the Vertica-Spark Connector (e.g. vertica-8.1.0_spark2.0_scala2.11.jar).

Here is my Spark/Scala jdbc script:

val url = "jdbc:vertica//hostname/DBName?username=username&password=pw"

val query = "SELECT * FROM TABLE;"

val df = spark.read.format("jdbc")
.option("driver", "com.vertica.jdbc.Driver")
.option("url", url)
.option("dbtable", query)
.load()

I am using the Scala Eclipse IDE and have loaded the three jar files that come with the Linux driver download (vertica-javadoc, vertica-jdbc, vertica-jdbc-8.0.1-0)

and I get the error:
Exception in thread "main" java.lang.NullPointerException
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:70)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:115)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:52)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)

What am I doing wrong? Am I missing the Vertica-Spark Connector?

Thank you! Markus.


  • Connyrt-emp1 Employee, Registered User

    Hi Markus,

    Starting with Vertica Server version 9.1, the Spark connector jars are distributed as part of the Server RPM. You can download the community edition RPM from:

    The Spark Connector 2.1 works with Spark 2.2 and 2.3. (This is why we are not distributing a separate jar for Spark 2.2 and 2.3).

    You need to place both the connector jar and the JDBC driver jar on the Spark cluster's file system. (FYI, JDBC driver version 9.1.1 is now backward compatible with older versions of the Vertica Server.)
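Separately from the connector, a plain JDBC read needs a well-formed URL and a `dbtable` value that Spark can embed in a FROM clause. The sketch below is only an illustration, not an official Vertica helper: the object and method names are hypothetical, 5433 is assumed as the port (Vertica's default), and credentials are passed through Spark's `user` and `password` options instead of the URL. Note that the URL in the question is missing the colon after `vertica`, which may be what leads to the NullPointerException.

```scala
// Hypothetical helper: builds the option map for a plain Spark JDBC read.
object VerticaJdbcOptions {

  // Vertica JDBC URLs take the form jdbc:vertica://host:port/database.
  def url(host: String, port: Int, database: String): String =
    s"jdbc:vertica://$host:$port/$database"

  // Spark places `dbtable` inside "SELECT * FROM <dbtable> ...", so a query
  // must be a parenthesized subquery with an alias and no trailing semicolon.
  def asDbtable(query: String, alias: String = "q"): String = {
    val trimmed = query.trim.stripSuffix(";").trim
    s"($trimmed) AS $alias"
  }

  // Option keys are those of Spark's built-in JDBC data source.
  def options(host: String, port: Int, database: String,
              user: String, password: String, query: String): Map[String, String] =
    Map(
      "driver"   -> "com.vertica.jdbc.Driver",
      "url"      -> url(host, port, database),
      "dbtable"  -> asDbtable(query),
      "user"     -> user,
      "password" -> password
    )
}
```

Assuming a SparkSession named `spark`, the map could then be used as `spark.read.format("jdbc").options(VerticaJdbcOptions.options("hostname", 5433, "DBName", "username", "pw", "SELECT * FROM TABLE;")).load()`, with the Vertica JDBC driver jar on the classpath.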

    This is the location where the Spark connector jars are located inside the RPM:


    We distribute two jars:

    vertica-spark2.0_scala2.11.jar

    vertica-spark2.1_scala2.11.jar (This connector is compatible with Spark 2.2 and Spark 2.3)

    Here is a list of commands you can run to extract the two Spark connector jars from the 9.1.1 Vertica Community Edition RPM:

    Copy the RPM to a scratch directory (here, junk).
    cd into that directory.
    [junk]# rpm -lqp vertica-9.1.1-0.x86_64.RHEL6.rpm | grep spark
    [junk]# rpm2cpio vertica-9.1.1-0.x86_64.RHEL6.rpm | cpio -idv ./opt/vertica/packages/SparkConnector/lib/vertica-spark2.0_scala2.11.jar
    2459746 blocks
    [junk]# rpm2cpio vertica-9.1.1-0.x86_64.RHEL6.rpm | cpio -idv ./opt/vertica/packages/SparkConnector/lib/vertica-spark2.1_scala2.11.jar
    2459746 blocks
    [junk]# ls -l ./opt/vertica/packages/SparkConnector/lib
    total 592
    -rw-r--r--. 1 root root 301786 Jul 22 14:08 vertica-spark2.0_scala2.11.jar
    -rw-r--r--. 1 root root 301857 Jul 22 14:08 vertica-spark2.1_scala2.11.jar

    Let us know if this works for you.
