Spark DF to Vertica: "Driver not capable"
I am trying to write a Spark DataFrame of Rows to Vertica via JDBC
in the following manner:
dataframe.write().mode(SaveMode.Append).jdbc(url, table, properties);
This works when there are no NULL values in any of the rows. When
rows contain NULL values, I get the following error:
17/06/27 14:51:12 INFO orc.RecordReaderFactory: Schema is not specified on read. Using file schema.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 (TID 6, localhost): java.sql.SQLFeatureNotSupportedException: [Vertica]JDBC Driver not capable.
at com.vertica.exceptions.ExceptionConverter.toSQLException(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.checkTypeSupported(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.setNull(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:181)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:277)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:276)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$35.apply(RDD.scala:927)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$35.apply(RDD.scala:927)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1881)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1881)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
My environment: Spark 1.6.1, Vertica Analytic Database v8.1.0-3, drivers vertica-8.1.0_spark1.6_scala2.10.jar and vertica-jdbc-8.1.0-3.jar.
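From the stack trace, the failing call is PreparedStatement.setNull. My assumption (not verified) is that Spark's generic JDBC writer passes java.sql.Types.CLOB as the null type code for String columns and the Vertica driver rejects it. A bare-JDBC test along these lines should show whether the driver itself is the problem (host, database, credentials and table name are placeholders, not my real setup):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Types;

// Standalone check outside Spark: does the Vertica JDBC driver accept
// setNull with Types.CLOB? All connection details below are placeholders.
public class SetNullCheck {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:vertica://example-host:5433/exampledb", "dbadmin", "password")) {
            PreparedStatement stmt =
                    conn.prepareStatement("INSERT INTO some_table (text_col) VALUES (?)");
            stmt.setNull(1, Types.CLOB); // I expect this to fail with "Driver not capable"
            stmt.executeUpdate();
        }
    }
}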
Any suggestions on how to solve this issue? Thanks.
Comments
It worked.
Also, see this thread, and this comment in particular, for creating a custom dialect that handles TEXT nulls better; a rough sketch of that approach is below.
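The idea, as I understand it, is to register a custom JdbcDialect so Spark stops using its default StringType -> TEXT/CLOB mapping, which the Vertica driver cannot handle in setNull. This is only a sketch from memory; the class name, the VARCHAR(65000) size, and the URL prefix check are my own choices, adjust them for your schema:

import java.sql.Types;

import org.apache.spark.sql.jdbc.JdbcDialect;
import org.apache.spark.sql.jdbc.JdbcType;
import org.apache.spark.sql.types.DataType;
import org.apache.spark.sql.types.DataTypes;

import scala.Option;

// Sketch of a Vertica-specific dialect: map StringType to VARCHAR so the
// JDBC writer calls setNull with Types.VARCHAR instead of Types.CLOB.
public class VerticaDialect extends JdbcDialect {

    @Override
    public boolean canHandle(String url) {
        // Only apply this dialect to Vertica JDBC URLs.
        return url.toLowerCase().startsWith("jdbc:vertica");
    }

    @Override
    public Option<JdbcType> getJDBCType(DataType dt) {
        if (DataTypes.StringType.equals(dt)) {
            // VARCHAR(65000) is an arbitrary size I picked; adjust to your data.
            return Option.apply(new JdbcType("VARCHAR(65000)", Types.VARCHAR));
        }
        // Everything else falls back to Spark's default mappings.
        return Option.<JdbcType>empty();
    }
}

Register it once before the write, i.e. call org.apache.spark.sql.jdbc.JdbcDialects.registerDialect(new VerticaDialect()) before dataframe.write().mode(SaveMode.Append).jdbc(url, table, properties), so the JDBC writer picks up the VARCHAR null type for string columns.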
Hi stellaa9x, what versions of Spark and Vertica are you running - the same as above, or hopefully newer? The solution options differ a bit depending on the versions.