Spark DF to Vertica: "Driver not capable"
I am trying to write a Spark DataFrame of Rows to Vertica via JDBC
in the following manner:
dataframe.write().mode(SaveMode.Append).jdbc(url, table, properties);
This works when there are no NULL values in any of the rows. When
rows contain NULL values, I get the following error:
17/06/27 14:51:12 INFO orc.RecordReaderFactory: Schema is not specified on read. Using file schema.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 (TID 6, localhost): java.sql.SQLFeatureNotSupportedException: [Vertica]JDBC Driver not capable.
at com.vertica.exceptions.ExceptionConverter.toSQLException(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.checkTypeSupported(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.setNull(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:181)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:277)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:276)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$35.apply(RDD.scala:927)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$35.apply(RDD.scala:927)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1881)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1881)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
My environment: Spark 1.6.1, Vertica Analytic Database v8.1.0-3, drivers vertica-8.1.0_spark1.6_scala2.10.jar and vertica-jdbc-8.1.0-3.jar.
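From the stack trace, the failing call is PreparedStatement.setNull. My assumption (not verified) is that Spark's generic JDBC writer passes java.sql.Types.CLOB as the null type code for String columns and the Vertica driver rejects it. A bare-JDBC test along these lines should show whether the driver itself is the problem (host, database, credentials and table name are placeholders, not my real setup):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Types;

// Standalone check outside Spark: does the Vertica JDBC driver accept
// setNull with Types.CLOB? All connection details below are placeholders.
public class SetNullCheck {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:vertica://example-host:5433/exampledb", "dbadmin", "password")) {
            PreparedStatement stmt =
                    conn.prepareStatement("INSERT INTO some_table (text_col) VALUES (?)");
            stmt.setNull(1, Types.CLOB); // I expect this to fail with "Driver not capable"
            stmt.executeUpdate();
        }
    }
}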
Any suggestions on how to solve this issue? Thanks.
Comments
It worked.
Also, see this thread, and this comment in particular, for creating a custom dialect that handles TEXT nulls better; a rough sketch of that approach is below.
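The idea, as I understand it, is to register a custom JdbcDialect so Spark stops using its default StringType -> TEXT/CLOB mapping, which the Vertica driver cannot handle in setNull. This is only a sketch from memory; the class name, the VARCHAR(65000) size, and the URL prefix check are my own choices, adjust them for your schema:

import java.sql.Types;

import org.apache.spark.sql.jdbc.JdbcDialect;
import org.apache.spark.sql.jdbc.JdbcType;
import org.apache.spark.sql.types.DataType;
import org.apache.spark.sql.types.DataTypes;

import scala.Option;

// Sketch of a Vertica-specific dialect: map StringType to VARCHAR so the
// JDBC writer calls setNull with Types.VARCHAR instead of Types.CLOB.
public class VerticaDialect extends JdbcDialect {

    @Override
    public boolean canHandle(String url) {
        // Only apply this dialect to Vertica JDBC URLs.
        return url.toLowerCase().startsWith("jdbc:vertica");
    }

    @Override
    public Option<JdbcType> getJDBCType(DataType dt) {
        if (DataTypes.StringType.equals(dt)) {
            // VARCHAR(65000) is an arbitrary size I picked; adjust to your data.
            return Option.apply(new JdbcType("VARCHAR(65000)", Types.VARCHAR));
        }
        // Everything else falls back to Spark's default mappings.
        return Option.<JdbcType>empty();
    }
}

Register it once before the write, i.e. call org.apache.spark.sql.jdbc.JdbcDialects.registerDialect(new VerticaDialect()) before dataframe.write().mode(SaveMode.Append).jdbc(url, table, properties), so the JDBC writer picks up the VARCHAR null type for string columns.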
Hi stellaa9x, what versions of Spark and Vertica are you running - the same as above, or hopefully newer? The solution options differ a bit depending on the versions.