Why am I getting the exception below while reading data from a Vertica database using the vertica-spark connector?

java.lang.RuntimeException: Type array[varchar is not supported.
at scala.sys.package$.error(package.scala:27)
at com.vertica.spark.seg.SegmentsMetaInfo$$anonfun$getSyntheticSegExpr$2.apply(SegmentsMetaInfo.scala:228)
at com.vertica.spark.seg.SegmentsMetaInfo$$anonfun$getSyntheticSegExpr$2.apply(SegmentsMetaInfo.scala:225)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at com.vertica.spark.seg.SegmentsMetaInfo$class.getSyntheticSegExpr(SegmentsMetaInfo.scala:225)
at com.vertica.spark.datasource.DefaultSource.getSyntheticSegExpr(VerticaSource.scala:15)
at com.vertica.spark.seg.SegmentsMetaInfo$class.initSegInfo(SegmentsMetaInfo.scala:50)
at com.vertica.spark.datasource.DefaultSource.initSegInfo(VerticaSource.scala:15)
at com.vertica.spark.datasource.DefaultSource.createRelation(VerticaSource.scala:44)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at com.turn.platform.verticaconnectorpoc.BasicReadWriteExamplesConnectorSQLContext$.main(BasicReadWriteExamplesConnectorSQLContext.scala:66)
at com.turn.platform.verticaconnectorpoc.BasicReadWriteExamplesConnectorSQLContext.main(BasicReadWriteExamplesConnectorSQLContext.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Answers

  • SruthiA (Administrator)

    What Vertica version are you using? Is it an old connector? If so, use the latest connector with Vertica 11.1.1 and above.

    https://github.com/vertica/spark-connector
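
    If it helps, here is a minimal build.sbt sketch for pulling in the current connector. The Maven coordinates and version pairing are an assumption; check the GitHub releases page for the exact artifact name. Note that the new connector is documented as targeting Spark 3.x / Scala 2.12, not Spark 2.3 / Scala 2.11:

    // build.sbt sketch -- coordinates/versions are assumptions, verify before use
    scalaVersion := "2.12.18"  // the new connector is built against Scala 2.12

    libraryDependencies ++= Seq(
      "com.vertica.spark" % "vertica-spark" % "3.3.5",
      "org.apache.spark" %% "spark-sql" % "3.3.2" % Provided
    )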

  • I am using Vertica Database 23.04 and the JARs below, because of Spark 2.3 and Scala 2.11 version dependencies:

    • vertica-spark2.1_scala2.11.jar
    • vertica-jdbc-11.0.0-0.jar
  • SruthiA (Administrator)

    I think you are using the old connector. Could you please try the new connector from GitHub?

  • I tried using the latest version of the Vertica Spark connector (3.3.5) but still got the same error.

  • SruthiA (Administrator)

    Could you please share your code?

  • Here is the complete code:

    // Note: hostname, username, dbname, hdfs_url, password and tableName are
    // defined elsewhere in the original program.
    import org.apache.spark.sql.SparkSession

    object BasicReadWriteExamplesConnector {

      def main(args: Array[String]): Unit = {

        val spark = SparkSession.builder()
          .appName("Vertica - Spark Connector Scala Example")
          .enableHiveSupport()
          .master("yarn")
          .config("spark.jars", "/tmp/vertica-spark-3.3.5.jar,/tmp/vertica-jdbc-23.4.0-0.jar")
          .getOrCreate()

        spark.sparkContext.setLogLevel("INFO")

        val options = Map(
          "host" -> hostname,
          "user" -> username,
          "db" -> dbname,
          "hdfs_url" -> hdfs_url,
          "password" -> password,
          "table" -> tableName,
          "query" -> "select string_column_name from {dbname.tableName} limit 10;"
        )

        val VERTICA_SOURCE = "com.vertica.spark.datasource.DefaultSource"

        try {
          val dfRead = spark.read.format(VERTICA_SOURCE)
            .options(options)
            .load()

          dfRead.show()
        } catch {
          // printStackTrace() returns Unit, so wrapping it in println() only prints "()".
          case ex: Exception => ex.printStackTrace()
        } finally {
          spark.close()
        }
      }
    }

  • SruthiA (Administrator)

    I notice that you are still using the old default source class. Could you please change it as below:

    val VERTICA_SOURCE = "com.vertica.spark.datasource.VerticaSource"

    https://docs.vertica.com/24.4.x/en/spark-integration/migrating-from-legacy-spark-connector/#defaultsource-class-renamed-verticasource
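
    For illustration, here is a minimal read sketch against the new source class, assuming the option names documented for the new connector (note that "staging_fs_url" replaces the legacy "hdfs_url"); host, credentials and table values are hypothetical placeholders:

    import org.apache.spark.sql.SparkSession

    object VerticaSourceReadSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("VerticaSource read sketch")
          .getOrCreate()

        // Option names follow the new connector's documentation;
        // all values here are placeholders.
        val options = Map(
          "host" -> "vertica-host",
          "user" -> "dbadmin",
          "db" -> "mydb",
          "password" -> "secret",
          "table" -> "mytable",
          "staging_fs_url" -> "hdfs://namenode:8020/tmp/vertica-staging"
        )

        val df = spark.read
          .format("com.vertica.spark.datasource.VerticaSource")
          .options(options)
          .load()

        df.show()
        spark.stop()
      }
    }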

  • When I tried using "com.vertica.spark.datasource.VerticaSource", I got the error below:

    java.lang.ClassNotFoundException: Failed to find data source: com.vertica.spark.datasource.VerticaSource. Please find packages at http://spark.apache.org/third-party-projects.html
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
    at com.turn.platform.verticaconnectorpoc.BasicReadWriteExamplesConnector$.main(BasicReadWriteExamplesConnector.scala:70)
    at com.turn.platform.verticaconnectorpoc.BasicReadWriteExamplesConnector.main(BasicReadWriteExamplesConnector.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: com.vertica.spark.datasource.VerticaSource.DefaultSource
    at java.lang.ClassLoader.findClass(ClassLoader.java:523)
    at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.java:35)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.java:40)
    at org.apache.spark.util.ChildFirstURLClassLoader.loadClass(ChildFirstURLClassLoader.java:48)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:634)
    at scala.util.Try$.apply(Try.scala:192)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:634)
    at scala.util.Try.orElse(Try.scala:84)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:634)

  • SruthiA (Administrator)

    You need to import the package below; it was newly added:

    import com.vertica.spark._

  • I am still getting the same error even after adding "import com.vertica.spark._". Is there any detailed documentation on how the Vertica Spark connector works?

  • SruthiA (Administrator)

    The Spark connector is open source and everything is on GitHub. Please review the videos. Your connector is somehow still picking up the old version, which is why you are receiving the "complex types not supported" error.

    https://github.com/vertica/spark-connector/tree/main
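
    One way to confirm which connector classes are actually on the driver classpath, and which jar each one resolves from, is a small diagnostic like the sketch below (paste into spark-shell; the locate() helper is hypothetical, not part of the connector):

    // Check whether the legacy and new data source classes resolve, and
    // which jar each one was loaded from.
    def locate(className: String): Unit =
      try {
        val src = Class.forName(className).getProtectionDomain.getCodeSource
        println(s"$className -> ${if (src == null) "unknown source" else src.getLocation}")
      } catch {
        case _: ClassNotFoundException => println(s"$className -> NOT on classpath")
      }

    locate("com.vertica.spark.datasource.DefaultSource")  // legacy connector
    locate("com.vertica.spark.datasource.VerticaSource")  // new connector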

  • Now I am seeing the exception below:
    25/01/08 09:08:39 ERROR server.TransportRequestHandler: Error while invoking RpcHandler#receive() on RPC id 5898923294969719137
    java.io.InvalidClassException: scala.concurrent.duration.FiniteDuration; local class incompatible: stream classdesc serialVersionUID = -983411668277352342, local class serialVersionUID = -4594686286536372853
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1852)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2186)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
    at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:271)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:320)
    at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:270)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:269)
    at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:611)
    at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:662)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:647)
    at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:180)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:103)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:750)
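
    An InvalidClassException on scala.concurrent.duration.FiniteDuration with mismatched serialVersionUIDs is the classic signature of mixed Scala binary versions between driver and executors (for example, a connector built for Scala 2.12 running on a Spark 2.3 / Scala 2.11 cluster). Below is a quick sketch to compare the two sides, assuming a spark-shell where sc is predefined:

    import scala.util.Properties

    // Scala version the driver JVM is running.
    println(s"driver Scala: ${Properties.versionString}")

    // Scala version each executor JVM reports.
    sc.parallelize(0 until sc.defaultParallelism)
      .map(_ => scala.util.Properties.versionString)
      .distinct()
      .collect()
      .foreach(v => println(s"executor Scala: $v"))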
