spark

I am testing the new Spark connector and getting this error:

Exception in thread "main" java.lang.ClassNotFoundException: com.vertica.jdbc.Driver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at com.vertica.spark.seg.VUtil$.getConnection(VUtil.scala:125)
at com.vertica.spark.datasource.DefaultSource$$anonfun$6.apply(VerticaSource.scala:37)
at com.vertica.spark.datasource.DefaultSource$$anonfun$6.apply(VerticaSource.scala:37)
at com.vertica.spark.seg.SegmentsMetaInfo$class.initSegInfo(SegmentsMetaInfo.scala:33)
at com.vertica.spark.datasource.DefaultSource.initSegInfo(VerticaSource.scala:15)
at com.vertica.spark.datasource.DefaultSource.createRelation(VerticaSource.scala:37)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:125)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
at com.novartis.nibr.benchmark.db.VerticaReadTest$.main(VerticaReadTest.scala:46)
at com.novartis.nibr.benchmark.db.VerticaReadTest.main(VerticaReadTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
bash-3.2$

 

I am not sure why I would get that. The code is:

...

import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.{SQLContext}
import com.vertica.spark.datasource._
import com.novartis.nibr.benchmark.util.Properties._
import com.novartis.nibr.benchmark.util.Util.time

object VerticaReadTest {

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("read test")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // read in configuration
    // configuration and jdbc info
    val jdbcProp = readConfig("vertica.properties")

    val host = jdbcProp.getString("vertica.host")
    val verticadb = jdbcProp.getString("vertica.db")
    val port = jdbcProp.getInt("vertica.port")
    val dbSchema = "GEDI" // jdbcProp.getString("vertica.schema")
    val user = jdbcProp.getString("vertica.user")
    val password = jdbcProp.getString("vertica.pwd")
    val tableName = "OBSERVATIONS" // jdbcProp.getString("vertica.tableName")

    // OPTIONAL: setup an ip map from Vertica internal IPs to external IPs if needed
    val ipmap: String = jdbcProp.getString("vertica.ipmap")

    // setup the user options, defaults are shown where applicable for optional values.
    val options: Map[String, String] = Map(
      "table" -> tableName,
      "db" -> verticadb,
      "user" -> user,
      "password" -> password,
      "host" -> host,
      // "numPartitions"-> "16" // OPTIONAL (default val shown)
      // "tmpdir" -> "/tmp" // OPTIONAL (default val shown)
      // "failed_rows_percent_tolerance"-> "0.00" // OPTIONAL (default val shown)
      "dbschema" -> dbSchema, // OPTIONAL (default val public)
      // "port" -> "5433" // OPTIONAL (default val shown)
      "ipmap" -> ipmap // OPTIONAL (default val shown)
    )

    val df = sqlContext.read.format("com.vertica.spark.datasource.DefaultSource").options(options).load()

    time(df.count(), "testing read from vertica")

  }
}

 

Comments

  • SruthiA (Administrator)

    Hi,

     

    It looks like the Vertica JDBC driver is not installed. Please install it and try running the program again.

     

    Sruthi
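
The stack trace in the question shows Class.forName being called from VUtil.getConnection, which suggests the connector loads com.vertica.jdbc.Driver by name at runtime. That would explain why the code compiles without the driver jar but fails when run. Below is a minimal sketch of a pre-flight check using only the standard library; the DriverCheck object and classAvailable helper are hypothetical names, not part of the connector.

```scala
import scala.util.Try

object DriverCheck {
  // Returns true if the named class can be loaded from the current classpath.
  // A missing class raises ClassNotFoundException, which Try captures.
  def classAvailable(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  def main(args: Array[String]): Unit = {
    // Class name taken from the ClassNotFoundException in the stack trace.
    val driver = "com.vertica.jdbc.Driver"
    if (classAvailable(driver))
      println(s"$driver found on the classpath")
    else
      println(s"$driver missing: pass the Vertica JDBC jar to spark-submit with --jars")
  }
}
```

Calling DriverCheck.classAvailable at the top of main, before sqlContext.read, would turn the long stack trace into a one-line message.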

  • Thanks for the quick reply. I did install 

    vertica-spark-connector-0.2.0.jar

     

    and built the code with it.

    It built with no errors. Why would I need the JDBC driver? The code has no dependency on it. It looks like it is needed only at runtime.

     

    I added the actual JDBC jar to spark-submit:

    exec spark-submit \
    --master $MASTER \
    --class $BENCH_CLASS \
    --jars /.../jars/vertica-jdbc-7.2.1-0.jar \
    "$BENCH_JAR" \
    "$@"

     

    and now I get this error:

     

    Exception in thread "main" java.sql.SQLSyntaxErrorException: [Vertica][VJDBC](3737) ERROR: Invalid projection name OBSERVATION_b0
    at com.vertica.util.ServerErrorData.buildException(Unknown Source)
    at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
    at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
    at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
    at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
    at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
    at com.vertica.jdbc.common.SStatement.executeNoParams(Unknown Source)
    at com.vertica.jdbc.common.SStatement.executeQuery(Unknown Source)
    at com.vertica.spark.seg.SegmentsMetaInfo$class.getSegMap(SegmentsMetaInfo.scala:61)
    at com.vertica.spark.datasource.DefaultSource.getSegMap(VerticaSource.scala:15)
    at com.vertica.spark.seg.SegmentsMetaInfo$class.initSegInfo(SegmentsMetaInfo.scala:50)
    at com.vertica.spark.datasource.DefaultSource.initSegInfo(VerticaSource.scala:15)
    at com.vertica.spark.datasource.DefaultSource.createRelation(VerticaSource.scala:37)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:125)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
    at com.novartis.nibr.benchmark.db.VerticaReadTest$.main(VerticaReadTest.scala:46)
    at com.novartis.nibr.benchmark.db.VerticaReadTest.main(VerticaReadTest.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

    Caused by: com.vertica.support.exceptions.SyntaxErrorException: [Vertica][VJDBC](3737) ERROR: Invalid projection name OBSERVATION_b0

    I verified that the projection exists, that the table OBSERVATION is actually there, and that I can query it.

     

     

  • I repeated the test with a table that is in my user connection schema and has no built projections (other than the default). This is the table created by the write example in the connector manual.

     

    Counting that one-record table went fine. I then repeated the test against a large observation table in a schema other than the connection schema, and it gave me the same error as above.

     

    So it seems the problem is associated with schemas.
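
One way to check the schema hypothesis above is to list which schema each projection of the table lives in. The sketch below assumes that the v_catalog.projections system table exposes projection_schema, projection_name and anchor_table_name columns, which is worth confirming against the Vertica documentation for your version. ProjectionCheck and listProjections are hypothetical names, and conn is any open JDBC connection to the database.

```scala
import java.sql.Connection

object ProjectionCheck {
  // Assumed catalog table and column names; verify them for your Vertica version.
  val projectionQuery: String =
    """SELECT projection_schema, projection_name
      |FROM v_catalog.projections
      |WHERE anchor_table_name = ?""".stripMargin

  // Returns (schema, projection name) pairs for every projection anchored on `table`.
  def listProjections(conn: Connection, table: String): List[(String, String)] = {
    val stmt = conn.prepareStatement(projectionQuery)
    try {
      stmt.setString(1, table)
      val rs = stmt.executeQuery()
      val rows = List.newBuilder[(String, String)]
      while (rs.next()) rows += ((rs.getString(1), rs.getString(2)))
      rows.result()
    } finally {
      stmt.close()
    }
  }
}
```

If the projection shows up only under a schema other than the one the connection defaults to, that would be consistent with the behaviour described above, though it does not by itself confirm the cause.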

  • Hi Nabil,

     

    For the "Invalid projection name" error, please refer to my reply in this thread:

     

    https://community.dev.hpe.com/t5/Vertica-Forum/Vertica-Spark-Connector-0-2-0-Invalid-Projection-issue/m-p/235104/highlight/true#M12121

     

    Thanks,

    Harshad
