Loading a Spark DataFrame into Vertica

 

http://www.sparkexpert.com/2015/04/17/save-apache-spark-dataframe-to-database/

 

Hi, I tried loading DataFrames into MySQL using the approach in the link above, and it worked. But when I tried to load them into a Vertica database, I got this error:

 

Exception in thread "main" java.sql.SQLSyntaxErrorException: [Vertica][VJDBC](5108) ERROR: Type "TEXT" does not exist
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.io.ProtocolStream.readExpectedMessage(Unknown Source)
at com.vertica.dataengine.VDataEngine.prepareImpl(Unknown Source)
at com.vertica.dataengine.VDataEngine.prepare(Unknown Source)
at com.vertica.dataengine.VDataEngine.prepare(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.<init>(Unknown Source)
at com.vertica.jdbc.jdbc4.S4PreparedStatement.<init>(Unknown Source)
at com.vertica.jdbc.VerticaJdbc4PreparedStatementImpl.<init>(Unknown Source)
at com.vertica.jdbc.VJDBCObjectFactory.createPreparedStatement(Unknown Source)
at com.vertica.jdbc.common.SConnection.prepareStatement(Unknown Source)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:275)
at org.apache.spark.sql.DataFrame.createJDBCTable(DataFrame.scala:1611)
at com.sparkread.SparkVertica.JdbctoVertica.main(JdbctoVertica.java:51)
Caused by: com.vertica.support.exceptions.SyntaxErrorException: [Vertica][VJDBC](5108) ERROR: Type "TEXT" does not exist
... 13 more

This error occurs because Vertica does not support the TEXT data type, which Spark uses for the string columns of the DataFrame (loaded from a Parquet file) when it creates the target table. I would rather not cast the columns, since that could hurt performance: we need to load around 280 million rows. Could you please suggest the best way to load the data into Vertica?

Comments

  • We are planning a beta release of a Spark to Vertica connector that could handle your scenario. Send me an email at sunil.venkayala@hpe.com and I will notify you when the connector is available for download.

     

    Thanks

    Sunil

  • I am running into the same issue. Are there any updates on this?

  • Hi, you can use the following Spark-Vertica connector for this issue:

     

    https://saas.hpe.com/marketplace/big-data/hpe-vertica-connector-apache-spark

  • Hi Sunil,

     

    I am getting the following exception while saving Spark DataFrame to Vertica database.

    Can you help me out?

     

    Exception in thread "main" java.sql.SQLException: [Vertica][VJDBC](5108) ERROR: Type "TEXT" does not exist
    at com.vertica.util.ServerErrorData.buildException(Unknown Source)
    at com.vertica.dataengine.VQueryExecutor.executeSimpleProtocol(Unknown Source)
    at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
    at com.vertica.jdbc.SStatement.executeNoParams(Unknown Source)
    at com.vertica.jdbc.SStatement.executeUpdate(Unknown Source)
    at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:302)
    at com.hp.spark.ReturnVisitorImportScoreLRFinalOld.main(ReturnVisitorImportScoreLRFinalOld.java:78)

     

    Thanks,
    Raj

     

     

  • Hello,

     

    I am running into the same issue. Is there any news on this one?

     

    Thanks a lot!

    Ira

  • Hello,

     

    When the Vertica table already exists with the same column names as the DataFrame (and matching types), the following has worked for me:

     

    String url = "jdbc:vertica://127.0.0.1:5433/DBNAME?user=myusr&password=mypass";

    myDataFrame.write().mode(SaveMode.Append).jdbc(url, "MY_VERTICA_TABLE", new Properties());

     

    Cheers,

    Ira
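A follow-up note on the last suggestion: since appending works once the table already exists, one way to avoid the TEXT error is to create the Vertica table yourself before writing. The sketch below only builds the CREATE TABLE statement from column names and Spark-style type names (such as `string` and `bigint`). The class name, the method names, the type mapping, and the VARCHAR length of 1024 are all assumptions made for illustration, not part of Spark or the Vertica driver, so adjust them to your schema.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds Vertica DDL so that Spark does not have to
// create the table itself (and therefore never emits the TEXT type).
class VerticaDdlSketch {

    // Assumed width for string columns; pick a real value per column.
    static final int ASSUMED_VARCHAR_LENGTH = 1024;

    // Maps a Spark-style type name to a Vertica column type.
    // This mapping is an illustrative assumption, not an official table.
    static String toVerticaType(String sparkType) {
        switch (sparkType.toLowerCase(Locale.ROOT)) {
            case "string":
                return "VARCHAR(" + ASSUMED_VARCHAR_LENGTH + ")";
            case "tinyint":
            case "smallint":
            case "int":
            case "bigint":
                return "INTEGER";
            case "float":
            case "double":
                return "FLOAT";
            case "boolean":
                return "BOOLEAN";
            case "date":
                return "DATE";
            case "timestamp":
                return "TIMESTAMP";
            default:
                throw new IllegalArgumentException(
                        "No mapping for Spark type: " + sparkType);
        }
    }

    // Builds a CREATE TABLE statement; column order follows the map's
    // iteration order, so pass a LinkedHashMap to keep the DataFrame order.
    static String createTableSql(String table, Map<String, String> columns) {
        StringJoiner ddl =
                new StringJoiner(", ", "CREATE TABLE " + table + " (", ")");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            ddl.add(column.getKey() + " " + toVerticaType(column.getValue()));
        }
        return ddl.toString();
    }

    public static void main(String[] args) {
        // Example columns; in practice these would mirror the DataFrame schema.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "bigint");
        columns.put("name", "string");
        columns.put("score", "double");
        System.out.println(createTableSql("MY_VERTICA_TABLE", columns));
    }
}
```

You would run the generated statement once over a plain JDBC connection to Vertica, then write with `SaveMode.Append` as in the comment above. Unknown types fail loudly rather than falling back to a default, so a column is not silently created with the wrong type.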
