How to read from projections in spark connector? — Vertica Forum

I am able to execute queries on projections from the vsql command line, but I am not sure how to access this data at the application level in a DataFrame.

val spark: SparkSession = SparkSession

val options: Map[String, String] = Map(
  "table" -> "p_f_test",
  "db" -> "Test",
  "user" -> "foo",
  "password" -> "bar",
  "numPartitions" -> "10",
  "host" -> "localhost",
  "hdfs_url" -> "hdfs://localhost:9000/user/dir/",
  "web_hdfs_url" -> "webhdfs://localhost:9870/user/dir/",
  "dbschema" -> "public")

spark.sql("select * from p_f_test")

Output: Specified relation name "public"."p_f_test" does not exist

But on the vsql command line:
select * from f_test;   -- actual table
 id | message | still_here
----+---------+------------
  3 | hello   | t
  4 | goodbye | f

create projection p_f_test (message,still_here) as select message, still_here from f_test segmented by hash(id) all nodes;
select * from p_f_test;   -- projection
 message | still_here
---------+------------
 goodbye | f
 hello   | t

Is there a way to load a projection's data at the application level?

Thanks in advance :smile:

Kind Regards,
Ujali Tyagi


  • edited May 2018

    @ujalityagi, you are almost there.
    Add the following lines and it should work.

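    val df = spark.read.format("com.vertica.spark.datasource.DefaultSource").options(options).load()
    // Register the DataFrame so that spark.sql can resolve the name p_f_test
    df.createOrReplaceTempView("p_f_test")
    val df2 = spark.sql("select * from p_f_test")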

    Also, where did you find the list of options that the Vertica Spark Connector provides?
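
The error in the question suggests that the connector resolves the `table` option against tables rather than projections, so the projection name itself may not be accepted. A minimal sketch of one possible workaround, assuming that is the case: point `table` at the anchor table `f_test`, keep only the projection's columns in Spark, and register the result under the name the query expects. Vertica's optimizer chooses which projection serves a query, so the read may still be answered from `p_f_test`, though this sketch does not verify that. The object and helper names (`ProjectionRead`, `connectorOptions`, `readProjectionColumns`) and the application name are hypothetical, and the option keys are copied from the question rather than checked against the connector documentation.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ProjectionRead {

  // Option keys and values copied from the question; only "table" varies.
  def connectorOptions(table: String): Map[String, String] = Map(
    "table" -> table,
    "db" -> "Test",
    "user" -> "foo",
    "password" -> "bar",
    "numPartitions" -> "10",
    "host" -> "localhost",
    "hdfs_url" -> "hdfs://localhost:9000/user/dir/",
    "web_hdfs_url" -> "webhdfs://localhost:9870/user/dir/",
    "dbschema" -> "public"
  )

  // Hypothetical helper: load the anchor table through the connector and
  // keep only the columns that the projection contains.
  // `columns` must be non-empty.
  def readProjectionColumns(
      spark: SparkSession,
      anchorTable: String,
      columns: Seq[String]): DataFrame =
    spark.read
      .format("com.vertica.spark.datasource.DefaultSource")
      .options(connectorOptions(anchorTable))
      .load()
      .select(columns.head, columns.tail: _*)

  def main(args: Array[String]): Unit = {
    // Assumes the master URL is supplied externally, e.g. by spark-submit.
    val spark = SparkSession.builder().appName("projection-read").getOrCreate()

    val df = readProjectionColumns(spark, "f_test", Seq("message", "still_here"))

    // Expose the DataFrame to Spark SQL under the projection's name.
    df.createOrReplaceTempView("p_f_test")
    spark.sql("select * from p_f_test").show()

    spark.stop()
  }
}
```

The temp view lives only in the Spark session, so the name `p_f_test` here is a Spark-side alias and is never sent to Vertica as a relation name.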
