
Invoke vsql from a Java program

I am using the command below to extract data from a Vertica database:
vsql -h XX -U XX -W -c "select * from emp" -o "c:/test1.dat" -t -F "|" -At
How do I invoke this from Java? The documentation I have looked at refers only to COPY and statement processing. If we could invoke vsql, it would be fairly easy to get the data out of the Vertica database. Alternatively, can someone point me to a vsql .jar file that I could use to invoke it? Any thoughts?

Comments

  • Options
    Thanks Daniel. I had written something very similar to this, but it only works well with 1 million records or fewer; beyond that, performance is very bad,

    so I am trying to find a way to invoke vsql.

    I see around 50% better performance with vsql. I have also tried a bulk extract that loads to a file

    and splits the file after every 2 GB,

    but nothing solves the problem.

    Hence I am looking into invoking vsql, and I would like to know what code vsql uses to extract so much faster than Java. Is it Python?

    What is special about vsql that lets it extract much faster than reading and writing the data through Java?

    I would suggest including a class in the JDBC driver to invoke vsql, which would solve a lot of problems.

  • Options
    Hi!

    >> it works well with 1 million records or fewer; beyond that, performance is very bad
    In my experience, everyone who works with Vertica + Java loses about 30% in performance. If it works with 1M rows, do it in parts, with ranges that do not overlap:
        select ...bla...bla...bla... where row_id <= 1000000
        select ...bla...bla...bla... where row_id > 1000000 and row_id <= 2000000
    and so on.

    Benefits:
    • no performance degradation
    • you can run the parts in parallel, writing to different files, and then concatenate them into a single file.
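A minimal sketch of the chunking idea above, in Java. The class and method names are hypothetical, and it assumes `row_id` is a positive, roughly dense numeric key starting at 1 (the thread does not say what the real key looks like). It only generates the per-chunk statements; running them in parallel and concatenating the files is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds non-overlapping range queries over a numeric key column (hypothetical helper). */
class ChunkedQueries {

    /**
     * Returns one SELECT per chunk, together covering keys in (0, maxRowId].
     * Each chunk uses "key > lower AND key <= upper", so no row is read twice.
     */
    static List<String> build(String baseSelect, String keyColumn, long maxRowId, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> queries = new ArrayList<>();
        for (long lower = 0; lower < maxRowId; lower += chunkSize) {
            // The last chunk is clipped so it never runs past maxRowId.
            long upper = Math.min(lower + chunkSize, maxRowId);
            queries.add(baseSelect + " WHERE " + keyColumn + " > " + lower
                    + " AND " + keyColumn + " <= " + upper);
        }
        return queries;
    }

    public static void main(String[] args) {
        // Assumed values: 2.5M rows split into chunks of 1M.
        for (String q : build("SELECT * FROM emp", "row_id", 2_500_000L, 1_000_000L)) {
            System.out.println(q);
        }
    }
}
```

With the assumed values in `main`, this produces three statements, the last one covering only the remaining 500,000 keys. If the key has large gaps, the chunks will be uneven in size.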
    >> I notice there is around 50% better performance with vsql and also i try to looking for bulk extract and load to file
    Of course: vsql returns strings, while JDBC returns objects (a date comes back as a date and an integer as an integer, not as a string, so you can perform calculations on them). vsql and Java are different things: Java is a programming language and vsql is a database client. How can you compare them?


    >> hence i am trying to see the vsql invoke and what was code used for vsql to extract much faster than java, is it python?
    I did some tests, and they show that Python performs better than Java.

    >> I would suggest, if this class included in JDBC  driver to invoke vsql, which solves lots of problems. 
    Forget about it; it increases dependencies. With Java you would then also require vsql (no way, we need minimal dependencies). If you want to do it with vsql, do it directly with vsql; don't call it from Java.

    BTW:
    • Take a look at External Procedures. You can install one and then invoke it from Vertica (via JDBC). So write an EP that extracts the data without Java, and call the EP from JDBC.
    • Review the Vertica Marketplace >> ETL and Data Ingest; there are parallel UDFs for exporting data.


    Tuning Java Virtual Machines (JVMs)
    http://docs.oracle.com/cd/E15523_01/web.1111/e13814/jvm_tuning.htm


    PS
    I'm pretty sure that you will get the same performance degradation with vsql if you call it from Java, just because it's Java. For example, memory is defined per JVM, and all resources are controlled by the JVM: heap size, stack size, etc.

    If you are seeing degradation on extract, the JVM is probably not configured well, and that is the problem. You have to investigate it; otherwise you will always get performance degradation with Java.

  • Options
    Hi,

    As Daniel mentioned, the C++ implementation (vsql) will probably be faster; we see this with other databases such as Oracle too. However, I would try to find the root cause of your extract degradation after 1M records. It may be related to Java GC, or to the time you spend building the records (padding with the delimiter). Did you test performance without adding the delimiter character to your records? You can also try to tune your statement's setFetchSize attribute, but in general you should not see such degradation.
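To make the setFetchSize suggestion concrete, here is a rough sketch of a streaming JDBC export that writes delimited rows through a buffered writer. The class and method names are hypothetical, the fetch size is only a hint to the driver, and reading every column with `getString` is an assumption made to mimic vsql's string output; whether any of this helps in your case needs to be measured.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Streams a query result to a delimited text file (hypothetical helper). */
class StreamingExport {

    /** Joins one row's values with the delimiter; NULLs become empty fields. */
    static String formatRow(Object[] values, char delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            if (values[i] != null) {
                sb.append(values[i]);
            }
        }
        return sb.toString();
    }

    /** Writes every row of the query to the file and returns the number of rows written. */
    static long export(Connection conn, String sql, Path out, char delimiter, int fetchSize)
            throws SQLException, IOException {
        long rows = 0;
        try (Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(fetchSize); // hint: rows fetched per round trip
            try (ResultSet rs = stmt.executeQuery(sql);
                 BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
                int cols = rs.getMetaData().getColumnCount();
                Object[] row = new Object[cols]; // reused for every row to limit garbage
                while (rs.next()) {
                    for (int i = 0; i < cols; i++) {
                        row[i] = rs.getString(i + 1); // JDBC columns are 1-based
                    }
                    writer.write(formatRow(row, delimiter));
                    writer.newLine();
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

Because rows go straight to the writer and nothing is accumulated in memory, heap use should stay roughly flat as the result grows, which makes it easier to tell whether GC is really the cause of the slowdown.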


    Check your code with a Java profiler before jumping to using vsql from your Java code (it's ugly :) ).
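For completeness, if you do decide to call vsql from Java anyway, a plain `ProcessBuilder` is enough; no special .jar is needed. This is only a sketch: the class name is hypothetical, it assumes `vsql` is on the PATH, and it leaves out the `-W` password prompt from the original command because a prompt is awkward to answer from a child process, so how the password is supplied is left to your setup.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Launches the vsql client as a child process (hypothetical helper). */
class VsqlRunner {

    /** Builds the argument list; each item is passed as-is, so no shell quoting is needed. */
    static List<String> command(String host, String user, String query, String outFile) {
        return Arrays.asList(
                "vsql", "-h", host, "-U", user,
                "-c", query,       // statement to run
                "-o", outFile,     // write the result to this file
                "-F", "|",         // field separator
                "-A", "-t");       // unaligned output, tuples only
    }

    /** Runs the command and returns its exit code; vsql's messages go to this process's console. */
    static int run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.inheritIO();
        return pb.start().waitFor();
    }
}
```

A call such as `VsqlRunner.run(VsqlRunner.command("XX", "XX", "select * from emp", "c:/test1.dat"))` mirrors the command from the question; a non-zero return value should be treated as a failed extract.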

    Thanks 

