Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

using EXCEPT to compare 2 tables on 2 different vertica db's.

Hi Guys!
I have a tricky question.I need to compare 2 tables (1 Billion rows tables) .Should be very easy when they are on the same db- just use the except sql command but
my situation is that they are not on the same db.they are both Vertica software and lets say both db's are imported on MC .Is there some way to do that?
both db's have 1 node.

many thanks!
Keren

Answers

  • Some more info I forgot to mention which makes the issue more difficult I guess:
    -- those 2 db's are cloned from the past so they have the same db name
    -- both are single node with localhost
    I thought of doing CONNECT TO VERTICA command to connect from one to the other but I think it is not possible I didn't succeed
    because of this architecture.
    best regards,
    Keren

  • Jim_KnicelyJim_Knicely Administrator
    edited April 30

    Are you expeting a lot of differences? Maybe a Linux diff could help?

    Example:

    [[email protected] ~]$ vsql -h host1 -Atc "SELECT * FROM some_table1;"
    1
    2
    
    [[email protected] ~]$ vsql -h host12 -Atc "SELECT * FROM some_table2;"
    1
    3
    
    [[email protected] ~]$ diff --color <(vsql -h host1 -Atc "SELECT * FROM some_table1;") <(vsql -h host2 -Atc "SELECT * FROM some_table2;")
    2c2
    < 2
    ---
    
    
      3
    
    
    
  • first thanks for answering.
    Second - I know the exact difference of rows since when I count the rows on the tables I have exactly 192,448 rows more on one db . Can I still use the above command? BTW they are not on the same boxes.each db on a different ubuntu box.
    I need to put those 192,448 rows in a file and use the COPY command to insert it to the missing rows db.
    will that work?
    Keren

  • doing select * is a lot since the table has more than 1B rows...

  • Jim_KnicelyJim_Knicely Administrator
    edited April 30

    Can you reach one host from the other? Why not just load one of the tables temporaily into the other DB and do a compare with EXCEPT or MINUS?

    Something like:

    vsql -h remote_host -c "CREATE TABLE some_table1_temp LIKE some_table1;"
    vsql -h local_host -Atc "SELECT * FROM some_table1;" | vsql -h remote_host -c "COPY some_table1_temp FROM STDIN;"
    

    Unfortunately that COPY might take a while...

    Another option might to export the data as Parquet and either read it as an external table on the remote server or load it on the remote server.

  • SergeBSergeB Employee
    edited May 1

    If loading both tables in the same DB is an option, you could use COPY FROM VERTICA to import the table (with 1B rows it should be faster than COPY from STDIN ). Then use EXCEPT or MINUS as suggested by Jim.
    https://www.vertica.com/docs/10.1.x/HTML/Content/Authoring/SQLReferenceManual/Statements/COPYFROMVERTICA.htm

    => CONNECT TO VERTICA vmart USER dbadmin PASSWORD 'myPassword' ON 'VertTest01',5433;
    CONNECT
    => COPY customer_dimension2 FROM VERTICA vmart.customer_dimension;

    Rows Loaded

      500000
    

    (1 row)
    => DISCONNECT vmart;
    DISCONNECT

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.