We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


COPY data into 2 Tables — Vertica Forum

COPY data into 2 Tables

verbanverban Vertica Customer

Hi,

I am loading 1 CSV file having the below format to 2 different tables.
(This is because of different data retention policy)

CSV File Format: (date, ip, port, data)
2017-01-02 20:11:43 10.11.12.13 1234 3bc548852b806d39c88c7b4a6b
2017-01-02 20:11:44 10.11.11.15 1234 93997f86ddb76fbd423e812dca0ea1
2017-01-02 20:11:45 10.11.12.17 1234 35b73412c2ae80c65723bdd67248558c34b80

Tables structures:

Table_A:
id,date,ip,port,hash(data)

Table_B
id2,date,data

The question is how to copy CSV file into Table_A without store 'data' field, i need only hash(data).

Scenario:
'data' field has up to 65K length, so it use more than 90% of disk space. I want to store it in different tables to keep 'data' for few days and delete expired data by daily partition (Table_B).
so when i need 'data' i'll join Table_A and Table_B on 'date' and hash(data)

Comments

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Hi,

    Check out the FILLER option of the COPY command.

    See:
    https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/IgnoringColumnsAndFieldsInTheLoadFile.htm

    Example:

    In this example I added an ID to the CSV file...

    dbadmin=> CREATE TABLE table_a(id INT, d date, ip VARCHAR(100), port INT, data_hash INT);
    CREATE TABLE
    
    dbadmin=> CREATE TABLE table_b(id INT, d date, data VARCHAR(1000));
    CREATE TABLE
    
    dbadmin=> \! cat /home/dbadmin/file.csv
    1|2017-01-02 20:11:43|10.11.12.13|1234|3bc548852b806d39c88c7b4a6b
    2|2017-01-02 20:11:44|10.11.11.15|1234|93997f86ddb76fbd423e812dca0ea1
    3|2017-01-02 20:11:45|10.11.12.17|1234|35b73412c2ae80c65723bdd67248558c34b80
    
    dbadmin=> COPY table_a (id, d, ip, port, data_hash_f FILLER VARCHAR, data_hash AS HASH(data_hash_f)) FROM '/home/dbadmin/file.csv' DIRECT;
     Rows Loaded
    -------------
               3
    (1 row)
                                                                 ^
    dbadmin=> COPY table_b (id, d, ip_f FILLER VARCHAR(100), port_f FILLER INT, data) FROM '/home/dbadmin/file.csv' DIRECT;
     Rows Loaded
    -------------
               3
    (1 row)
    
    dbadmin=> SELECT * FROM table_a;
     id |     d      |     ip      | port |      data_hash
    ----+------------+-------------+------+---------------------
      1 | 2017-01-02 | 10.11.12.13 | 1234 | 1478321945145517140
      2 | 2017-01-02 | 10.11.11.15 | 1234 | 3787372996188421577
      3 | 2017-01-02 | 10.11.12.17 | 1234 | 1171497616496332802
    (3 rows)
    
    dbadmin=> SELECT * FROM table_b;
     id |     d      |                 data
    ----+------------+---------------------------------------
      1 | 2017-01-02 | 3bc548852b806d39c88c7b4a6b
      2 | 2017-01-02 | 93997f86ddb76fbd423e812dca0ea1
      3 | 2017-01-02 | 35b73412c2ae80c65723bdd67248558c34b80
    (3 rows)
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file