COPY data into 2 Tables

verbanverban Vertica Customer

Hi,

I am loading 1 CSV file having the below format to 2 different tables.
(This is because of different data retention policy)

CSV File Format: (date, ip, port, data)
2017-01-02 20:11:43 10.11.12.13 1234 3bc548852b806d39c88c7b4a6b
2017-01-02 20:11:44 10.11.11.15 1234 93997f86ddb76fbd423e812dca0ea1
2017-01-02 20:11:45 10.11.12.17 1234 35b73412c2ae80c65723bdd67248558c34b80

Tables structures:

Table_A:
id,date,ip,port,hash(data)

Table_B
id2,date,data

The question is how to copy CSV file into Table_A without store 'data' field, i need only hash(data).

Scenario:
'data' field has up to 65K length, so it use more than 90% of disk space. I want to store it in different tables to keep 'data' for few days and delete expired data by daily partition (Table_B).
so when i need 'data' i'll join Table_A and Table_B on 'date' and hash(data)

Comments

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Hi,

    Check out the FILLER option of the COPY command.

    See:
    https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/BulkLoadCOPY/IgnoringColumnsAndFieldsInTheLoadFile.htm

    Example:

    In this example I added an ID to the CSV file...

    dbadmin=> CREATE TABLE table_a(id INT, d date, ip VARCHAR(100), port INT, data_hash INT);
    CREATE TABLE
    
    dbadmin=> CREATE TABLE table_b(id INT, d date, data VARCHAR(1000));
    CREATE TABLE
    
    dbadmin=> \! cat /home/dbadmin/file.csv
    1|2017-01-02 20:11:43|10.11.12.13|1234|3bc548852b806d39c88c7b4a6b
    2|2017-01-02 20:11:44|10.11.11.15|1234|93997f86ddb76fbd423e812dca0ea1
    3|2017-01-02 20:11:45|10.11.12.17|1234|35b73412c2ae80c65723bdd67248558c34b80
    
    dbadmin=> COPY table_a (id, d, ip, port, data_hash_f FILLER VARCHAR, data_hash AS HASH(data_hash_f)) FROM '/home/dbadmin/file.csv' DIRECT;
     Rows Loaded
    -------------
               3
    (1 row)
                                                                 ^
    dbadmin=> COPY table_b (id, d, ip_f FILLER VARCHAR(100), port_f FILLER INT, data) FROM '/home/dbadmin/file.csv' DIRECT;
     Rows Loaded
    -------------
               3
    (1 row)
    
    dbadmin=> SELECT * FROM table_a;
     id |     d      |     ip      | port |      data_hash
    ----+------------+-------------+------+---------------------
      1 | 2017-01-02 | 10.11.12.13 | 1234 | 1478321945145517140
      2 | 2017-01-02 | 10.11.11.15 | 1234 | 3787372996188421577
      3 | 2017-01-02 | 10.11.12.17 | 1234 | 1171497616496332802
    (3 rows)
    
    dbadmin=> SELECT * FROM table_b;
     id |     d      |                 data
    ----+------------+---------------------------------------
      1 | 2017-01-02 | 3bc548852b806d39c88c7b4a6b
      2 | 2017-01-02 | 93997f86ddb76fbd423e812dca0ea1
      3 | 2017-01-02 | 35b73412c2ae80c65723bdd67248558c34b80
    (3 rows)
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file