We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


EXPORT TO PARQUET - Specify the Node Data Will be Written To — Vertica Forum

EXPORT TO PARQUET - Specify the Node Data Will be Written To

I have 3 node Vertica cluster. I would like to export data to local folder on specific node from the three node cluster.
How can I do it without creating a mapping on the other nodes?

Currently my command is:

EXPORT TO PARQUET (directory = '/data/export_vertica_tables/a/') AS SELECT b from imp.a;

Tagged:

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited October 2021

    Try using an empty OVER() clause.

    EXPORT TO PARQUET (directory = '/data/export_vertica_tables/a/') OVER() AS SELECT b from imp.a;

    See: Partitioning and Sorting Data

    It says:

    If you are exporting data to a local file system, you might want to force a single node to write all of the files. To do so, use an empty OVER clause.

  • Thanks for the reply. To better understand, I would appreciate getting input on the following:

    1) When I used the empty OVER() I saw that the directory was also created on the other nodes but was empty. Why this empty directory is created?

    2) I ran the command on node1, but the data was exported to node2. Can I control to which node the data will be written?

    3) Is OVER() also the default if I omit the OVER all together.

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Hi,

    Which version of Vertica are you running? Note that Vertica 11.0 is the first version to officially support exporting to the Linux File System.

    See: Export to Linux File System

    It says:

    When exporting data, you can now write to the node-local Linux file system. Previously, exports to Linux were only supported for an NFS mount.

    Anyway, there doesn't appear to be a way to just write to one node...

    Below are some quick tests using the DB view as described here Monitoring Exports to show the rows written on each node...

    dbadmin=> SELECT sysdate;
              sysdate
    ----------------------------
     2021-10-25 11:15:54.592678
    (1 row)
    
    dbadmin=> EXPORT TO PARQUET(directory='/home/dbadmin/big_empty_over100') OVER () AS SELECT * FROM big;
     Rows Exported
    ---------------
             65536
    (1 row)
    
    dbadmin=> SELECT node_name, SUM(rows) rows FROM parquet_export_events WHERE report_time > '10/25/2021 11:15:54' GROUP BY node_name ORDER BY node_name;
         node_name      | rows
    --------------------+-------
     v_test_db_node0001 | 21722
     v_test_db_node0002 | 21929
     v_test_db_node0003 | 21885
    (3 rows)
    
    dbadmin=> SELECT sysdate;
              sysdate
    ----------------------------
     2021-10-25 11:17:03.680264
    (1 row)
    
    dbadmin=> EXPORT TO PARQUET(directory='/home/dbadmin/big_no_over100') AS SELECT * FROM big;
     Rows Exported
    ---------------
             65536
    (1 row)
    
    dbadmin=> SELECT node_name, SUM(rows) rows FROM parquet_export_events WHERE report_time > '10/25/2021 11:17:03' GROUP BY node_name ORDER BY node_name;
         node_name      | rows
    --------------------+-------
     v_test_db_node0001 | 21722
     v_test_db_node0002 | 21929
     v_test_db_node0003 | 21885
    (3 rows)
    
    dbadmin=> SELECT sysdate;
              sysdate
    ----------------------------
     2021-10-25 11:17:50.409652
    (1 row)
    
    dbadmin=> EXPORT TO PARQUET(directory='/home/dbadmin/big_over100') OVER(PARTITION BY c1) AS SELECT * FROM big;
     Rows Exported
    ---------------
             65536
    (1 row)
    
    dbadmin=> SELECT node_name, SUM(rows) rows FROM parquet_export_events WHERE report_time > '10/25/2021 11:17:50' GROUP BY node_name ORDER BY node_name;
         node_name      | rows
    --------------------+-------
     v_test_db_node0001 | 19865
     v_test_db_node0002 | 15592
     v_test_db_node0003 | 18346
    (3 rows)
    
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file