We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Confirming if a Data Load was Apportioned — Vertica Forum

Confirming if a Data Load was Apportioned

Jim_KnicelyJim_Knicely - Select Field - Administrator

When running a COPY command, a parser can use more than one database node to load a single input source in parallel. This approach is referred to as apportioned load and is often much faster than loading from a single node.

One way of confirming that a data load took advantage of apportioning is to query the LOAD_SOURCES system table after running the COPY command. If the load was apportioned, you’ll get a list of nodes that participated!

Example:

[dbadmin@SE-Sandbox-43-node1 ~]$ vsql -ic "SELECT node_name, node_address FROM nodes;"
Timing is off.
    node_name    |  node_address
-----------------+----------------

v_mydb_node0001 | 192.168.61.227
v_mydb_node0002 | 192.168.61.228
v_mydb_node0003 | 192.168.61.229
(3 rows)

[dbadmin@SE-Sandbox-43-node1 ~]$ ssh 192.168.61.227 "ls -lrt /home/dbadmin/data2.txt"
-rw-r--r-- 1 dbadmin verticadba 184500036 Dec 11 22:01 /home/dbadmin/data2.txt

[dbadmin@SE-Sandbox-43-node1 ~]$ ssh 192.168.61.228 "ls -lrt /home/dbadmin/data2.txt"
-rw-r--r-- 1 dbadmin verticadba 184500036 Dec 11 22:03 /home/dbadmin/data2.txt

[dbadmin@SE-Sandbox-43-node1 ~]$ ssh 192.168.61.229 "ls -lrt /home/dbadmin/data2.txt"
-rw-r--r-- 1 dbadmin verticadba 184500036 Dec 11 22:03 /home/dbadmin/data2.txt

[dbadmin@SE-Sandbox-43-node1 ~]$ vsql -ic "COPY big_fact2 FROM '/home/dbadmin/data2.txt' ON ANY NODE; -- This load will be apportioned"
Timing is off.
Rows Loaded
-------------
     1500000
(1 row)

[dbadmin@SE-Sandbox-43-node1 ~]$ vsql -ic "SELECT node_name, source_name, input_size FROM load_sources WHERE table_name = 'big_fact2';"
Timing is off.
    node_name    |       source_name       | input_size
-----------------+-------------------------+------------
v_mydb_node0001 | /home/dbadmin/data2.txt |   30750006
v_mydb_node0001 | /home/dbadmin/data2.txt |   30750006
v_mydb_node0002 | /home/dbadmin/data2.txt |   30750006
v_mydb_node0002 | /home/dbadmin/data2.txt |   30750006
v_mydb_node0003 | /home/dbadmin/data2.txt |   30750006
v_mydb_node0003 | /home/dbadmin/data2.txt |   30750006
(6 rows)

Note that the original file size on each node is 184,500,036 bytes. Vertica split the load up into 6 pieces each 30,750,006 bytes in size.

[dbadmin@SE-Sandbox-43-node1 ~]$ vsql -ic "SELECT 184500036 / 30750006 apportioned_pieces;"
Timing is off.
  apportioned_pieces
----------------------
6.000000000000000000
(1 row)

Helpful Links:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/ExtendingVertica/UDx/UDL/ApportionedLoad.htm
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/SystemTables/MONITOR/LOAD_SOURCES.htm

Have fun!

Sign In or Register to comment.