How to find all statements executed by any user

QUESTION:
I would like to view all SQL statements executed by any database user.  

SOLUTION:
You can get user query type information from one or a combination of sources, from the most to least detailed: 

Log files: You can find information about statements executed by users in the vertica.log files. Be aware that the log files have limitations. 
  • The BQuery and Executing Statement log entries for JDBC were eliminated when the drivers were rewritten to comply with industry standards, and the remaining log entries do not show the parameterize values, instead showing ? marks.  
  • Any node could be the initiator so you'd have to gather logs from all nodes to find all the operations and their corresponding commit or rollbacks. 
  • If the copy source is STDIN or a file, the values copied aren't visible anywhere. 
$ grep "dctest" vertica.log 
2013-02-25 10:46:49.174 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Starting Commit: Txn: a000000001b6ff 'create table dctest (c1 int, c2 varchar(20));' 
2013-02-25 10:48:34.089 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Starting Commit: Txn: a000000001b708 'alter table dctest add column c3 timestamp;' 
2013-02-25 10:48:34.123 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Starting Commit: Txn: a000000001b709 'alter table dctest add column c3 timestamp;' 
2013-02-25 10:50:17.150 DistCall Dispatch:0x2aaab400e4d0 [Txn] <INFO> Starting Commit: Txn: a000000001b710 'CREATE PROJECTION public.dctest AS SELECT * FROM public.dctest ORDER BY c1,c2,c3 UNSEGMENTED NODE v_fico_node0001 KSAFE;' 
2013-02-25 10:50:17.181 DistCall Dispatch:0x2aaab400e4d0 [Txn] <INFO> Starting Commit: Txn: a000000001b711 'CREATE PROJECTION public.dctest AS SELECT * FROM public.dctest ORDER BY c1,c2,c3 UNSEGMENTED NODE v_fico_node0001 KSAFE;' 
2013-02-25 10:50:19.802 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Starting Commit: Txn: a000000001b70f 'insert /* +direct */ into dctest values('1','one','2013-02-01 10:00:00');' 
2013-02-25 10:51:10.821 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Starting Commit: Txn: a000000001b713 'copy dctest from stdin;' 
2013-02-25 10:55:56.048 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Starting Commit: Txn: a000000001b718 'select * from dc_requests_issued where table_name='dctest' order by time;' 
2013-02-25 10:55:56.076 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Starting Commit: Txn: a000000001b724 'drop table dctest cascade;' 
2013-02-25 11:02:44.708 DistCall Dispatch:0x2aaab401cf00 [Txn] <INFO> Rollback Txn: a000000001b725 'select * from dc_requests_issued where request ilike '%dctest%' order by time;' 
2013-02-25 11:19:11.693 DistCall Dispatch:0x2aaab401cf40 [Txn] <INFO> Rollback Txn: a000000001b747 'select time,user_name,transaction_id,request_type,request from dc_requests_issued where request ilike '%dctest%' order by time;' 

Log files are useful if, for example, you want to find out why someone added table rows but those rows did not appear in the table later. In this scenario, the user might have inserted rows without a commit. Alternatively, a resource rejection might have occurred and the user was unaware. The vertica.log lets you look for all operations related to a specific table, not limited to the user. 

Data Collector: The DC table dc_requests_issued returns information (such as user id and request type) through the following columns:

=> \x
Expanded display is on.
time              | 2013-06-17 12:26:46.134382-04
node_name         | v_merge1_node0001
session_id        | doca01.verticacorp.-31158:0xc
user_id           | 45035996273704962
user_name         | dbadmin
transaction_id    | 0
statement_id      | -1
request_id        | 1
request_type      | UTILITY
label             |
search_path       | "$user", public, v_catalog, v_monitor, v_internal
query_start_epoch | 1
request           | select install_license('/opt/vertica/config/share/license.key');
is_retry          | f

The following query asks for details on create, alter, insert, copy, and drop operations on specific tables. You can filter further on the request_type column to view whether the type was QUERY, DDL, LOAD, UTILITY, TRANSACTION, PREPARE, EXECUTE, SET, or SHOW. 

=> SELECT time,user_name,transaction_id,request_type,request FROM dc_requests_issued WHERE request ILIKE '%dctest%' ORDER BY time; 

time | user_name | transaction_id | request_type | request 
-------------------------------+-----------+-------------------+--------------+--------------------------------------------------------------------------------------------------------------------------------- 
2013-02-25 10:46:49.157858-05 | dbadmin | 45035996273817343 | DDL | create table dctest (c1 int, c2 varchar(20)); 
2013-02-25 10:48:34.076438-05 | dbadmin | 45035996273817352 | DDL | alter table dctest add column c3 timestamp; 
2013-02-25 10:50:17.14508-05 | dbadmin | 45035996273817360 | DDL | CREATE PROJECTION public.dctest AS SELECT * FROM public.dctest ORDER BY c1,c2,c3 UNSEGMENTED NODE v_fico_node0001 KSAFE; 
2013-02-25 10:50:17.197863-05 | dbadmin | 45035996273817359 | QUERY | insert /* +direct */ into dctest values('1','one','2013-02-01 10:00:00'); 
2013-02-25 10:50:37.613245-05 | dbadmin | 45035996273817363 | LOAD | copy dctest from stdin; 
2013-02-25 10:53:33.16791-05 | dbadmin | 45035996273817368 | QUERY | select * from dc_requests_issued where request ilike '%dctest%' order by time; 
2013-02-25 10:55:56.05466-05 | dbadmin | 45035996273817380 | DDL | drop table dctest cascade; 
2013-02-25 10:57:05.423657-05 | dbadmin | 45035996273817381 | QUERY | select time,user_name,transaction_id,request_type,request from dc_requests_issued where request ilike '%dctest%' order by time; 
(9 rows) 

The dc_requests_issued table has some limitations:
  • Data Collector tables retain information for a limited amount of time. When data reaches its storage limit, Vertica purges old data as new data comes in. You can specify how much memory and/or disk space to allocate for retained data by configuring an adjustable retention policy, but ultimately, system activity dictates the frequency of updates and purges, so you might have to query this table frequently. Typically, DC tables have one or more raw log files, per table, under the catalog directory DataCollector. You could periodically check for a new file and copy it somewhere on your system for later analysis. This table should have RequestsIssued_<oidnumber>.log; for example: 
    [[email protected] DataCollector]$ ls 
    RequestsIssued*.log 
    RequestsIssued_408689940118692.log 
  • The dc_requests_issued table returns requests issued, but it doesn't mean those requests were successful. For the full picture, also query the dc_requests_completed table, which retains data about successful requests. Note that dc_requests_completed reports only the time, session id, and user id/name, so you'll want to consider writing a subquery or join on the two DC tables in vsql. Alternatively, you could write a parser if you are using the log files. 
  • The COPY command does not keep track of values that have been copied. 

Management Console: At a higher level of detail, you can monitor system activity through Management Console (MC). On the bottom right of the Database Overview page, a "User Query Type Distribution" chart provides an overview of user and system query activity and reports the type of operation that ran. Double click any point in the graph to open the Activity page, where you can toggle the chart between query concurrency, internal sessions (e.g., Tuple Mover and rebalance cluster operations), and user sessions for all user connections open to MC.

For more information, see the following topics/sections in the 6.1 HP Vertica documentation:

Comments

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file