Analyze Statistics at the Schema Level (Part 2): Quick Tip

[Deleted User][Deleted User] Administrator
edited June 2018 in Tips from the Team

Jim Knicely authored this tip.

The ANALYZE_STATISTICS function only accepts a table/projection/column name as input. In yesterday’s Vertica Quick Tip, you learned how to get Vertica to generate and execute ANALYZE_STATISTICS SQL statements, one for each table in a given schema.

It was an okay solution, but not very convenient. A better option is to create your own function (external procedure) that does all of the work for you.

**Example: **

Here is the code for the procedure (a simple BASH script):

dbadmin=> \! cat /home/dbadmin/sfdc/procedures/analyze_schema_statistics.sh
#!/bin/bash

# Simple external procedure to run analyze_statistics function at the schema level
# This procedure accepts one parameter (a schema name)

source ~/.bashrc

/opt/vertica/bin/vsql -Atc "SELECT 'SELECT analyze_statistics(''' || table_schema || '.' || table_name || ''');' FROM v_catalog.tables WHERE table_schema = '$1';" | /opt/vertica/bin/vsql -e > /home/dbadmin/analyze_schema_statistics.log 2>&1

exit 0

First, create the procedure in Vertica:

dbadmin=> CREATE PROCEDURE analyze_schema_statistics (schema_name VARCHAR) AS 'analyze_schema_statistics.sh' LANGUAGE 'external' USER 'dbadmin';
CREATE PROCEDURE

Next, find a schema to run it on:

dbadmin=> \dt test_stats.*
                List of tables
   Schema   | Name | Kind  |  Owner  | Comment
------------+------+-------+---------+---------
 test_stats | t1   | table | dbadmin |
 test_stats | t2   | table | dbadmin |
 test_stats | t3   | table | dbadmin |
(3 rows)

Note that there are three tables in the TEST_STATS schema and all their projections have no statistics:

dbadmin=> SELECT projection_name, anchor_table_name, has_statistics FROM projections WHERE projection_schema = 'test_stats';
 projection_name | anchor_table_name | has_statistics
-----------------+-------------------+----------------
 t1_super        | t1                | f
 t2_super        | t2                | f
 t3_super        | t3                | f
(3 rows)

Next, run the new procedure:

dbadmin=> SELECT analyze_schema_statistics('test_stats');
analyze_schema_statistics
---------------------------
                         0
(1 row)

Now all of the projections have statistics!

dbadmin=> SELECT projection_name, anchor_table_name, has_statistics FROM projections WHERE projection_schema = 'test_stats';
 projection_name | anchor_table_name | has_statistics
-----------------+-------------------+----------------
 t1_super        | t1                | t
 t2_super        | t2                | t
 t3_super        | t3                | t
 (3 rows)

Oh, the procedure also wrote out to a log file. Here are the contents of that log file:

dbadmin=> \! cat /home/dbadmin/analyze_schema_statistics.log
SELECT analyze_statistics('test_stats.t1');
analyze_statistics
--------------------
                  0
(1 row)

SELECT analyze_statistics('test_stats.t2');
analyze_statistics
--------------------
                  0
(1 row)

SELECT analyze_statistics('test_stats.t3');
analyze_statistics
--------------------
                  0
(1 row)

Have Fun!

Sign In or Register to comment.