The Vertica Forum recently got a makeover! Let us know what you think by filling out this short, anonymous survey.
Please take this survey to help us learn more about how you use third party tools. Your input is greatly appreciated!

Total Table Row Count Report by Schema

Jim_KnicelyJim_Knicely - Select Field - Administrator
edited June 2019 in Tips from the Team

I had a colleague ask today how he could generate a report showing the total table row counts by schema. This should be easy, right? After all the PROJECTION_STORAGE system table has a ROW_COUNT column. But it turns out it’s a little complex considering there are segmented and un-segmented projections in Vertica.

We’ll need to join the TABLES, PROJECTIONS and PROJECTION_STORAGE system tables together, computing the row counts a little differently based on the projection type.


dbadmin=> SELECT table_schema,
dbadmin->        SUM(row_count) AS row_count
dbadmin->   FROM (SELECT table_schema,
dbadmin(>                NVL(CASE
dbadmin(>                      WHEN NOT is_segmented THEN (row_count / node_cnt)::INT
dbadmin(>                      ELSE row_count
dbadmin(>                    END, 0) row_count
dbadmin(>           FROM (SELECT t.table_schema,
dbadmin(>                        t.table_name,
dbadmin(>                        ps.projection_name,
dbadmin(>                        (SELECT MIN(is_segmented)
dbadmin(>                           FROM projections p
dbadmin(>                          WHERE p.projection_name = ps.projection_name
dbadmin(>                            AND p.projection_schema = ps.anchor_table_schema) is_segmented,
dbadmin(>                        node_cnt,
dbadmin(>                        SUM(ps.row_count - NVL(dv.deleted_row_count, 0)) row_count
dbadmin(>                   FROM tables t
dbadmin(>                   LEFT JOIN projection_storage ps
dbadmin(>                     ON t.table_id = ps.anchor_table_id
dbadmin(>                   LEFT JOIN delete_vectors dv
dbadmin(>                     ON dv.node_name = ps.node_name
dbadmin(>                    AND dv.schema_name = ps.anchor_table_schema
dbadmin(>                    AND dv.projection_name = ps.projection_name
dbadmin(>                  CROSS JOIN (SELECT COUNT(1) node_cnt FROM nodes WHERE node_type = 'PERMANENT') n
dbadmin(>                  GROUP BY 1, 2, 3, 4, 5
dbadmin(>                  LIMIT 1 OVER(PARTITION BY t.table_schema, t.table_name ORDER BY 1)) foo) foo2
dbadmin->  GROUP BY table_schema
dbadmin->  ORDER BY table_schema;
table_schema  | row_count
ctg_analytics |         0
dds           |     39759
idea_dwh      |     12288
jim1          |        16
public        |  71007066
stlouis       |  77396606
test          |         8
(7 rows)

But I’m not comfortable with this solution. What happens when Vertica inevitably introduces new/improved projection types? We may have to modify the query to handle those.

Instead of calculating a table's row counts from its projetions, I prefer to get actual row counts from the tables. To do that, we can generate a SQL statement that literally runs a SELECT COUNT(*) from each table in the database, summing the counts by schema:

dbadmin=> \! vsql -Atc "SELECT 'SELECT table_schema, SUM(table_row_count) total_row_count FROM ('UNION ALL SELECT 'SELECT ''' || table_schema || ''' AS table_schema, (SELECT COUNT(1) FROM ' || table_schema || '.' || table_name || ') AS table_row_count' || DECODE(LEAD(table_name) OVER (ORDER BY 1), NULL, ') foo GROUP BY table_schema ORDER BY table_schema;', ' UNION ALL') FROM tables WHERE table_definition = '';" | vsql
table_schema  | total_row_count
ctg_analytics |               0
dds           |           39759
idea_dwh      |           12288
jim1          |              16
public        |        71007066
stlouis       |        77396606
test          |               8
(7 rows)

This works as Vertica is pretty darn quick at performing table row counts!

Helpful Links:

Have fun!

Sign In or Register to comment.