Finding All Columns Storing Some Value
A long time ago a database developer asked me how she could produce a list of all table columns in the database which store a given value. Note that she was in the process of normalizing tables and wanted to find every fact table that referenced a particular string value to replace it with a dimension table look up. The solution we found in that database (i.e. Oracle) is also relevant today, only we get results much, much faster in Vertica!
I want to list all of the table columns in my database that contain at least one record equal to the text “Pennsylvania”. To do that, in vsql I can query the V_CATALOG.COLUMNS system table and use it to build SQL SELECT statements, one for every column in the database, that check the table column for a value that equals “Pennsylvania”. Once I’ve built those SQL statements, I’ll pipe them back into vsql to run.
The output is displayed as a list where each record has the format:
SCHEMA NAME.TABLE NAME.COLUMN NAME
dbadmin=> \! vsql -Atc "SELECT 'SELECT ''' || table_schema || '.' || table_name || '.' || column_name || ''' FROM dual WHERE EXISTS (SELECT NULL FROM ' || table_schema || '.' || table_name || ' WHERE \"' || column_name || '\"::' || CASE WHEN data_type_id IN (17, 115, 116, 117) THEN data_type ELSE 'VARCHAR' END || ' = ''Pennsylvania'');' FROM columns WHERE NOT is_system_table ORDER BY table_schema, table_name, ordinal_position;" | vsql -At local.state_dimension.state main_fact.sales.state public.states.name