Vertica Column Statistics : Profiling Data
Hi,
I need to fetch -
1- Top 100 Distinct Values for Each Column (Including Min & Max) .
I'm interested in findng out - as a result of Analyze statistics/ Histogram does vertica stored such information any where? If yes then which system table/view?
If such information isn't stored by Default by vertica is there any less tedius way to write such queries to collect statitics/ data profiling on columns ?
0
Comments
Hi ,
You basically have two main approch to get this info
Thanks
Thanks. Output of Export Statistics has information I need.
I had created queries using this sql and stored them in File. And Execute these queries to Store data in a Table with Another 'INSERT INTO XXXX ' appended around ..
Another QQ, Which Table information does EXPORT_STATISTICS function might be querying at.?
Column_Storage doesn't seem to store MAX, MIN values. ! And other solution you spoke about- Querying some Histogram storage table.. I couldn't find the table.. can you please write exact Name .. Thanks
SELECT DISTINCT '
|| ''''||TABLE_NAME||''''||' AS TABLE_NAME ,'
||''''||COLUMN_NAME||''''||' AS COLUMN_NAME ,'
||COLUMN_NAME||' AS COLUMN_VAL ,
MIN('||COLUMN_NAME||') OVER () AS MIN_VAL,
MAX('||COLUMN_NAME||
') OVER () AS MAX_VAL
FROM
CHIM_BDI.'||TABLE_NAME||'
ORDER BY '||COLUMN_NAME||
' DESC LIMIT 10 ) ;
COMMIT ; '
FROM
V_CATALOG.COLUMNS
WHERE
TABLE_SCHEMA='xxxxx'
AND TABLE_NAME LIKE 'xxxx'
AND COLUMN_NAME NOT LIKE 'xxx%'
AND COLUMN_NAME <>'xxx' AND DATA_TYPE NOT LIKE 'long varchar%' ;
Hi ,
The internal implementation of EXPORT_STATISTICS is not publish .
However Vertica probably querying the table i send you , in the previous replay (you need dbadmin previlage in order to access its content ) .
For Min & MAx , Vertica maintain it differently from what you probably know that exists on other DB products , in Vertica ( like Casandra ) the Min Max is managed on the level of the ROS file , to see this info (still you need dbadmin previlage in order to access its content) you can query select * from vs_ros_min_max_values
I hope this answers all your questions
Thanks