Vertica 7.2 - How to identify file associated with a Projection

Hi ,


   I am using VM Ware image Vertica 7.2, I am trying to find out the file for a speicific projection. I used the approach of using following meta data table mentioned in the link


select storage_type, storage_oid from v_monitor.storage_containers where projection_name = 'test1_node0001';


   This logic does not look like valid any more after Vertica 7.X version. Can you please share the correct logic for Vertica 7.2 , I am seeing .gt file instead of fdb or ipdx files. Can some one explain the logic of gt file also.





  • Hi


    Starting 7.2, Vertica uses a new layout. It got rid of second level of the directory hierarchy as well as reduced the number of files


    dbadmin=> select projection_id,projection_name,storage_type,storage_oid,sal_storage_id from storage_containers where projection_name = 't1_b0' and node_name = 'v_v720_node0001';
       projection_id   | projection_name | storage_type |    storage_oid    |                  sal_storage_id
     45035996273909840 | t1_b0           | ROS          | 45035996273909931 | 02129e327df2e7b8652c6a93107cde9600a00000000320ab
    (1 row)





    Hope this helps.




  • Hi Gayathri,/Team Thanks for nice explanation, Now I am able to see the Projection and corresponding physical file. With this I need one more clarification. I heard that Vertica stores file like Per Column Per Projection and Per node. If my understanding is not wrong (Correct me if it is wrong) the folder 931 is nothing but ROS Container, with in this I am expecting multiple file associated with each of the column, but I am seeing only one file. Can you explain whether this single gt file contain all the data of associated with the projection or is this some thing else. Regards, V. Narayanan.
  • Yes Vertica has column oriented on disk physical data structures. There can be many storage containers per projection on a node and each of them has the same number of column files.


    Each column file used to have two files, fdb and pidx, but now we store them together in a gt file.


    Moreover, if the size of a data file (fdb) falls below a configurable threshold (1MB by default), then it's stored together with other small files (1MB) in that storage container.


    Note that we are not changing column oriented format, just many small column files are stored together as one gt file.


    Thus this new disk layout reduces the number of files without impacting the benifits of column orientation. 


    The query you have shows the file which has the first column data file of that storage container (however size it shows is the total size of all data files in that container).

    That storage container could have many files. You have to look at v_internal.vs_ros to figure out what are the files are. 


    => select distinct(salstorageid) from v_internal.vs_ros where delid =45035996273909931;


    Note that "931", the folder name, doesn't uniquely identifies a storage container, it's the folder we chose based on part of it's storage id, but there can be many other storage container files stored under that directory.


  • Hi Tharanga/Gayathri, Thanks for detailed explanation , Now I got the point :). For Some reason (Webpage says Authentication failure) I am not able to treat this Accept as solution. You can very well treat your input as accepted solution from my end. Thanks for your timely help.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file