Improve Disk Storage & Table Partitions
Hi,
I am new to this Vertica Project.
We have a cluster with 36 nodes and the free disk space in all the nodes are less than 20%.
I am trying to analyze all tables and its storage, would like to recommend some solutions.
Noticed following issues,
1. We have some data that gets uploaded to TABLE_DAILY on a daily basis, same data is summarized and stored in TABLE_WEEKLY & TABLE_MONTHLY (monthly summary).
TABLE_DAILY is partitioned by DATE (Occupies 10TB data)
TABLE_WEEKLY is partitioned by WEEK (Occupies 10TB data)
TABLE_MONTHLY is partitioned by MONTH (Occupies 10TB data)
WEEKLY & MONTHLY tables are created to improve the performance when tableau connects to pull data from Vertica.
Can I keep only one table TABLE_DAILY which is partitioned by DATE, WEEK & MONTH so that it improves performance when Tableau connects to Vertica and can save more space. Can you suggest some options please.
- We have a retention policy, to keep data in Vertica for 3 years. Can we create a Storage Policy and move all data which are older than 3 years?
Could you please suggest some options to improve some space in our DB.
Thanks in Advance.
Comments
Number 1:
For TABLE_DAILY, what do you mean by DATE? Like 11/13/2017, 11/14/2017, 11/15/2017, etc.? Or do you mean DAY (i.e. 0, 1, 2, 3, 4, 5, 6, 7)? If its by date than I'm sure you'd exceed the 1,024 ROS container limit as each partition in stored in its own container.
Vertica 9 has a great feature called "Hierarchical Partitioning" that could help you reduced the number of ROS containers used.
See:
https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/AdministratorsGuide/Partitions/HierarchicalPartitioning.htm
Number 2:
You can create a storage policies for less often queried data partitions (i.e. older dates).
See:
https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/AdministratorsGuide/StorageLocations/CreatingStoragePoliciesForOldData.htm
Thanks Jim.
**Number 1 **
TABLE_DAILY is partitioned by - The day of the year (1–365/366).
Could you please let me know whether can we have below 3 partitions in one table,
1. DOY
2. Year53 + Week
3. Year12 + Month
Certainly you can create those partitions, but I'd recommend against it. Take a peek at the following links:
https://my.vertica.com/kb/HPE-Vertica-Partitions-The-FAQs/Content/FAQs/HPE-Vertica-Partitions-The-FAQs.htm
https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/AdministratorsGuide/Partitions/BestPractices.htm
I'd highly recommend looking at "Hierarchical Partitioning" if you can!
https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/AdministratorsGuide/Partitions/BestPractices.htm
Thanks Jim. I will take a look.
You should check if the projections have a right encoding. Maybe just changing the encoding will reduce the storage footprint. You can see encoding in the projection_column table.
As Jim mentions take a look at Hierarchical partition and that can help to reduce the storage print taking advantage of the encoding.
https://my.vertica.com/docs/9.0.x/HTML/index.htm#Authoring/AdministratorsGuide/Partitions/HierarchicalPartitioning.htm?Highlight=hierarchy
Hi Emoreno,
1.Columns are encoded as RLE or AUTO. Would like to use DESIGNER_DESIGN_PROJECTION_ENCODINGS and take a look at the script for the suggested column encodings in the biggest table.
I ran the DESIGNER_DESIGN_PROJECTION_ENCODINGS, and looks like the table columns are encoded as per the recommendations. I don't see much difference in the encoding type between existing & recommended.
Is it possible to try BZIP / GZIP but it is not recommended(by designer) though in any of the columns.
Do you have any update on above post