Best practice for Data Aging / Deleting old data
Vertica Customer ✭
What is the best practice for cleaning up old data automatically? In our design, we have large fact tables partitioned by a day id, and a cron job that runs once a day to delete the data in partitions that have aged out. At the same time, we also want to monitor disk storage and drop partitions accordingly. What would be the best strategy for that?
Can we use the DISK_STORAGE table for this?
The best practice is to do what you already do: schedule a cron job that calls the DROP_PARTITIONS function to drop old partitions.
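As a rough sketch of what such a daily job could look like, the shell helpers below compute the day id that has just aged out and build the matching DROP_PARTITIONS statement. The table name public.fact_events, the 90-day retention, and the YYYYMMDD day id format are assumptions for illustration only, and the date arithmetic relies on GNU date.

```shell
#!/bin/sh
# Hypothetical defaults: adjust to your own schema and retention policy.
TABLE="${TABLE:-public.fact_events}"
RETENTION_DAYS="${RETENTION_DAYS:-90}"

# Print the day id (assumed format: YYYYMMDD) that lies a given number of
# days before a reference date. Requires GNU date for the -d option.
#   $1 = reference date (YYYY-MM-DD), $2 = number of days to go back
cutoff_day_id() {
    TZ=UTC date -d "$1 $2 days ago" +%Y%m%d
}

# Print a statement that drops the single partition whose key is $2.
# DROP_PARTITIONS takes a min and a max key; both are set to the same day.
#   $1 = table name, $2 = day id
drop_partition_sql() {
    printf "SELECT DROP_PARTITIONS('%s', '%s', '%s');\n" "$1" "$2" "$2"
}
```

A cron-driven wrapper script could then call drop_partition_sql "$TABLE" "$(cutoff_day_id "$(date +%F)" "$RETENTION_DAYS")" and pass the resulting statement to vsql with its -c option. Dropping one day per run keeps the range trivial; a wider min/max range would also catch days that a failed run missed.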
There may be a delay between the time you drop partitions and the time the change is reflected in DISK_STORAGE.
For an up-to-date check of disk usage, use the Linux commands df -h (usage per filesystem)
or du -sh /file path (total size of a file or directory)
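If the cron job should also react to disk usage, one possible approach is to read the usage percentage from df and compare it with a threshold before dropping additional partitions. The sketch below uses df -P (the POSIX output format, one line per filesystem); the function names and the idea of a percentage threshold are assumptions for illustration, not something Vertica provides.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print the Capacity column (use %) of the
# first filesystem line, without the trailing percent sign.
parse_use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print the use % of the filesystem that contains the path in $1.
disk_use_percent() {
    df -P "$1" | parse_use_percent
}

# Succeed (exit status 0) when the filesystem containing $1 is at or above
# the threshold in $2, given as a whole-number percentage.
over_threshold() {
    [ "$(disk_use_percent "$1")" -ge "$2" ]
}
```

For example, over_threshold /file path 85 could guard an extra drop step, where the path is your Vertica data directory and 85 is an arbitrary threshold. Because df is read directly from the operating system, this check is not affected by any delay in DISK_STORAGE.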