Disk space during projection refresh

I am currently unable to refresh a projection because it keeps running out of disk space. How much space is needed to refresh a projection? On each node (3 nodes), the projection occupies about 100GB compressed on disk, and each node has about 1TB free, yet refreshing that projection keeps consuming disk space until the 1TB per node is exhausted and the refresh fails.

 

Is a 10x expansion of data during a refresh normal? If so, how can we ever do a refresh without massively overprovisioning disk?

 

I also notice that Vertica is trying to refresh both buddy projections at the same time. Can we manually refresh one then the other? Maybe that will squeeze through.

Comments

  • HP recommends leaving at least 40% of the total disk storage free.

  • There is more than 90% free, but that didn't seem to be enough. In the end we just provisioned more storage. Refreshing the projection took around 15x the stored size in free space (1.5TB of free space for every 100GB stored on disk before the refresh). That seems rather high, but it worked.

  • If you have wide varchar columns, they will cause an explosion in temp space usage during projection refresh (and a performance hit during query execution). If you don't need them to be so wide, then alter the columns to be narrower.

      --Sharon

  • We have quite a few wide varchars. I tested this after importing to a more compact schema and it made a huge difference (around 10x). Huge improvement. Thanks.
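
For anyone trying to gauge the effect before altering columns, here is a rough back-of-the-envelope sketch in Python. It assumes, as the comments above suggest but as nothing in this thread confirms from Vertica documentation, that temp space during a refresh scales with the declared column width rather than the actual payload. The column names, declared widths, and typical payload sizes are all hypothetical; substitute your own schema.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str            # hypothetical column name
    declared_bytes: int  # declared width, e.g. VARCHAR(4000) -> 4000
    typical_bytes: int   # assumed average size of the stored value


def declared_row_bytes(columns):
    """Row width if every column is padded to its declared size."""
    return sum(c.declared_bytes for c in columns)


def typical_row_bytes(columns):
    """Row width using the assumed average payload sizes."""
    return sum(c.typical_bytes for c in columns)


def expansion_factor(columns):
    """Declared-width footprint relative to the typical payload."""
    return declared_row_bytes(columns) / typical_row_bytes(columns)


# Hypothetical schemas: same data, different declared widths.
wide = [
    Column("id", 8, 8),
    Column("note", 4000, 40),
    Column("url", 2000, 60),
]
narrow = [
    Column("id", 8, 8),
    Column("note", 100, 40),
    Column("url", 200, 60),
]

for label, cols in (("wide", wide), ("narrow", narrow)):
    print(f"{label}: declared={declared_row_bytes(cols)} B/row, "
          f"factor={expansion_factor(cols):.1f}x")
```

This is only an estimate of the worst case under the stated assumption, not a formula for Vertica's actual temp usage. It does show why narrowing a few oversized varchar columns can shrink the refresh footprint by a large factor even though the stored data is unchanged.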
