Heuristic for choosing between Compressed Common Delta encoding and RLE encoding

I've noticed that the Database Designer has assigned both RLE and Compressed Common Delta (COMMONDELTA_COMP) encodings to integer key columns of dimension tables in my dimensional model (a standard Kimball star schema). Having reviewed which columns were given COMMONDELTA_COMP rather than RLE, and how those columns are used in the supplied queries, I can't see why one was chosen over the other. For instance, for columns where every value is distinct and the values form an ordered sequence of natural numbers, COMMONDELTA_COMP is used in most cases, but in some cases RLE is applied, even though the maximum length of any run must be 1.
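To make the observation concrete, here is a minimal Python sketch (not Vertica code, and not how the Database Designer works internally) of what such a leaf-level key column looks like to a run-length scheme versus a delta scheme. The helper names `run_lengths` and `deltas` and the sample column are hypothetical, chosen only for illustration.

```python
from itertools import groupby

def run_lengths(values):
    """Return (value, run_length) pairs for consecutive equal values."""
    return [(v, sum(1 for _ in group)) for v, group in groupby(values)]

def deltas(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]

# Hypothetical leaf-level key column: distinct, ordered natural numbers.
leaf_keys = list(range(1, 11))

runs = run_lengths(leaf_keys)
max_run = max(length for _, length in runs)   # longest run of equal values
distinct_deltas = set(deltas(leaf_keys))      # distinct step sizes

print(max_run, len(runs), distinct_deltas)
```

Every run has length 1, so a run-length scheme stores one entry per row, whereas the column has a single common delta between consecutive values.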

Can you supply a heuristic for determining which encoding type to use for a column, in particular when choosing encodings for dimension attribute key columns? Currently I am assigning COMMONDELTA_COMP to leaf-level keys (where every value is unique) and RLE to the keys of attributes that are not at leaf level (where repeated values exist).
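The rule described above can be written down as a short Python sketch. This only restates the manual rule from this post; it is an assumption-laden illustration, not Vertica's own selection logic, and the function name `choose_encoding` is hypothetical.

```python
def choose_encoding(key_values):
    """Hypothetical helper restating the manual rule from this post.

    Leaf-level keys (every value unique) get COMMONDELTA_COMP;
    keys with repeated values get RLE.
    """
    values = list(key_values)
    if len(set(values)) == len(values):
        return "COMMONDELTA_COMP"  # no repeats: every run would have length 1
    return "RLE"                   # repeats exist: runs can be collapsed

# Hypothetical sample columns, already sorted.
leaf_choice = choose_encoding([1, 2, 3, 4, 5])
parent_choice = choose_encoding([10, 10, 10, 20, 20])
print(leaf_choice, parent_choice)
```

The sketch looks only at whether values repeat; it ignores column sort position, data type, and query usage, which is exactly the part I would like guidance on.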

Does the choice of encoding affect the choice of ordering?

Additionally, if any detailed information on choosing encoding types in Vertica is available, links to it would be appreciated.
