Options

Vertica boolean storage size over int

I'm curious as to why an int takes less storage than a boolean:
CREATE TABLE tbl (    b boolean,    i int  );    INSERT INTO tbl (b, i) VALUES (true, 1);  INSERT INTO tbl (b, i) VALUES (false, 0);  INSERT INTO tbl (b, i) VALUES (false, 0);  INSERT INTO tbl (b, i) VALUES (true, 1);  INSERT INTO tbl (b, i) VALUES (false, 0);
After moving those to ROS, the storage looks like:
SELECT column_name, compressions, ros_used_bytes FROM v_monitor.column_storage WHERE anchor_table_name = 'tbl' AND column_name <> 'epoch';     column_name | compressions | ros_used_bytes  -------------+--------------+----------------   b           | lzo          |             57   i           | int delta    |              9
Changing the compression on the boolean column to BLOCKDICT_COMP, there's some savings in storage:
CREATE PROJECTION public.tbl_proj  (   b ENCODING BLOCKDICT_COMP,   i  )  AS   SELECT tbl.b,          tbl.i   FROM public.tbl   ORDER BY tbl.b  UNSEGMENTED ALL NODES;    SELECT REFRESH('public.tbl');    SELECT column_name, compressions, ros_used_bytes FROM v_monitor.column_storage WHERE anchor_table_name = 'tbl' AND column_name <> 'epoch' AND projection_name = 'tbl_proj_node0001';     column_name |   compressions   | ros_used_bytes  -------------+------------------+----------------   b           | block dictionary |             22   i           | int delta        |              9
Should int be used for true/false values?

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file