Column encoding change on vertica 10 not work
Hello,
I try change column encoding, but it doesn't work as expected.
My test:
I have table :
openx=> select export_objects('','test.test1');
export_objects
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CREATE TABLE test.test1
(
language_id int
);
CREATE PROJECTION test.test1 /*+createtype(P)*/
(
language_id
)
AS
SELECT test1.language_id
FROM test.test1
ORDER BY test1.language_id
SEGMENTED BY hash(test1.language_id) ALL NODES KSAFE 1;
Check size
SELECT
projection_name,
column_name,
encodings,
compressions,
ROUND(SUM(used_bytes)/(1024^2),1)::FLOAT AS size_Mb
FROM
column_storage
WHERE
anchor_table_name = 'test1'
and projection_schema = 'test'
-- and projection_name = 'mnt_compiled_stats_full_new_b0'
GROUP BY
1,2,3,4
ORDER BY
5 DESC;
projection_name | column_name | encodings | compressions | size_Mb
-----------------+-------------+--------------+--------------+---------
test1_b0 | language_id | Uncompressed | int delta | 7005
test1_b1 | language_id | Uncompressed | int delta | 7005
test1_b0 | epoch | Int_Delta | none | 41
test1_b1 | epoch | Int_Delta | none | 41
Tnen, i change encoding and execute mergeout
openx=> alter table test.test1 alter column language_id encoding BLOCK_DICT projections (test.test1_b0,test.test1_b1);
WARNING 9385: New encoding will not be applied to existing projection data of this table; this change will be applied only on mergeout
ALTER TABLE
openx=> select do_tm_task('mergeout','test.test1');
do_tm_task
-----------------------------------------------------------------------------------------------------------------
Task: mergeout
(Table: test.test1) (Projection: test.test1_b0)
(Table: test.test1) (Projection: test.test1_b1)
(1 row)
Check size again - nothing changed.
projection_name column_name encodings compressions size_Mb --------------- ----------- ------------ ------------ ------- test1_b0 language_id Uncompressed int delta 7005.0 test1_b1 language_id Uncompressed int delta 7005.0 test1_b0 epoch Int_Delta none 41.0 test1_b1 epoch Int_Delta none 41.0
Vertica vertsion
openx=> SELECT version();
version
-------------------------------------
Vertica Analytic Database v10.0.0-3
0
Answers
I tried your scenario in 10.0.1, and the result was as expected.
[Before]
[After]
Please check the encoding type has been changed on your projection on exported DDL exported by the following statement.
SELECT export_objects('','test.test1');When you ran the mergeout, did you see any error message in vertica.log file?
Hello @Hibiki,
openx=> select export_objects('','test.test1'); export_objects CREATE TABLE test.test1 ( language_id int ); CREATE PROJECTION test.test1 /*+createtype(P)*/ ( language_id ENCODING BLOCK_DICT ) AS SELECT test1.language_id FROM test.test1 ORDER BY test1.language_id SEGMENTED BY hash(test1.language_id) ALL NODES KSAFE 1;DDL projection is changed, but old data not re-encoding.
No, i didn't see, mergeout worked without errrors in vertica.log
Can you check the projection definitions and the histories of mergeout by using the following queries?
openx=> SELECT projection_name, projection_basename, is_up_to_date openx-> FROM projections openx-> WHERE projection_schema = 'test' AND anchor_table_name = 'test1' openx-> ORDER BY projection_name; projection_name | projection_basename | is_up_to_date -----------------+---------------------+--------------- test1_b0 | test1 | t test1_b1 | test1 | t (2 rows) openx=> SELECT projection_name, projection_column_name, data_type, encoding_type openx-> FROM projection_columns openx-> WHERE table_schema = 'test' AND table_name = 'test1' openx-> ORDER BY projection_name, projection_column_name; projection_name | projection_column_name | data_type | encoding_type -----------------+------------------------+-----------+--------------- test1_b0 | language_id | int | BLOCK_DICT test1_b1 | language_id | int | BLOCK_DICT (2 rows) openx=> openx=> SELECT node_name, projection_name, column_name, encodings, compressions openx-> FROM column_storage openx-> WHERE anchor_table_schema = 'test' AND anchor_table_name = 'test1' openx-> ORDER BY node_name, projection_name, column_name; node_name | projection_name | column_name | encodings | compressions ------------------+-----------------+-------------+--------------+-------------- v_openx_node0001 | test1_b1 | epoch | Int_Delta | none v_openx_node0001 | test1_b1 | language_id | Uncompressed | int delta v_openx_node0004 | test1_b0 | epoch | Int_Delta | none v_openx_node0004 | test1_b0 | language_id | Uncompressed | int delta (4 rows) openx=> SELECT operation_start_timestamp, node_name, projection_name, ros_count, total_ros_used_bytes, operation_status, is_executing openx-> FROM tuple_mover_operations openx-> WHERE operation_name = 'Mergeout' AND table_schema = 'test' AND table_name = 'test1' openx-> ORDER BY operation_start_timestamp DESC; operation_start_timestamp | node_name | projection_name | ros_count | total_ros_used_bytes | operation_status | is_executing -------------------------------+------------------+-----------------+-----------+----------------------+------------------+-------------- (0 rows) openx=> SELECT mr."time", mr.node_name, pr.projection_name, mr.request_type, mr.event_type openx-> FROM dc_mergeout_requests mr openx-> JOIN projections pr ON mr.projection_oid = pr.projection_id openx-> WHERE pr.anchor_table_name = 'test1' AND pr.projection_schema = 'test' openx-> ORDER BY mr."time" DESC; time | node_name | projection_name | request_type | event_type -------------------------------+------------------+-----------------+------------------+------------------- 2020-09-25 05:20:14.45868-05 | v_openx_node0004 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.45865-05 | v_openx_node0004 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.447868-05 | v_openx_node0001 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.44785-05 | v_openx_node0001 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440361-05 | v_openx_node0006 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440342-05 | v_openx_node0006 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440057-05 | v_openx_node0005 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440032-05 | v_openx_node0005 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440029-05 | v_openx_node0007 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439886-05 | v_openx_node0007 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439718-05 | v_openx_node0003 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439687-05 | v_openx_node0003 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.43919-05 | v_openx_node0008 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439161-05 | v_openx_node0008 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.437716-05 | v_openx_node0002 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.437698-05 | v_openx_node0002 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.362344-05 | v_openx_node0004 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.362302-05 | v_openx_node0001 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.362216-05 | v_openx_node0001 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.36221-05 | v_openx_node0004 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361829-05 | v_openx_node0006 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361802-05 | v_openx_node0003 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361782-05 | v_openx_node0007 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.36178-05 | v_openx_node0005 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361763-05 | v_openx_node0008 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361717-05 | v_openx_node0006 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.3617-05 | v_openx_node0003 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361683-05 | v_openx_node0008 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361682-05 | v_openx_node0007 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361673-05 | v_openx_node0005 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.360517-05 | v_openx_node0002 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.360433-05 | v_openx_node0002 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED (32 rows)When i execute next insert in this table and run mergeout - TM re-encoding
openx=> insert /* +direct*/ into test.test1 openx-> select language_id from test.test_full limit 10000000; OUTPUT ---------- 10000000 (1 row) openx=> commit; COMMIT openx=> select do_tm_task('mergeout','test.test1'); do_tm_task ----------------------------------------------------------------------------------------------------------------- Task: mergeout (Table: test.test1) (Projection: test.test1_b0) (Table: test.test1) (Projection: test.test1_b1) (1 row) openx=> SELECT node_name, projection_name, column_name, encodings, compressions openx-> FROM column_storage openx-> WHERE anchor_table_schema = 'test' AND anchor_table_name = 'test1' openx-> ORDER BY node_name, projection_name, column_name; node_name | projection_name | column_name | encodings | compressions ------------------+-----------------+-------------+------------+-------------- v_openx_node0001 | test1_b1 | epoch | Int_Delta | none v_openx_node0001 | test1_b1 | language_id | Block_Dict | none v_openx_node0004 | test1_b0 | epoch | Int_Delta | none v_openx_node0004 | test1_b0 | language_id | Block_Dict | none (4 rows) openx=> SELECT operation_start_timestamp, node_name, projection_name, ros_count, total_ros_used_bytes, operation_status, is_executing openx-> FROM tuple_mover_operations openx-> WHERE operation_name = 'Mergeout' AND table_schema = 'test' AND table_name = 'test1' openx-> ORDER BY operation_start_timestamp DESC; operation_start_timestamp | node_name | projection_name | ros_count | total_ros_used_bytes | operation_status | is_executing -------------------------------+------------------+-----------------+-----------+----------------------+------------------+-------------- 2020-09-25 05:26:03.602269-05 | v_openx_node0001 | test1_b1 | 2 | 10183168 | Complete | f 2020-09-25 05:26:03.454343-05 | v_openx_node0004 | test1_b0 | 2 | 10183168 | Complete | f 2020-09-25 05:26:02.336911-05 | v_openx_node0004 | test1_b0 | 2 | 10183168 | Start | f 2020-09-25 05:26:02.336263-05 | v_openx_node0001 | test1_b1 | 2 | 10183168 | Start | f (4 rows) openx=> SELECT mr."time", mr.node_name, pr.projection_name, mr.request_type, mr.event_type openx-> FROM dc_mergeout_requests mr openx-> JOIN projections pr ON mr.projection_oid = pr.projection_id openx-> WHERE pr.anchor_table_name = 'test1' AND pr.projection_schema = 'test' openx-> ORDER BY mr."time" DESC; time | node_name | projection_name | request_type | event_type -------------------------------+------------------+-----------------+------------------+------------------- 2020-09-25 05:26:03.602291-05 | v_openx_node0001 | test1_b1 | MERGEOUT | REQUEST_COMPLETED 2020-09-25 05:26:03.454363-05 | v_openx_node0004 | test1_b0 | MERGEOUT | REQUEST_COMPLETED 2020-09-25 05:20:14.45868-05 | v_openx_node0004 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.45865-05 | v_openx_node0004 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.447868-05 | v_openx_node0001 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.44785-05 | v_openx_node0001 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440361-05 | v_openx_node0006 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440342-05 | v_openx_node0006 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440057-05 | v_openx_node0005 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440032-05 | v_openx_node0005 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.440029-05 | v_openx_node0007 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439886-05 | v_openx_node0007 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439718-05 | v_openx_node0003 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439687-05 | v_openx_node0003 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.43919-05 | v_openx_node0008 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.439161-05 | v_openx_node0008 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.437716-05 | v_openx_node0002 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.437698-05 | v_openx_node0002 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_COMPLETED 2020-09-25 05:20:14.362344-05 | v_openx_node0004 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.362302-05 | v_openx_node0001 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.362216-05 | v_openx_node0001 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.36221-05 | v_openx_node0004 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361829-05 | v_openx_node0006 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361802-05 | v_openx_node0003 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361782-05 | v_openx_node0007 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.36178-05 | v_openx_node0005 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361763-05 | v_openx_node0008 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361717-05 | v_openx_node0006 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.3617-05 | v_openx_node0003 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361683-05 | v_openx_node0008 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361682-05 | v_openx_node0007 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.361673-05 | v_openx_node0005 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.360517-05 | v_openx_node0002 | test1_b1 | RECOMPUTE_LIMITS | REQUEST_QUEUED 2020-09-25 05:20:14.360433-05 | v_openx_node0002 | test1_b0 | RECOMPUTE_LIMITS | REQUEST_QUEUED (34 rows)And column size changed
Thank you for sharing the detailed information. I guess Mergeout you ran after changing the column encoding didn't need to merge the ROS containers. Can you check the number of ROS files when you run Mergeout?
[Original status for each column on each node] projection_name | column_name | encodings | compressions | node_name | ros_count | size_Mb -----------------+-------------+--------------+--------------+------------------+-----------+--------- test1_b0 | epoch | Int_Delta | none | v_trial_node0001 | 38 | 1.5 test1_b0 | epoch | Int_Delta | none | v_trial_node0002 | 38 | 1.5 test1_b0 | epoch | Int_Delta | none | v_trial_node0003 | 38 | 1.5 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0001 | 38 | 3.2 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0002 | 38 | 3.1 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0003 | 38 | 3.2 test1_b1 | epoch | Int_Delta | none | v_trial_node0001 | 38 | 1.5 test1_b1 | epoch | Int_Delta | none | v_trial_node0002 | 38 | 1.5 test1_b1 | epoch | Int_Delta | none | v_trial_node0003 | 38 | 1.5 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0001 | 38 | 3.2 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0002 | 38 | 3.2 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0003 | 38 | 3.1 => SELECT do_tm_task('mergeout', 'test.test1'); [The status after Mergeout without changing the column encoding] projection_name | column_name | encodings | compressions | node_name | ros_count | size_Mb -----------------+-------------+--------------+--------------+------------------+-----------+--------- test1_b0 | epoch | Int_Delta | none | v_trial_node0001 | 1 | 2.8 test1_b0 | epoch | Int_Delta | none | v_trial_node0002 | 1 | 2.8 test1_b0 | epoch | Int_Delta | none | v_trial_node0003 | 1 | 2.8 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0001 | 1 | 3.2 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0002 | 1 | 3.1 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0003 | 1 | 3.2 test1_b1 | epoch | Int_Delta | none | v_trial_node0001 | 1 | 2.8 test1_b1 | epoch | Int_Delta | none | v_trial_node0002 | 1 | 2.8 test1_b1 | epoch | Int_Delta | none | v_trial_node0003 | 1 | 2.8 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0001 | 1 | 3.2 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0002 | 1 | 3.2 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0003 | 1 | 3.1 => ALTER TABLE test.test1 ALTER COLUMN language_id ENCODING BLOCK_DICT PROJECTIONS (test.test1_b0,test.test1_b1); => SELECT do_tm_task('mergeout', 'test.test1'); [The status after Mergeout after changing the column encoding] projection_name | column_name | encodings | compressions | node_name | ros_count | size_Mb -----------------+-------------+--------------+--------------+------------------+-----------+--------- test1_b0 | epoch | Int_Delta | none | v_trial_node0001 | 1 | 2.8 test1_b0 | epoch | Int_Delta | none | v_trial_node0002 | 1 | 2.8 test1_b0 | epoch | Int_Delta | none | v_trial_node0003 | 1 | 2.8 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0001 | 1 | 3.2 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0002 | 1 | 3.1 test1_b0 | language_id | Uncompressed | int delta | v_trial_node0003 | 1 | 3.2 test1_b1 | epoch | Int_Delta | none | v_trial_node0001 | 1 | 2.8 test1_b1 | epoch | Int_Delta | none | v_trial_node0002 | 1 | 2.8 test1_b1 | epoch | Int_Delta | none | v_trial_node0003 | 1 | 2.8 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0001 | 1 | 3.2 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0002 | 1 | 3.2 test1_b1 | language_id | Uncompressed | int delta | v_trial_node0003 | 1 | 3.1 => INSERT INTO test.test1 SELECT * FROM test.test1; => COMMIT; => SELECT do_tm_task('mergeout', 'test.test1'); [The status after Mergeout after loading some data into projections] projection_name | column_name | encodings | compressions | node_name | ros_count | size_Mb -----------------+-------------+------------+--------------+------------------+-----------+--------- test1_b0 | epoch | Int_Delta | none | v_trial_node0001 | 1 | 5.7 test1_b0 | epoch | Int_Delta | none | v_trial_node0002 | 1 | 5.5 test1_b0 | epoch | Int_Delta | none | v_trial_node0003 | 1 | 5.6 test1_b0 | language_id | Block_Dict | none | v_trial_node0001 | 1 | 5.1 test1_b0 | language_id | Block_Dict | none | v_trial_node0002 | 1 | 5 test1_b0 | language_id | Block_Dict | none | v_trial_node0003 | 1 | 5.1 test1_b1 | epoch | Int_Delta | none | v_trial_node0001 | 1 | 5.6 test1_b1 | epoch | Int_Delta | none | v_trial_node0002 | 1 | 5.7 test1_b1 | epoch | Int_Delta | none | v_trial_node0003 | 1 | 5.5 test1_b1 | language_id | Block_Dict | none | v_trial_node0001 | 1 | 5.1 test1_b1 | language_id | Block_Dict | none | v_trial_node0002 | 1 | 5.1 test1_b1 | language_id | Block_Dict | none | v_trial_node0003 | 1 | 5