Change datatype from identity to identity(1,1,1)
Nithesh
- Select Field - ✭
Hi,
we have created vertica tables with ID column having datatype Identity. The issue is everytime a session is closed or table truncated the ID value jumps to 250000, 500000 so on. I understand that if we change the cache to 1, this wouldnt be an issue.
So my question is can we alter the column to change datatype from identity to identity(1,1,1) ?
0
Comments
UPDATE: I end up writing blog post on this topic.
Portion of it the same as my answers in this topic, but it has some additional content to.
http://www.dbjungle.com/working-with-identity-columns-in-hp-vertica/
Default values for IDENTITY is 1,1,250000 so
CREATE TABLE table_test(identity_column IDENTITY, data_column INT);
will be same as
CREATE TABLE table_test(identity_column IDENTITY(1,1,250000), data_column INT);
So to make it IDENTITY(1,1,1) all you need to change is last value, which is cache from 250000 to 1.
Identity column in Vertica use sequence under hood so all you need to do is find apropriate sequence and alter it
alter sequence public.table_test_identity_column_seq CACHE 1;
This will have effect only on newly inserted data. Making this change will have no effect on data in the table.You cannot change the value of an IDENTITY column once data inserted into the table. If you insert more data it will continue from highest number and will be incrementing according to current sequence properties.
If you want clean sequential numbers in existing data of IDENTITY column you will need to recreate table and re-load data.
Here is example:
$ vsql
Password:
dbadmin=> CREATE TABLE table_test(identity_column IDENTITY, data_column INT);
CREATE TABLE
dbadmin=> insert into table_test (data_column) values (1);
dbadmin=> insert into table_test (data_column) values (2);
dbadmin=> commit;
COMMIT
dbadmin=> \q
$ vsql
Password:
dbadmin=> insert into table_test (data_column) values (3);
dbadmin=> insert into table_test (data_column) values (4);
dbadmin=> insert into table_test (data_column) values (5);
dbadmin=> commit;
COMMIT
dbadmin=> select * from table_test order by identity_column;
identity_column | data_column
-----------------+-------------
1 | 1
2 | 2
250001 | 3
250002 | 4
250003 | 5
(5 rows)
dbadmin=> select sequence_schema,sequence_name,minimum,increment_by,session_cache_count from sequences where identity_table_name = 'table_test';
sequence_schema | sequence_name | minimum | increment_by | session_cache_count
-----------------+--------------------------------+---------+--------------+---------------------
public | table_test_identity_column_seq | 1 | 1 | 250000
dbadmin=> alter sequence public.table_test_identity_column_seq CACHE 1;
ALTER SEQUENCE
dbadmin=> select sequence_schema,sequence_name,minimum,increment_by,session_cache_count from sequences where identity_table_name = 'table_test';
sequence_schema | sequence_name | minimum | increment_by | session_cache_count
-----------------+--------------------------------+---------+--------------+---------------------
public | table_test_identity_column_seq | 1 | 1 | 1
dbadmin=> \q
$ vsql
Password:
dbadmin=> insert into table_test (data_column) values (6);
dbadmin=> insert into table_test (data_column) values (7);
dbadmin=> commit;
COMMIT
dbadmin=> \q
$ vsql
Password:
dbadmin=> insert into table_test (data_column) values (8);
dbadmin=> insert into table_test (data_column) values (9);
dbadmin=> insert into table_test (data_column) values (10);
dbadmin=> commit;
COMMIT
dbadmin=> select * from table_test order by identity_column;
identity_column | data_column
-----------------+-------------
1 | 1
2 | 2
250001 | 3
250002 | 4
250003 | 5
500001 | 6
500002 | 7
500003 | 8
500004 | 9
500005 | 10
(10 rows)
dbadmin=>
Thank you for your answer K_Krutiy.
I did alter the sequence to change the cache, but later had requirement of reseting the ID value to 1 whenever the table got truncated. For the moment i have replaced the Identity with Sequences and use :
ALTER SEQUENCE seq_name RESTART WITH 1;
Can we do something similar with IDENTITY columns? I didnt find any way to reseed/reset the IDENTITY column.
Be careful using an increment value of 1. On a small table with small loads it's ok. But if you do this on a larger table with large loads, this would dramatically slow down the loads and cause a lot of contention on GlobalCatalog locks.
--Sharon
You can do same thing to table with identity
1) truncate table
2) alter underlying sequence
3) insert new data
dbadmin=> truncate table table_test;
TRUNCATE TABLE
dbadmin=> alter sequence public.table_test_identity_column_seq RESTART WITH 1;
ALTER SEQUENCE
dbadmin=> insert into table_test (data_column) values (11);
dbadmin=> commit;
COMMIT
dbadmin=> select * from table_test order by identity_column;
identity_column | data_column
-----------------+-------------
1 | 11
dbadmin=>
Sharon brought very good point. Default value for cache set to 250000 for performance reasons.
You need to find good balance between perfectly sequenced numbers in identity columns and performance.
Data loading performance will suffer when cache is set to 1.
In addition if you will be loading a lot of data you can see some impact to query performance too
I end up writing blog post on this topic.
Portion of it same as my answers above, but it has some additional content to.
Plus it is better structired.
I think it will be beneficial to read it
http://www.dbjungle.com/working-with-identity-columns-in-hp-vertica/
Thank you for your help and time. For the moment i think i need to balance between the cache size and performance hit. This might not be a suitable use case for a vertica database.
It would have been good if Vertica too had RESEED option like in SQL Server, or could simply start the AutoIncrement column from 1 on table truncation like in MySQL.