Options

Change datatype from identity to identity(1,1,1)

NitheshNithesh - Select Field -

Hi, 

we have created vertica tables with ID column having datatype Identity. The issue is everytime a session is closed or table truncated the ID value jumps to 250000, 500000 so  on. I understand that if we change the cache to 1, this wouldnt be an issue. 

So my question is can we alter the column to change datatype from identity to identity(1,1,1) ?

 

Comments

  • Options

     

    UPDATE: I end up writing blog post on this topic.

                    Portion of it the same as my answers in this topic, but it has some additional content to.

                    http://www.dbjungle.com/working-with-identity-columns-in-hp-vertica/

     

     

     

     

     

    Default values for IDENTITY is 1,1,250000 so 

     

    CREATE TABLE table_test(identity_column IDENTITY, data_column INT);

     

    will be same as 

     

    CREATE TABLE table_test(identity_column IDENTITY(1,1,250000), data_column INT);

    So to make it IDENTITY(1,1,1) all you need to change is last value, which is cache from 250000 to 1.

    Identity column in Vertica use sequence under hood so all you need to do is find apropriate sequence and alter it

    alter sequence public.table_test_identity_column_seq CACHE 1;

    This will have effect only on newly inserted data. Making this change will have no effect on data in the table.You cannot change the value of an IDENTITY column once data inserted into the table. If you insert more data it will continue from highest number and will be incrementing according to current sequence properties.

    If you want clean sequential numbers in existing data of  IDENTITY column you will need to recreate table and re-load data.

    Here is example:

     

    $ vsql
    Password:
    dbadmin=> CREATE TABLE table_test(identity_column IDENTITY, data_column INT);
    CREATE TABLE
    dbadmin=> insert into table_test (data_column) values (1);
    dbadmin=> insert into table_test (data_column) values (2);
    dbadmin=> commit;
    COMMIT
    dbadmin=> \q

     

    $ vsql
    Password:
    dbadmin=> insert into table_test (data_column) values (3);
    dbadmin=> insert into table_test (data_column) values (4);
    dbadmin=> insert into table_test (data_column) values (5);
    dbadmin=> commit;
    COMMIT
    dbadmin=> select * from table_test order by identity_column;
    identity_column | data_column
    -----------------+-------------
                   1 | 1
                   2 | 2
              250001 | 3
              250002 | 4
              250003 | 5
    (5 rows)

    dbadmin=> select sequence_schema,sequence_name,minimum,increment_by,session_cache_count from sequences where identity_table_name = 'table_test';

     

    sequence_schema | sequence_name | minimum | increment_by | session_cache_count
    -----------------+--------------------------------+---------+--------------+---------------------
    public | table_test_identity_column_seq | 1 | 1 | 250000

     

    dbadmin=> alter sequence public.table_test_identity_column_seq CACHE 1;
    ALTER SEQUENCE
    dbadmin=> select sequence_schema,sequence_name,minimum,increment_by,session_cache_count from sequences where identity_table_name = 'table_test';

     

    sequence_schema | sequence_name | minimum | increment_by | session_cache_count
    -----------------+--------------------------------+---------+--------------+---------------------
    public | table_test_identity_column_seq | 1 | 1 | 1

     

    dbadmin=> \q

     

    $ vsql
    Password:
    dbadmin=> insert into table_test (data_column) values (6);
    dbadmin=> insert into table_test (data_column) values (7);
    dbadmin=> commit;
    COMMIT
    dbadmin=> \q

     

    $ vsql
    Password:
    dbadmin=> insert into table_test (data_column) values (8);
    dbadmin=> insert into table_test (data_column) values (9);
    dbadmin=> insert into table_test (data_column) values (10);
    dbadmin=> commit;
    COMMIT
    dbadmin=> select * from table_test order by identity_column;
    identity_column | data_column
    -----------------+-------------
                   1 | 1
                   2 | 2
              250001 | 3
              250002 | 4
              250003 | 5
              500001 | 6
              500002 | 7
              500003 | 8
              500004 | 9
              500005 | 10
    (10 rows)

    dbadmin=>

  • Options
    NitheshNithesh - Select Field -

    Thank you for your answer K_Krutiy.

     

    I did alter the sequence to change the cache, but later had requirement of reseting the ID value to 1 whenever the table got truncated. For the moment i have replaced the Identity with Sequences and use :

    ALTER SEQUENCE seq_name RESTART WITH 1;

     

    Can we do something similar with IDENTITY columns? I didnt find any way to reseed/reset the IDENTITY column.

  • Options

    Be careful using an increment value of 1.  On a small table with small loads it's ok.  But if you do this on a larger table with large loads, this would dramatically slow down the loads and cause a lot of contention on GlobalCatalog locks.

     

      --Sharon

     

  • Options

     

    You can do same thing to table with identity

     

    1) truncate table

    2) alter underlying sequence

    3) insert new data

     

    dbadmin=> truncate table table_test;
    TRUNCATE TABLE
    dbadmin=> alter sequence public.table_test_identity_column_seq RESTART WITH 1;
    ALTER SEQUENCE
    dbadmin=> insert into table_test (data_column) values (11);
    dbadmin=> commit;
    COMMIT
    dbadmin=> select * from table_test order by identity_column;

    identity_column | data_column
    -----------------+-------------
                   1 | 11
    dbadmin=>

     

    Sharon brought very good point. Default value for cache set to 250000 for performance reasons.

     

    You need to find good balance between perfectly sequenced numbers in identity columns and performance.

    Data loading performance will suffer when cache is set to 1.

    In addition if you will be loading a lot of data you can see some impact to query performance too

  • Options

     

    I end up writing blog post on this topic.

     

    Portion of it same as my answers above, but it has some additional content to.

    Plus it is better structired.

     

    I think it will be beneficial to read it 

     

    http://www.dbjungle.com/working-with-identity-columns-in-hp-vertica/

  • Options
    NitheshNithesh - Select Field -

    Thank you for your help and time. For the moment i think i need to balance between the cache size and performance hit. This might not be a suitable use case for a vertica database.

    It would have been good if Vertica too had RESEED option like in SQL Server, or could simply start the AutoIncrement column from 1 on table truncation like in MySQL.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file