COPY statement generates "ERROR 4704: Sequence exceeded max value" message

Issuing a COPY statement from a Vertica cluster node results in an error
ERROR 4704:  Sequence exceeded max value
I can't find any information on the cause of this  error or troubleshooting steps.

The COPY statement looks like this:
copy benchmarks.fake_requests ( ip, url, ua ) from  '/tmp/data/???.gz' gzip 
ENCLOSED BY E'"'  
delimiter E'\t' 
direct ; 
/tmp/data contains tab delimited files like xaa.gz, xab.gz, etc.

the data looks like

"46.62.215.*" "http://www.west.com/"; "Mozilla/5.0 (Windows NT 6.1; sl-SI; rv:1.9.1.20) Gecko/2010-08-29 03:33:05 Firefox/3.8"

"240.212.193.*" "http://www.paucek.com/"; "Mozilla/5.0 (compatible; MSIE 6.0; Windows 98; Trident/3.1)"

"62.14.30.*" "http://crist-hoeger.biz/"; "Mozilla/5.0 (compatible; MSIE 5.0; Windows NT 5.0; Trident/5.0)"

"86.204.99.*" "http://abshire.com/"; "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.2; Trident/4.0)"

The table is created like this:
create schema if not exists benchmarks;    set search_path=benchmarks,public;    drop table if exists fake_requests;    drop sequence if exists fake_requests_seq;    create sequence fake_requests_seq cache 250000;    create table fake_requests (   id int not null default nextval('fake_requests_seq'),   ip varchar(15) not null,   url varchar(256) not null,   ua varchar(1024) null,     primary key (id)  ) segmented by modularhash(id) all nodes;
Any idea why I am seeing this error?    I don't see the error when using COPY FROM LOCAL option, only when I copy from the node.   Also, I see the error whether I try to load multiple files by wildcard or a single file using an explicit file name.

Thanks
David

Comments

  • marcothesanemarcothesane - Select Field - Administrator

    Hi David -

    Your table has an ID column populated by default by a sequence
    (id int not null default nextval('fake_requests_seq')  ).

    It's your fake_requests_seq sequence that overruns.

    Have you dropped the table several times without dropping /re-initializing the sequence? Or do you indeed have such a humongous data file with so many rows that the sequence overruns from scratch?

    If your table is not empty, then run:

    select max(id) from fake_requests;

    Then, run:

    select fake_requests_seq.nextval;

    ... and decide by yourself how many numbers the sequence has left for you ...

    Come to think of it: it also depends on how many ???..gz files you have. I think I remember that the COPY command would run several parallel sessions, one per file. And your sequence cache is at 250,000. Each thread that is instantiated for a *gz. file will begin the sequence at another multiple of 250,000.

    Good luck -

    marco

  • Navin_CNavin_C Vertica Customer
    Hi David,

    Its a clear case of your sequence running over the maximum limit for the cluster.
    You can refer to these posts here to have more understanding on using sequences in Vertica.

    Using Sequences

    Hope this helps.
    NC
  • Thanks for your help / insights.   It makes sense what you're saying, however, what I am seeing doesn't quite agree.
    • I am dropping and creating the table and the sequence on each attempt.
    • I am using the default sequence cache setting of 250K.   However, I see the same error even when using a sequence cache setting of 10K.
    • There are 5 files with 100K rows each for a total of 500K rows.
    • Even trying to load just one 100K row file causes this error.
    If I do a \ds in vsql, I get this output, so no indication that we're even close to the Maximum value for the sequence.
    dbadmin=> \ds fake_requests_seq 

                                                     List of Sequences

       Schema   |     Sequence      | CurrentValue | IncrementBy | Minimum |       Maximum       | AllowCycle | Comment 

    ------------+-------------------+--------------+-------------+---------+---------------------+------------+---------

     benchmarks | fake_requests_seq |        10000 |           1 |       1 | 9223372036854775807 | f          |  
    If I look at nextval after the error, also not even close.   

    dbadmin=> select fake_requests_seq.nextval;

     nextval 

    ---------

        4346

    I can provide copies of the code I am using to reproduce this if you're interested.


    David

  • Thanks for your response.  I found your two articles helpful, but am still not sure where I am going wrong.

    Given that I am using the default of 250000 for the sequence cache setting, and that I am loading only a modest quantity of data: 500,000 rows using 5 x 100,000 row files (on a single node cluster), even if one batch of sequence numbers is allocated per file per host, it doesn't make sense that I am bumping up against the maximum.

    So, two questions:

    1) Am I misunderstanding the calculation of sequence number allocation?
    2) What the expected use case or best practice should be with regard to use of sequences and COPY.   

    Thanks much
    David
  • I would like to mention that we are also seeing the same error during our COPY statements, but sporadically.  We too are loading a modest amount of data, and we have queried the sequence max value following the error and determined that was only at 250,000 or 500,000.
  • Hi!

    To everyone who see this error please check Global Sequence value in catalog.
    If it is too big, run:
    select clear_all_sequence_cache();

    PS
    (Im not sure for solution, but it helped to me)

    I did a clear install, created a new db and found in my catalog:
    %> pwd
    /vertica/dev/v_dev_node0001_catalog/Catalog
    %> ls
    Checkpoints config.cat globalsequence.cat tiered_catalog.cat Txnlogs
    %> cat globalsequence.cat
    :GlobalSequence
    nodeID:45035996273704960
    tagLength:12
    fence:52
    nodeIDMask:18442240474082181120
    ceiling:4503599627370495
    currentSequenceValue:19405
    After clear (clear_all_sequences_cache):
    %> cat sequencegenerator.cat 
    :GlobalSequence
    nodeID:45035996273704960
    tagLength:12
    fence:52
    nodeIDMask:18442240474082181120
    ceiling:4503599627370495
    currentSequenceValue:295
    .
    And no more fails with error: Sequence exceeded max value
  • There is a known bug in recent versions of Vertica that could be affecting you.  (Ask your friendly support rep about VER-35620.)

    If you turn off parallelism, it will probably go away.
  • Hi David,
    Thanks for reporting the problem. The bug is resolved in 7.1.1-3.
    Here is a link to the release notes:
    http://my.vertica.com/docs/ReleaseNotes/7.1.x/HP_Vertica_7.1.x_Release_Notes.htm

    - Shivani

  • I have same error as well. How can I install the version 7.1.1-4 ? I could not find it in the myVertica/Downloads? 

    Thanks for help.
  • Hi!

    Look, even not all drivers available to CE users and you want a last fix/patch? :))

    Forget about it, its for EE customers only or wait until it will be available to CE users (at least about a half year delay between last CE and EE versions). If you are CE user, so keep going to get an error. CE users not important for HP.
  • You can reach out to support to try and get the latest patch. While in the past there has been a larger gap between the EE & CE availability, CE users can download 7.1.1 (that's a 2 month gap, not 6 months).
  • Hi!
    that's a 2 month gap, not 6 months
    Oh, no, its only two month!  Im ready for this sort of trade off - every time when Vertica founds a bug and posts a fix I will wait for fix at least 2 months.

    PS
    2 months or half year... doesn't matter... project is active now.
    I don't think that customer/user will be happy  with  "wait 2 month" (and still its not 100%, may be a close to half year too).
  • Well, that's certainly your opinion. I know of customers that have restrictions on updating freshly released patches. As I mentioned earlier, customers can reach out to support to try to get the latest patch.

    The same version is available for download for both CE & EE users at my.vertica.com.
  • :))))
    that have restrictions on updating freshly released patches
    Not in this case, user explicitly asks for patch.
    As I mentioned earlier, customers can reach out to support to try to get the latest patch.
    And 99% will get an answer (precedents have been)  - use in what you can download. I can cite some employees like Adam Seering and others.


    And really, I can't understand it - if I can ask for patch from Support and friendly support will give me a patch, so why do not put in public? Why I need to ask? That Support will feel self-importance?

    You are not so smart.
  • Norbert Krupa
    Im fighting for community in all ways(and don't care about my image) - that community will get a full support, last drivers, last patches, will not be ignored, etc. And what you do? Are you part of community? Or you are a spy?

    By defending a poor support you are actually against a community.
  • How can I "turn off" parallelism just for a single session?

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file