COPY statement generates "ERROR 4704: Sequence exceeded max value" message

David_Bernstein · October 2014

Issuing a COPY statement from a Vertica cluster node results in an error

ERROR 4704:  Sequence exceeded max value

I can't find any information on the cause of this error or troubleshooting steps.

The COPY statement looks like this:

copy benchmarks.fake_requests ( ip, url, ua ) from  '/tmp/data/???.gz' gzip 
ENCLOSED BY E'"'  
delimiter E'\t' 
direct ;

/tmp/data contains tab delimited files like xaa.gz, xab.gz, etc.

the data looks like

"46.62.215.*" "http://www.west.com/"; "Mozilla/5.0 (Windows NT 6.1; sl-SI; rv:1.9.1.20) Gecko/2010-08-29 03:33:05 Firefox/3.8"

"240.212.193.*" "http://www.paucek.com/"; "Mozilla/5.0 (compatible; MSIE 6.0; Windows 98; Trident/3.1)"

"62.14.30.*" "http://crist-hoeger.biz/"; "Mozilla/5.0 (compatible; MSIE 5.0; Windows NT 5.0; Trident/5.0)"

"86.204.99.*" "http://abshire.com/"; "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.2; Trident/4.0)"

The table is created like this:

create schema if not exists benchmarks;    set search_path=benchmarks,public;    drop table if exists fake_requests;    drop sequence if exists fake_requests_seq;    create sequence fake_requests_seq cache 250000;    create table fake_requests (   id int not null default nextval('fake_requests_seq'),   ip varchar(15) not null,   url varchar(256) not null,   ua varchar(1024) null,     primary key (id)  ) segmented by modularhash(id) all nodes;

Any idea why I am seeing this error? I don't see the error when using COPY FROM LOCAL option, only when I copy from the node. Also, I see the error whether I try to load multiple files by wildcard or a single file using an explicit file name.

Thanks
David

marcothesane · October 2014

Hi David -

Your table has an ID column populated by default by a sequence
(id int not null default nextval('fake_requests_seq') ).

It's your fake_requests_seq sequence that overruns.

Have you dropped the table several times without dropping /re-initializing the sequence? Or do you indeed have such a humongous data file with so many rows that the sequence overruns from scratch?

If your table is not empty, then run:

select max(id) from fake_requests;

Then, run:

select fake_requests_seq.nextval;

... and decide by yourself how many numbers the sequence has left for you ...

Come to think of it: it also depends on how many ???..gz files you have. I think I remember that the COPY command would run several parallel sessions, one per file. And your sequence cache is at 250,000. Each thread that is instantiated for a *gz. file will begin the sequence at another multiple of 250,000.

Good luck -

marco

Navin_C · October 2014

Hi David,

Its a clear case of your sequence running over the maximum limit for the cluster.
You can refer to these posts here to have more understanding on using sequences in Vertica.

Using Sequences

Hope this helps.
NC

David_Bernstein · October 2014

Thanks for your help / insights. It makes sense what you're saying, however, what I am seeing doesn't quite agree.

I am dropping and creating the table and the sequence on each attempt.
I am using the default sequence cache setting of 250K. However, I see the same error even when using a sequence cache setting of 10K.
There are 5 files with 100K rows each for a total of 500K rows.
Even trying to load just one 100K row file causes this error.

If I do a \ds in vsql, I get this output, so no indication that we're even close to the Maximum value for the sequence.

dbadmin=> \ds fake_requests_seq

List of Sequences

------------+-------------------+--------------+-------------+---------+---------------------+------------+---------

 benchmarks | fake_requests_seq |        10000 |           1 |       1 | 9223372036854775807 | f          |

If I look at nextval after the error, also not even close.

dbadmin=> select fake_requests_seq.nextval;

nextval

---------

4346

I can provide copies of the code I am using to reproduce this if you're interested.

David

David_Bernstein · October 2014

Thanks for your response. I found your two articles helpful, but am still not sure where I am going wrong.

Given that I am using the default of 250000 for the sequence cache setting, and that I am loading only a modest quantity of data&colon; 500,000 rows using 5 x 100,000 row files (on a single node cluster), even if one batch of sequence numbers is allocated per file per host, it doesn't make sense that I am bumping up against the maximum.

So, two questions:

1) Am I misunderstanding the calculation of sequence number allocation?
2) What the expected use case or best practice should be with regard to use of sequences and COPY.

Thanks much
David

Joseph_Leiba · December 2014

I would like to mention that we are also seeing the same error during our COPY statements, but sporadically. We too are loading a modest amount of data, and we have queried the sequence max value following the error and determined that was only at 250,000 or 500,000.

id10t · December 2014

Hi!

To everyone who see this error please check Global Sequence value in catalog.
If it is too big, run:

select clear_all_sequence_cache();

PS
(Im not sure for solution, but it helped to me)

I did a clear install, created a new db and found in my catalog:

%> pwd
/vertica/dev/v_dev_node0001_catalog/Catalog
%> ls
Checkpoints  config.cat  globalsequence.cat  tiered_catalog.cat  Txnlogs
%> cat globalsequence.cat 
:GlobalSequence
nodeID:45035996273704960
tagLength:12
fence:52
nodeIDMask:18442240474082181120
ceiling:4503599627370495
currentSequenceValue:19405

After clear (clear_all_sequences_cache):

%> cat sequencegenerator.cat 
:GlobalSequence
nodeID:45035996273704960
tagLength:12
fence:52
nodeIDMask:18442240474082181120
ceiling:4503599627370495
currentSequenceValue:295
.

And no more fails with error: Sequence exceeded max value

[Deleted User] · December 2014

There is a known bug in recent versions of Vertica that could be affecting you. (Ask your friendly support rep about VER-35620.)

If you turn off parallelism, it will probably go away.

[Deleted User] · December 2014

Hi David,
Thanks for reporting the problem. The bug is resolved in 7.1.1-3.
Here is a link to the release notes:
http://my.vertica.com/docs/ReleaseNotes/7.1.x/HP_Vertica_7.1.x_Release_Notes.htm

- Shivani

ozi_samur · December 2014

I have same error as well. How can I install the version 7.1.1-4 ? I could not find it in the myVertica/Downloads?

Thanks for help.

id10t · December 2014

Hi!

Look, even not all drivers available to CE users and you want a last fix/patch?

)

Forget about it, its for EE customers only or wait until it will be available to CE users (at least about a half year delay between last CE and EE versions). If you are CE user, so keep going to get an error. CE users not important for HP.

norbert_krupa · December 2014

You can reach out to support to try and get the latest patch. While in the past there has been a larger gap between the EE & CE availability, CE users can download 7.1.1 (that's a 2 month gap, not 6 months).

id10t · December 2014

Hi!

that's a 2 month gap, not 6 months

Oh, no, its only two month! Im ready for this sort of trade off - every time when Vertica founds a bug and posts a fix I will wait for fix at least 2 months.

PS
2 months or half year... doesn't matter... project is active now.
I don't think that customer/user will be happy with "wait 2 month" (and still its not 100%, may be a close to half year too).

norbert_krupa · December 2014

Well, that's certainly your opinion. I know of customers that have restrictions on updating freshly released patches. As I mentioned earlier, customers can reach out to support to try to get the latest patch.

The same version is available for download for both CE & EE users at my.vertica.com.

id10t · December 2014

)))

that have restrictions on updating freshly released patches

Not in this case, user explicitly asks for patch.

As I mentioned earlier, customers can reach out to support to try to get the latest patch.

And 99% will get an answer (precedents have been) - use in what you can download. I can cite some employees like Adam Seering and others.

And really, I can't understand it - if I can ask for patch from Support and friendly support will give me a patch, so why do not put in public? Why I need to ask? That Support will feel self-importance?

You are not so smart.

id10t · December 2014

Norbert Krupa
Im fighting for community in all ways(and don't care about my image) - that community will get a full support, last drivers, last patches, will not be ignored, etc. And what you do? Are you part of community? Or you are a spy?

By defending a poor support you are actually against a community.

ozi_samur · February 2015

How can I "turn off" parallelism just for a single session?

We're Moving!

Create My New Community Account Now

COPY statement generates "ERROR 4704: Sequence exceeded max value" message

Comments

Leave a Comment