ERROR: Too many ROS containers exist for the following projections

MrBudy01 · April 2023

Hi!

For about 1 week I keep getting this error:
SQL Error [5065] [53000]: [Vertica]VJDBC ERROR: Too many ROS containers exist for the following projections:
tr.test_id (limit = 176128, ROS files = 176042, DV files = 20, new files = 86)
[Vertica]VJDBC ERROR: Too many ROS containers exist for the following projections:
tr.test_id (limit = 176128, ROS files = 176042, DV files = 20, new files = 86)
[Vertica]VJDBC ERROR: Too many ROS containers exist for the following projections:
tr.test_id (limit = 176128, ROS files = 176042, DV files = 20, new files = 86) when I try to use the DBeaver option of import from csv or any other form(export from another oracle db and importing in).
I don't know what changed but I followed this post https://forum.vertica.com/discussion/235201/vjdbc-5065-error-too-many-ros-containers-exist-for-the-following-projections and still couldn't fix it. When I import to another table it works just fine.

Any help will be appreciated.

Thank you for your time!

VValdar · April 2023

You don't need access to the nodes to perform a COPY, you can perform a COPY LOCAL.
The best solution probably!

VValdar · April 2023

Hi MrBudy01,

ROS files = 176042 that's way too high.
Maximum by default is 1024 and we advise to keep this number bellow 50.

Seems the inserts are running one by one.

MrBudy01 · April 2023

Hi Valdar,

Sorry if this question sounds dumb but how should I keep them under 50?

On another note how can I delete some of those ROS files for this table?

In DBveaver I don't see the option to insert them one by one or bulk.

Thank you for your answer!

VValdar · April 2023

You have to run mergeout on your tables / projections.

In DBeaver when loading data there is two options:
- Do-commit after row insert
- Use multi-row value insert

This last option requires your database to be at least in 11.1.

Those are not a recommandation, this is by default in my DBeaver interface (and for whatever reason some lines are in french)

My DBeaver is currently 22.3.3

mosheg · April 2023

In addition, try to use one COPY statement to load data from files,
By loading all files from all nodes using only ONE single COPY command on one of the nodes.
And try to increase each load dose.
You can maintain load throughput by increasing each input file size, up to an optimal size (rule of thumb 10-20 GB each).
See: https://docs.vertica.com/12.0.x/en/sql-reference/statements/copy

MrBudy01 · April 2023

Hi! Thank you guys for your suggestions!

@Valdar my version of Vertica is 10.1.1 so I don't think I can use multi-row value insert.
Without that my interface looks the same as yours.
Regarding the mergeout, I used select DO_TM_TASK ('mergeout', 'xyz') on every projection that the tables uses and on the table itself and still it's importing the data very slowly and getting the same error.

@mosheg thank you for the response but I can't use the copy command because I don't have access to the nodes directly, only through dbeaver.

marcothesane · April 2023

Hi MrBudy01

Yes, you can perform multi-row value insert as in specified in DBeaver.
If you perform a multi-row value insert into a table mystuff, DBeaver will create a statement: INSERT INTO mystuff VALUES(?,?,?) (as many question marks as mystuff has columns).
Then, it will read rows into memory, and insert all these rows in one SQL call into Vertica, read the next row set, rinse, repeat.
Vertica has been able to support that since the beginning. I used in Version 4 for the first time. Each and every one of these SQL calls create a new ROS container!
So, use multi-row insert, make the row buffer as big as your DBeaver client and the memory of your laptop allows you to, and, if you can, commit at the end, not every 10,000 rows.

MrBudy01 · April 2023

Hi @marcothesane,

Well it didn't work even with multi-row insert. It's still slow as hell and I'm getting the same error but I'm getting the felling it's about the resources allocated for the nodes. Do you any suggestion verifying this without accessing the node machines, only from the dbeaver interface?

marcothesane · April 2023

First thing that comes to mind is the network. How long does a ping take? Like in here?

marco ~/1/c $ ping zbook
PING marcos-ubuntu (192.168.1.114): 56 data bytes
64 bytes from 192.168.1.114: icmp_seq=0 ttl=64 time=210.856 ms
64 bytes from 192.168.1.114: icmp_seq=1 ttl=64 time=12.268 ms
64 bytes from 192.168.1.114: icmp_seq=2 ttl=64 time=88.663 ms

Each time you issue a call with the insert of a row set, at least the loopback time - that ping can measure - is spent. Then, of course, time is spent to send the data of that row set over the network. Only then will the Vertica server pick up the data, distribute it to the right nodes depending on the segmentation criteria of the table, sort and encode the data, and finally write to disk.

Sometimes, divide and conquer helps. Split the file into manageable chunks, and load each individually.

MrBudy01 · April 2023

@marcothesane this is my ping to the vertica server:
Reply from xyz: bytes=32 time=3ms TTL=62
Reply from xyz: bytes=32 time=3ms TTL=62
Reply from xyz: bytes=32 time=2ms TTL=62
Reply from xyz: bytes=32 time=3ms TTL=62

MrBudy01 · April 2023

@Valdar That worked perfectly!
Thank you!

mosheg · April 2023

Here is an example: COPY my_table_name FROM local 'c:\my_file_path\my_data.csv' PARSER fcsvparser() ABORT ON ERROR;

We're Moving!

Create My New Community Account Now

ERROR: Too many ROS containers exist for the following projections

Best Answer

Answers

Leave a Comment