Cannot allocate sufficient memory for COPY statement (536870912 requested, 268435456 permitted)
svijay
Community Edition User
What I have:
Vertica Community Edition on Ubuntu (Single node, 32 GB)
Hello,
I am a Community Edition user of Vertica. I am loading a JSON file like this:
CREATE EXTERNAL TABLE customers(id INT, address VARCHAR, transactions ARRAY[INT]) AS COPY FROM 'cust.json' PARSER fjsonparser();
The table is created successfully and all 6 million rows of data are there.
When I run a simple SELECT COUNT(*) statement, I get the error in the title. This error does not occur on tables with 200,000 or 800,000 rows.
I have also referred to this link regarding the same error:
Even when I increase the memory size of the resource pool that the failing query runs in, I get the same error.
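For reference, the resource pool change I tried was roughly like this (the pool name and size here are placeholders, not my exact settings):

-- Increase the memory reserved for the pool the query runs in (name/value illustrative).
ALTER RESOURCE POOL my_pool MEMORYSIZE '4G';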
Is there anything additional that needs to be ensured in order to have a successful run?
Any help will be appreciated.
Thanks
Answers
@svijay - Interesting. What version of Vertica are you running? See if you can size the array column appropriately. By default, arrays are pretty big.
Example:
I can make it smaller by specifying the max number of expected elements in the array.
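Something like this, reusing your example table (the bound of 10 elements is just an illustration; pick whatever fits your data):

CREATE EXTERNAL TABLE customers (
    id INT,
    address VARCHAR,
    transactions ARRAY[INT, 10]  -- at most 10 elements instead of the large default
) AS COPY FROM 'cust.json' PARSER fjsonparser();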
See: Syntax for Column Definition
BTW: When you create an external table via the CREATE EXTERNAL TABLE command, no data is loaded into Vertica. In fact, with external tables, data is never loaded into Vertica; it is always read from the source, wherever that might be.
Thanks for your response, Jim. These are some helpful tips. I am currently using Vertica_11.0.0-1.
However, none of my fields are of the ARRAY type; that was just an example I had listed.
My fields are of the numeric, VARCHAR, and string types, and I have been specifying the size for each of them.
How would you recommend I load the JSON file into Vertica? I am currently using WinSCP to copy the file into the VM and then running the COPY command.
Looking forward to hearing from you.
Thank you!
Hi,
Instead of an External Table you could use a Columnar (managed) Table, then load the data using the COPY command.
Example:
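For instance, something along these lines (the column list and file path are just placeholders):

-- Create a managed columnar table, then load the JSON file that was copied onto the VM.
CREATE TABLE customers (id INT, address VARCHAR(256));
COPY customers FROM '/home/dbadmin/cust.json' PARSER fjsonparser();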
If you want to skip the WinSCP step, you can install vsql on your local system (i.e., where the JSON data files reside).
For example, I can load a JSON file from my Windows laptop into my Vertica database running in the cloud. But in that case I have to use COPY LOCAL to load the data.
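Roughly like this, run from vsql on the laptop (the Windows path is only an illustration):

-- COPY LOCAL streams the file from the vsql client machine to the server.
COPY customers FROM LOCAL 'C:\data\cust.json' PARSER fjsonparser();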
Hey Jim,
I dropped the external table and created it again as:
But as you had mentioned, when I tried to copy it from the VM, I'm getting an error:
Thanks
Did you have a semi-colon after the CREATE TABLE statement?
Like this:
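For example (the column list here is illustrative; the point is that each statement ends with a semicolon):

CREATE TABLE customers (id INT, address VARCHAR(256));  -- semicolon ends the CREATE statement
COPY customers FROM '/home/dbadmin/cust.json' PARSER fjsonparser();  -- COPY then runs as its own statement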
One more thing: try not to use the data type NUMERIC(38,0) when you can use INT instead, because an INT has a smaller footprint!
Hello, when you get the error, do you also see the following hint?
HINT: This can happen if you have very wide rows, or if you have specified an incorrect record terminator. (Check your escaping -- for example, "'\t'" is not a tab; you must use "E'\t'".) It can also happen if you are reading from a corrupted compressed file.
As suggested in the hint, you might want to double-check your record terminators.
Thanks @Jim_Knicely and @SergeB. The missing semicolon did not turn out to be the issue.
The COPY command also runs now, which it didn't before.
I am using the TPCH SF1 data, and the tables are being created and loaded. However, for the lineitem and orders tables, I get the error in the title of this thread. It does not occur for any of the other 6 tables.
Here is a link for your reference on TPCH data:
https://docs.snowflake.com/en/user-guide/sample-data-tpch.html
Since I have already tried what I described in my first post, is there anything else that needs to be done to remediate this?
Thanks
@svijay - Here are a few suggestions...
Make sure to specify the record terminator that is actually in the files. For example:

COPY CUSTOMERS FROM '/home/dbadmin/customers.json' PARSER fjsonparser(record_terminator=E'\n');

Split the JSON files into smaller files and load those.

Fiddle with the UDLMaxDataBufferSize parameter. The default might be a bit low for your needs. It is recommended to change its value only under support supervision, but you can make the change at the session level so that it does not affect the entire database. Example:
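Something like this, assuming the parameter accepts a session-level override (the 512 MB value simply matches the 536870912 bytes the failing query requested):

-- Raise the UDL data buffer limit for this session only (value is in bytes).
ALTER SESSION SET UDLMaxDataBufferSize = 536870912;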
@svijay Could you share a snippet (first 3-4 records) of your JSON file?