When I insert my data to vertica
I have one question
I want to insert my data that is too long length
but I don't know it's real length
so,
a lot of data use just varchart(50) so
i made the space varchart(60)
but some word need more space. vertica return error but don't tell me the space...
how i catch my data's max length...
0
Comments
Hi choisika,
Using COPY will get you much better performance than doing bulk INSERTs. So I recommend you use COPY instead of INSERT. Using COPY will also allow you to get information about the rejected data.
For example, this COPY statement will write out the reason data is rejected (known as Exceptions) as well as the rejected data to two separate files:
This example loads a CSV file, but it can be modified for any other supported Vertica data source.
You could also load your data into Vertica with a column that is large enough to accept all of your data, and then use the MAX() function to figure out the maximum string length. Once you have the maximum string length you can create a new table or column and add the data, dropping the old column.
Let me know if this helps solve your problem.
-C
Ok , first you need to cleanup your data.
Do you need this empty space to be inserted into the database ?
No i don`t need to is just the file comes like this !
Ok fix this :
- remove all spaces(multiple space) and add only one.
See the lenght of your columns before inserting into the database:
- you can get only one columns and sort it by max val first :
- you can do this for all columns , this script can be incremented to fit all columns and order them, but you will have to do this .
For more info can you put in your copy command , so some of the experts will be able to help you
thx - hope this was helpful
Sorry, my real problem is another
"Too many ROS containers exist for the following projections" its my real problem
thank you for your kindness "Casey_s" and "Adrian_Oprea_1"
Have a nice day!!!
Are you using serial inserts ?
This will trigger to many ROS containers to come up .
Try to use the COPY command to insert data into your table.
Also try to run merge on the table with the ROS issue:
Find the tables with most ROS containers :
Hi Adrian_Oprea_1
Thank you!!
you are very very kind!!!!
I have one more question,,,
I carried my data to vertica use pentaho,,
then.. If I use mergeout code, where i write?
In pentaho there is an option See this link for more in detail http://wiki.pentaho.com/display/EAI/Vertica+Bulk+Loader
Also see --
Best Practices for Loading and Updating Data
To avoid creating too many or too few ROS containers, follow these recommendations:
• Load and delete data less frequently in larger batches.
• If you are working with many small files of raw data, use a single COPY statement to load these files at one time. Use the wildcard support to COPY from all files, or concatenate all files to a named pipe and have COPY load from the pipe.
• Use a single DELETE/UPDATE statement to delete or update batches of rows. When possible, delete or update batches of rows at a time to avoid creating too many delete vector ROS containers. There are many possible reasons as to why a load is not performing to expectations. Both Pentaho and HP Vertica provide a number of documents and best practices that can help you troubleshoot data load issues.
Also see Vertica Docs on TM management:
https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/AdministratorsGuide/TupleMover/TuningTheTupleMover.htm
-- this will help you understand better how yor TM is doing and what needs to be done.