The Vertica Forum is getting a makeover! The site will be undergoing maintenance from Tuesday 8/13 to Friday 8/16. We appreciate your patience and cooperation during this time.

Warning 3208 Threshold

When running an ETL job we are getting a warning such as : vsql:copy.sql:61: WARNING 3028: Delete performance of projection "Name Removed" is a concern. The squared number of rows scanned for each delete is recorded to be 109155 on average Is there any way to change this threshold so we dont get this white noise? Or is there a larger issue that i need to be aware of?

Comments

  • Hi! I think you have a lot of delete vectors. ---
    To eliminate rows that have been deleted from the result, a query must do extra processing. If 10% or more of the total rows in a table have been deleted, the performance of a query on the table degrades. However, your experience may vary depending on the size of the table, the table definition, and the query. If a table has a large number of deleted rows, consider purging those rows to improve performance. For more information on purging, see Purging Deleted Data.
  • Resolved this issue by increasing the following setting SELECT SET_CONFIG_PARAMETER('SlowDeleteSystemWarningLimit', 4000000);
  • Keep in mind that having a lot of deleted records causes a variety of major performance issues. For example, if you are getting this warning, if you have a node crash and you have to go through node recovery, you are unlikely to be able to go through recovery in a reasonable amount of time; you may be stuck throwing out data or restoring from a backup in order to get the cluster back up quickly. We strongly recommend that you try to understand why you have too many deleted records and that you reduce the number of delete vectors, rather than bumping the warning limit. For what it's worth, we recommend trying to implement your ETL processes such that you can use DROP PARTITION (or other techniques; for example, loading into a temporary table and using INSERT ... SELECT with a predicate to copy only the desired rows into the production table) rather than DELETE. Doing so generally leads to both improved ETL performance and improved query performance. You can also look at tuning tuple-mover parameters. (The Tuple Mover is the part of Vertica that cleans up delete vectors and actually removes the data in question from disk.)
  • Awesome thanks for the reply.
  • My understanding of this warning is that it's not a function of too many deleted records / lots of delete vectors, but rather that the delete process for the projection is inefficient. A projection that is well-optimized for deletes has adequate columns in the ORDER BY with RLE, and a high cardinality column at the end of the ORDER BY to make each set of distinct values in the ORDER BY relatively unique. I wouldn't blindly raise that config parameter to 4000000. Better to fix the projections. A high value there not only puts your cluster at risk of slower recovery as Adam pointed out, but it also results in more work during the replay delete phase of the tuple mover mergeouts. There's a section in the docs on optimizing for DELETE and UPDATE. --Sharon

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file

Can't find what you're looking for? Search the Vertica Documentation, Knowledge Base, or Blog for more information.