execution time of updates and projections
Hallo, I am Katrin and developer for 24/7 media. During the last few weeks, we had a few problems with starting and stopping the database. Your support helped us. Thank's a lot. After solving the issues, your support recommended: "Delete concerns – Even I am not much aware of it but below was told to me while we had recovery taking longer. 8/29/13 12:36 AM Shaun Carney: I think I may have been mistaken - sort order is identical but still delete concerns - that means delete/update statements are using a column in the WHERE clause that isn't in the sort order of either projection then" So I changed the projection for one of our tables, which is updated very often. But now, the update on this table takes more than 3 times longer as before. What can we do to get a better update performance? Here are the details: We have two tables: pts_raw_data_click_detail and pts_raw_data_delivery_detail. Their counts are: pts_raw_data_click_detail: overall count: 47,393,844 daily count: ca. 150,000 pts_raw_data_delivery_detail: overall count: 28,637,325,768 daily count: ca. 80,000,000 Every hour we update a column in pts_raw_data_delivery_detail with counts from pts_raw_data_click_detail. I will attach the ddl, explain of the update statement and the former projections, with which the update performed better. Please find a solution. Thank you. Best regards Katrin
0
Comments
Recovery is the action required for a cluster to restore K-safety after a crash. Large numbers of deleted records can degrade the performance of a recovery. To improve recovery performance,we nee to purge the deleted rows.
In HP Vertica, delete operations do not remove rows from physical storage. Unlike most databases, the DELETE command in HP Vertica marks rows as deletedso that they remain available to historical queries. These deleted rows are called historical data. Retention of historical data also applies to the UPDATE command, which is actually a combined DELETE and INSERT operation.
The cost of retaining deleted data in physical storage can be measured in terms of:
----------------------------------------------------------------------------------------------------------------
-Disk space for the deleted rows and delete markers-A performance penalty for reading and skipping over deleted data.
=> So after every heavy delete & update(update is delete + insert in vertica), it s recommended to purge the data also.
1.
To check for a high number of delete vectors you can use the following query:----------------------------------------------------------------------------
select node_name,projection_name,storage_oid,count(*) from delete_vectors group by node_name,projection_name,storage_oid order by 4 desc limit 20;
2.
Advanced the AHM and purged the tables with hundreds of delete vector containers:
select make_ahm_now(); --or select make_ahm_now(true); -- this allows the ahm to advance when a node is down
select purge_table('schemaname.tablename'); ... (and other tables)
=>You can optimize youe delete/update statments by checking this documetation link below:
https://my.vertica.com/docs/6.1.x/HTML/index.htm#12675.htm