Options

Tips for cleaning up flex table

Hi,

 

I have an hybrid flex table which contains event data. Each event has a type a date/timestamp (materialized) and some properties (flex).

 

Currently the table is partitioned as date_trunc('month',date), and the projection is sorted on type and date to make sure event types can be retrieved efficiently.

 

The table had ±1 billion rows. We just loaded an extra event type (another 1 billion rows) from our source system (ElasticSearch) in the same table, but somehow during the loading it failed due a time out on ES side, and I don't have specific offset (e.g. a day) that I can use. Now I have to clean this mess up by fully removing this specific event, but I was wondering how I could do this efficiently (ideally without the need of repopulating the other events from scratch).

 

Deleting is a no-go, and I cannot duplicate the table and re-insert without that specific event because inserts are not supported with flex tables. I was thinking of repartitioning the table based on event type and then dropping the partition of that specific event, and then switching back to the original partitioning per month.

 

Does anyone maybe know another good strategy to do this? All suggestions are welcome!

 

Thanks in advance!

 

Best,

Derek

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file