Improvement to running queries on historic data

Hi, I want to be able to query a table at a point in time. There maybe transactions or reloads that might have happened after my time of interest. I have concluded that the below are my options: 1) I can create a second table (or set of tables) and save the contents of the table(s) on which a snapshot should be taken. This is the only option that serves the purpose however it is not scalable after a certain point. 2) I can run historic queries by manipulating the HISTORY_RETENTION_* parameters. However TRUNCATE gets rid of all the historic data. 3) Database snapshots can be created but it is meant for backups and cannot be queried upon. I request a feature in future version, where a snapshot (or epoch) of a table(s), schema(s) or Database can be taken and historic queries be able to run off of it. Thank you. Zac.

Comments

  • Raul,

    Thank you for the link. What you pointed out is mentioned as 2). ie run historic queries by manipulating the HISTORY_RETENTION_* parameters. The problem however is that a TRUNCATE statement gets rid of all the historic data about a table.

    Zac. 
  • Hi Zac,

    Yes, most versioning systems (database and otherwise) have a mechanism such as TRUNCATE whereby you can delete all data, including history.  This is a commonly-requested feature of such systems; it's typically seen as valuable for performance and data-retention reasons.

    There's generally also a way to remove all data as of the current revision without dropping history.  In Vertica, this can be accomplished with the command "DELETE FROM <table>;".  This command is currently slow on large data sets; that is because it's noting the deletion in the history of the records that you're deleting, which is exactly the functionality that you're looking for.  If your concern is DELETE performance in the case of dropping the whole table, that would I think be a reasonable Idea to ask for, if you would use it.

    Are you requesting that we build a versioning system where it is not possible to drop data ever, under any circumstances?  Would you be satisfied by disallowing TRUNCATE so that your users can't throw away history?

    Adam
  • Adam,

    What I want this as follows:

    If I know that I will need to get data as of a date, I should be able to do it easily from Vertica. Currently this is not easily doable with Vertica. What I am asking is an Oracle snapshot like feature.

    If disallowing Truncate to get rid of historic data is the best way Vertica can implement an 'Oracle snapshot' equivalent feature then that will work.

    Zac.


Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file