
Projection design considerations for loading high volumes of data in a cluster

edited July 2017 in General Discussion

We have a system that loads a high volume of data every minute using Kafka and Vertica.
What are the best practices and important considerations for designing projections in this case?


  • Hi!

    From my experience:

    • Proper denormalization (not every denormalization is a good one).
    • Manually created projections that give your queries the performance they need (use the Database Designer, DBD, only for encoding recommendations).

    Sorry, there is no "silver bullet" (IMHO).
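
    To make the "manually created projection" point concrete, here is a minimal sketch of what such a projection might look like. The table `sensor_events` and its columns are hypothetical and not taken from the original question; the sort order, encoding, and segmentation choices are assumptions that would need to be checked against your own queries and data.

    ```sql
    -- Hypothetical table fed by the Kafka load; names are illustrative only.
    CREATE TABLE sensor_events (
        device_id   INT NOT NULL,
        event_time  TIMESTAMP NOT NULL,
        metric      VARCHAR(32),
        reading     FLOAT
    );

    -- Hand-written projection, assuming most queries filter on device_id
    -- and then on a time range.
    CREATE PROJECTION sensor_events_by_device (
        device_id ENCODING RLE,  -- sorted first, so values repeat in long runs
        event_time,
        metric,
        reading
    ) AS
    SELECT device_id, event_time, metric, reading
    FROM sensor_events
    ORDER BY device_id, event_time            -- sort order matches the assumed predicates
    SEGMENTED BY HASH(device_id) ALL NODES    -- spread rows across the cluster
    KSAFE 1;

    -- Populate the new projection if the table already holds data.
    SELECT REFRESH('sensor_events');
    ```

    Every additional projection on a table has to be written on each load, so with data arriving every minute it is probably worth keeping the number of projections per table small.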

  • edited July 2017

    The general rule is to stand up your cluster, load a good amount of data, and then run the DBD on the entire dataset along with a representative set of queries. You can then edit the DDL as you see fit or just accept the entire recommendation. Projection design is an iterative process; it's something you'll want to re-visit over time as more data, more users, and new queries are added to your database.
