Projection design considerations for loading a high volume of data in a cluster

edited July 2017 in General Discussion

We have a system that loads a high volume of data every minute using Kafka and Vertica.
What are the best practices and important considerations for designing projections in this case?

Comments

  • Hi!

    From my experience:

    • proper denormalization (not every denormalization is good)
    • manually created projections that give your queries the required performance (use the Database Designer (DBD) for encoding only)

    P.S.
    Sorry, there is no "silver bullet" (IMHO).
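
To make the "manually created projections" point concrete, here is a minimal sketch that assembles a `CREATE PROJECTION` statement with an explicit sort order and segmentation. The table `events`, its columns, and the helper `build_projection_ddl` are hypothetical and only for illustration; the right sort and segmentation columns depend on your own queries and data.

```python
def build_projection_ddl(table, columns, order_by, segment_by, name=None):
    """Return a Vertica CREATE PROJECTION statement as a string.

    All table and column names passed in are hypothetical examples;
    nothing here connects to a database.
    """
    # Sort and segmentation columns must be part of the projection.
    unknown = [c for c in list(order_by) + [segment_by] if c not in columns]
    if unknown:
        raise ValueError(f"columns not in projection: {unknown}")

    # Default name: <table>_by_<sort columns>, e.g. events_by_device_id_event_time
    name = name or f"{table}_by_{'_'.join(order_by)}"
    col_list = ", ".join(columns)
    return (
        f"CREATE PROJECTION {name} ({col_list})\n"
        f"AS SELECT {col_list} FROM {table}\n"
        f"ORDER BY {', '.join(order_by)}\n"
        f"SEGMENTED BY HASH({segment_by}) ALL NODES;"
    )


# Assumed example: queries filter on device_id and scan a time range,
# so device_id leads the sort order and is also the segmentation key.
ddl = build_projection_ddl(
    table="events",
    columns=["event_time", "device_id", "metric", "value"],
    order_by=["device_id", "event_time"],
    segment_by="device_id",
)
print(ddl)
```

The sort order here follows the assumed query predicates, and segmenting by hash on one column spreads rows across all nodes. The generated statement would still have to be run against the database, and column encodings could then be taken from a DBD run as suggested above.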

  • edited July 2017

    The general rule is to stand up your cluster, load a good amount of data, and then run the DBD on the entire dataset along with a representative set of queries. You can then edit the DDL as you see fit or just accept the entire recommendation. Projection design is an iterative process; it's something you'll want to revisit over time as more data, more users, and new queries are added to your database.
