We have a system that loads a high volume of data every minute using Kafka and Vertica.
What are the best practices and important considerations for designing projections in this case?
From my experience:
Sorry, there is no silver bullet (IMHO).
The general rule is to stand up your cluster, load a good amount of data, and then run the Database Designer (DBD) on the entire dataset along with a representative set of queries. You can then edit the generated DDL as you see fit, or simply accept the entire recommendation. Projection design is an iterative process; it's something you'll want to revisit over time as more data, more users, and new queries are added to your database.
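To illustrate the kind of DDL you might end up hand-editing after a DBD run, here is a minimal sketch of a projection for a streaming fact table. The table and column names (`web_events`, `event_time`, `user_id`) are hypothetical; the right sort order, encodings, and segmentation depend entirely on your own data and queries:

```sql
-- Hypothetical fact table continuously loaded from Kafka.
CREATE TABLE web_events (
    event_time TIMESTAMP NOT NULL,
    user_id    INT       NOT NULL,
    event_type VARCHAR(32),
    payload    VARCHAR(1000)
);

-- Projection sketch: sort on the columns your queries filter and
-- group by most often, and segment on a high-cardinality column
-- so rows spread evenly across nodes during parallel loads.
CREATE PROJECTION web_events_p (
    event_time,
    user_id,
    event_type,
    payload
) AS
SELECT event_time, user_id, event_type, payload
FROM web_events
ORDER BY event_time, user_id          -- match common predicates; aids encoding
SEGMENTED BY HASH(user_id) ALL NODES  -- even distribution across the cluster
KSAFE 1;                              -- keep a buddy projection for fault tolerance
```

For a high-frequency load pattern like per-minute Kafka ingestion, also keep the number of projections per table small, since every projection is written on each load and adds to the Tuple Mover's mergeout work.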