Does Vertica Clustering Increase Performance?
Hi, we have a system with high-volume bulk loading. For now we load 10k records per second
and retain the data for 10 days.
In the future we need to scale the system so that it handles 50k records per second
and retains the data for 20 days.
For now we run Vertica on a single node, and beyond 5 or 6 TB even simple queries like count(*) take a long time.
I know clustering is good for high availability, but we need performance optimization in two phases:
bulk loading and query execution.
Does clustering increase performance in bulk loading and query execution?
Comments
Yes, definitely. A cluster would give you better query and load performance than a single-node cluster as the volume of data increases. It is highly recommended that you plan for it.
@skamat
Does Vertica have linear scalability?
Bulk loading can also be done on a standby node so that it does not affect the cluster's query performance.
Vertica scales very well as you add nodes. Keep in mind that you also need to optimize your projections.
@hoseiney,
In our experience, clustering is really helpful when you are targeting 50k rows per second. Just remember to keep the projections well balanced across the cluster, and do run the performance scripts vioperf, vcpuperf and vnetperf.
Connectivity between the nodes of the cluster is very important: during trickle loads and bulk loads, the speed of inter-node communication is a major factor.
Regards,
Raghav Agrawal
Vertica should scale linearly: if you double the data and double the cluster, performance should stay similar. Adding nodes to the same data does not always make the cluster faster, though; it depends on the table size.
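To put rough numbers on that for the volumes in the question, here is a back-of-the-envelope sketch. It assumes each loaded record is one row, that the load rate is sustained around the clock, and that scaling really is linear; the function and variable names are only illustrative.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def retained_rows(rows_per_second: int, retention_days: int) -> int:
    """Rows held at steady state for a sustained load rate and retention window."""
    return rows_per_second * SECONDS_PER_DAY * retention_days


# Figures taken from the question: 10k/s for 10 days now, 50k/s for 20 days later.
current_rows = retained_rows(10_000, 10)
future_rows = retained_rows(50_000, 20)

# Assumption: linear scaling, so keeping the per-node row count equal to
# today's single node needs roughly this many nodes.
nodes_for_same_per_node_volume = math.ceil(future_rows / current_rows)

print(f"current rows retained: {current_rows:,}")
print(f"future rows retained:  {future_rows:,}")
print(f"nodes to keep per-node volume constant: {nodes_for_same_per_node_volume}")
```

So the target is about ten times the data held today. This is only a starting point for sizing; actual numbers depend on row width, compression and projection design.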
However, you said that simple queries like count(*) take a long time. That may be a projection design issue, which you can improve by putting the keys you want to count by (the GROUP BY keys) in the ORDER BY of the projection.
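As a sketch of what such a projection could look like, the small helper below builds a CREATE PROJECTION statement whose ORDER BY starts with the grouping keys. The table name `events`, the projection name and the columns are all hypothetical, since the thread does not show the schema, and the helper itself is not part of any Vertica client library; it only assembles the SQL text.

```python
def build_projection_ddl(table, projection, columns, sort_keys, segment_key):
    """Build a Vertica CREATE PROJECTION statement sorted on the grouping keys."""
    # Every sort or segmentation key must be one of the projected columns.
    missing = [k for k in list(sort_keys) + [segment_key] if k not in columns]
    if missing:
        raise ValueError(f"keys not in projection columns: {missing}")
    return (
        f"CREATE PROJECTION {projection} AS\n"
        f"SELECT {', '.join(columns)}\n"
        f"FROM {table}\n"
        f"ORDER BY {', '.join(sort_keys)}\n"
        f"SEGMENTED BY HASH({segment_key}) ALL NODES;"
    )


# Hypothetical schema: counting events per device, so device_id leads the sort.
ddl = build_projection_ddl(
    table="events",
    projection="events_by_device",
    columns=["device_id", "event_time", "payload"],
    sort_keys=["device_id", "event_time"],
    segment_key="device_id",
)
print(ddl)
```

The segmentation key here is just an example. To keep projections balanced across nodes you would normally pick a column with many distinct values, and a new projection on an existing table has to be refreshed before queries can use it.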