Vertica Sizing recommendation with SSD drives

I am looking for recommendations on a new production cluster that we need to configure based on the following parameters.
1) Total raw data volume approx. 8 TB
2) Extremely fast reporting performance for our dashboards and queries, which can be very complex
3) High availability

We are thinking of using SSD drives to get very high I/O throughput. Here are my questions:
A) What is the recommended hardware? How many nodes?
B) How many SSD drives per node, and of what size?
C) Do SSD drives require a RAID 10 or RAID 5 setup?
D) In your testing, how do SSD drives compare with conventional hard drives? How do the drive failure rates compare?
E) Do you have any performance benchmarks using SSD drives?
F) Is there any other information we should consider?
G) What kind of SSD drive (make, model, etc.) do you recommend?
H) How much data growth can the recommended hardware accommodate?
I) Our application currently uses Vertica version 7.2.3-22. Would moving to Vertica 8.1.1-X improve performance without changing any application code?

Thanks,

Hemen

Comments

  • edited January 2018

    Hi!

    1. A, C, D: you can find some info here or here.
    2. A: as far as I know, the Vertica team recommends HPE ProLiant DL380 GenX servers (though it would be strange if HPE recommended Dell servers, for example ;) ).
    3. I: pure SQL code should work without any changes. Upgrading is recommended, because support for Vertica 7.2.3 ended in fall 2017. As for performance: in theory, version 8.x should be faster than 7.x. My simple benchmark (based on a TPC benchmark) shows no performance degradation and, in some cases, even small improvements.
