Need some help to evaluate Vertica
We have our operational data in SQL Server, and we need to build an analytical platform on top of that SQL database. We think Vertica would be a good option, but we need to evaluate it first. I am pretty new to Vertica, so any help would be great.
I am not an expert, but I have done data warehousing work with DB2 and Vertica, and Vertica performs superbly for DWH workloads. I can give you some steps, but it is better to also follow the replies from other experts and to contact the Vertica pre-sales team.
In your case, you can follow the steps below:
(1) Create a database in Vertica.
(2) Check which tables are used for reporting.
(3) Copy the relevant data for those tables from the SQL Server database to the Vertica database.
(4) Identify which SQL queries are generated for reporting (or analysis).
(5) Feed them to Vertica's Database Designer (DBD), which will tell you which projections need to be created to optimize those queries.
Note: projections are the actual physical storage in Vertica; a table is only a logical entity.
(6) Check the performance of the Vertica platform and compare it with SQL Server.
(7) Please keep in mind that Vertica is built specifically for analytics, so there should be very few updates/deletes, and data should arrive in a few insert/load cycles rather than many small ones.
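For step (6), a small timing harness can make the comparison repeatable. The sketch below is only an illustration: it accepts any Python DB-API connection and reports the median run time per query. In a real evaluation you would pass in connections to SQL Server and Vertica (for example via drivers such as pyodbc and vertica-python, which is an assumption about your tooling). Here two in-memory SQLite databases stand in for the two systems, and the `sales` table and query are hypothetical.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, runs=5):
    """Run `sql` `runs` times on a DB-API connection; return the median seconds."""
    timings = []
    for _ in range(runs):
        cur = conn.cursor()
        start = time.perf_counter()
        cur.execute(sql)
        cur.fetchall()  # fetch everything so lazily streamed results are timed too
        timings.append(time.perf_counter() - start)
        cur.close()
    return statistics.median(timings)


def compare(connections, queries, runs=5):
    """Return {query_name: {system_name: median_seconds}}."""
    return {
        name: {
            system: time_query(conn, sql, runs)
            for system, conn in connections.items()
        }
        for name, sql in queries.items()
    }


def _demo_connection():
    # Stand-in only: an in-memory SQLite database with a hypothetical table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("west", 5.0), ("east", 2.5)],
    )
    return conn


connections = {"system_a": _demo_connection(), "system_b": _demo_connection()}
queries = {"sales_by_region": "SELECT region, SUM(amount) FROM sales GROUP BY region"}

results = compare(connections, queries, runs=3)
for name, per_system in results.items():
    print(name, per_system)
```

Using the median rather than a single run reduces the effect of cache warm-up and one-off spikes; for a fair test, run the same reporting queries from step (4) against the same data volume on both systems.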
DWH applications usually follow a particular schema design; the star schema is one of the most popular. I assume that you are familiar with it. There are additional steps to optimize queries; for example, fact tables should be segmented across the nodes, while dimension tables should be replicated on every node.
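To make the segmented versus replicated idea concrete, here is a minimal sketch that builds the two kinds of projection statements as strings. The table and column names (`sales_fact`, `sale_id`, `date_dim`) and the `_seg`/`_rep` suffixes are hypothetical, and in practice the Database Designer would normally generate projections for you, so treat this as an illustration of the shape of the DDL and check it against the documentation for your Vertica version.

```python
def segmented_projection(table, key):
    """DDL sketch: spread a fact table's rows across all nodes by hashing `key`."""
    return (
        f"CREATE PROJECTION {table}_seg AS "
        f"SELECT * FROM {table} "
        f"SEGMENTED BY HASH({key}) ALL NODES;"
    )


def replicated_projection(table):
    """DDL sketch: keep a full copy of a dimension table on every node."""
    return (
        f"CREATE PROJECTION {table}_rep AS "
        f"SELECT * FROM {table} "
        f"UNSEGMENTED ALL NODES;"
    )


# Hypothetical star-schema tables, for illustration only.
print(segmented_projection("sales_fact", "sale_id"))
print(replicated_projection("date_dim"))
```

The reasoning behind the split: a large fact table is divided so each node scans only its share, while a small dimension table is copied everywhere so joins against it can be done locally on each node.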
Vertica supports MPP (massively parallel processing), which is another key strength: it delivers fast results by dividing the workload across multiple machines (nodes).
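The following toy simulation shows the general idea of dividing an aggregate query across nodes. It is not how Vertica is implemented internally; it is just a sketch under simple assumptions: three nodes, a CRC32 hash for placement, hypothetical fact rows of (sale_id, product_id, amount), and a small product-to-category lookup that every node holds in full.

```python
import zlib
from collections import defaultdict

NODES = 3  # hypothetical cluster size


def node_for(key, nodes=NODES):
    """Pick a node for a row using a deterministic hash of its key."""
    return zlib.crc32(str(key).encode()) % nodes


def segment(rows, key_index, nodes=NODES):
    """Split fact rows into one partition per node."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[node_for(row[key_index], nodes)].append(row)
    return parts


def local_aggregate(fact_rows, dim):
    """Per-node work: join to the replicated dimension and sum amounts per category."""
    totals = defaultdict(float)
    for _sale_id, product_id, amount in fact_rows:
        totals[dim[product_id]] += amount
    return totals


def combine(partials):
    """Final step: merge the partial sums returned by each node."""
    final = defaultdict(float)
    for partial in partials:
        for category, amount in partial.items():
            final[category] += amount
    return dict(final)


facts = [(1, "p1", 10.0), (2, "p2", 5.0), (3, "p1", 2.5), (4, "p3", 4.0), (5, "p2", 1.0)]
product_category = {"p1": "books", "p2": "games", "p3": "books"}  # copy held by every node

partitions = segment(facts, key_index=0)
partials = [local_aggregate(part, product_category) for part in partitions]
result = combine(partials)
print(result)
```

Whichever node each row lands on, the combined totals are 16.5 for books and 6.0 for games. Each node only touches its own slice of the fact data, which is why adding nodes can shorten query time.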
When you are evaluating a database for analytics, you should decide up front which factors you will evaluate the product on:
1. Speed
2. Scalability
3. Ease of use
4. Analytical Features
Vertica has some awesome analytic functions, and it was built from the ground up for analytics.
A few differentiating factors for Vertica:
- A pure columnar database
- Shared-nothing architecture
- Massively parallel processing
- Peer-to-peer clustering
- Built-in load balancer
- A massive pool of advanced analytic functions
- New features supporting real-time aggregations
- Support for semi-structured data (key-value) within the database
- Ability to extend the functionality using UDFs (user-defined functions)
- And a lot more
In the end, the evaluation depends on how well you represent your use cases in the database. Try evaluating Vertica with your current requirements (your SELECT queries).
To add: loading data from SQL Server into Vertica is also supported, using some user-defined loads.
You can learn more about Vertica from this post
Hope this helps.
NC