HI Team, am new to vertica and would like to understand how DBD works. when do we need to run the dbd. i.e 1) before loading the data or after loading the data. do we need to re run the DBD to improve query performance?
The HP Vertica Database Designer is a UI-based tool that:
Analyzes your logical schema, sample data, and, optionally, your sample queries,creates a physical schema design (a set of projections) that can be deployed automatically (or manually). It can be used by anyone without specialized database knowledge (even business users can run Database Designer). It can be run and re-run anytime for additional optimization without stopping the database.
You can launch the Database Designer by selecting Configuration Menu -> Run Database Designer from the Admintools menu.
You can use Database Designer to create a comprehensive design, which allows you to create new projections for all tables in your database. You can also use Database Designer to create a query-specific design, which creates projections for all tables referenced in the queries you supply.
We van create a comprehensive design for a new database after you have loaded representative data into it. You can use the comprehensive design process to redesign a database when necessary (for example, after you have made significant changes to the database's schemas). Database Designer creates a complete initial or replacement physical schema design based on data statistics and queries. It creates segmented superprojections for large tables when deploying to multiple node clusters, and creates replicated superprojections for smaller tables
If you have new queries that you want to optimize, you can create an enhanced design with additional projections that are tuned for those queries. The query-specific design that you create in this procedure is optimized to balance query performance and compression for the provided query.
Hello renumetukuru , The Database Designer provided by Vertica helps you to administer vertica database cluster. I will not go deep into the administration part and directly come to your question It is recommended by Vertica that you should have at least 10GB of data in your database to get a optimized design by DBD after running. This means, You first need to load a good amount of data and then run the DBD, after which DBD will give you a .sql file containing optimized design of database. Its upto you, if you want to deploy the design directly or manually tweak some changes in the .sql and then deploy. The new design by DBD is created only after analyzing your data inside database, and queries you provide during dbd run, thus created projection ddl for you which is optimized according to your queries, you then deploy these new projections and your queries use them to run faster. So only after you load data, run dbd. Even if you rerun the DBD it will again give you a new design which can be/ cannot be more optimized then the previous design. Hope this helps
Thank you Naveen. So if am not running DBD, the performance is not good..right? The reason am having is i have a database where 2,3 tables with 100k rows. when i ran same query in mysql it returns fast compared to vertica.. BTW i havent touched DBD so far.. i just logical schema and loaded data through jdbc and running queries...
Once you load data into Vertica tables, Vertica automatically create super-projections on the table. The super-projection contains all the columns of the actual table to answer your query.
E.g. If your table name is "ITEMS" and below sql query will list you all the projections available on the table.
SELECT * FROM PROJECTIONS WHERE projection_schema='Schema_Name' AND anchor_table_name = 'ITEMS';
-The Output of the above sql query will have a column 'is_super_projection' . Its value is either 'true' or 'false'. true- means its a superprojection and false- means its not a super projection.
Comments
E.g. If your table name is "ITEMS" and below sql query will list you all the projections available on the table.
SELECT * FROM PROJECTIONS WHERE projection_schema='Schema_Name' AND anchor_table_name = 'ITEMS';
-The Output of the above sql query will have a column 'is_super_projection' . Its value is either 'true' or 'false'. true- means its a superprojection and false- means its not a super projection.
Hope it helps
Regards,
Kushal Sen Laskar