A pre-join projection stores rows of a fact table joined with rows of dimension tables.
Why does the word pre-join projection have "pre" in the word?
Contrary to joining tables during query run time, pre-join projections optimize performance, with inner joins between tables being based on their primary- and foreign-key columns. See <https://my.vertica.com/docs/4.1/HTML/Master/10391.htm>.
The Pre-join projection basically reduces the joining time in any qeury by already joining the tables when loading.
Lets take an example :
Suppose table A and table B have primary-key foreigh key columns.
If we create these tables and execute a inner join on these two tables, then the query has to go with Inner join phase in the query execution, which can easiliy consume 20-30 % of your total query execution time(depending on type of join has join / merge join, sorting).
This query will not be executed on normal projections of these tables A_b0 and B_b0.
Suppose we create a single pre-join projections on table A and B together(single pre-join projection for both tables) then this projection will already join and keep the data intact and ready for querying for all your join queries.
when we execute the same query on these tables again, vertica will select this pre-join projections as the most optimized storage in query plan and this will reduce the 20-30 % of your query execution time, since query just has to go storage and fetch the data and no joining is needed again.
Good for joins and queries.
Effect the Loading time of data to Table A or B, since the joining and population of data in table A/B is done runtime while loading.
Also Pre-join projections comes with lots of restrictions, By default it enables the referential integrety constriant on these tables.
The 7.2 documentation lists prejoin projections as a deprecated feature. You may want to consider this before deciding to use them.