Same Table and Projection Name
I was able to create same super projection with same name of table name. I also created custom projection on the same table with appropriate select columns and sorting on them.
This project was running for two weeks, where I tend to load the date into table using an ETL process and later query that table for our reporting use.
Recently after load completion, I found that query: select * from table_name is returning result but select specific_columns from table is not showing any result. Using explain, I found select * from table is pointing to super projection (which has same name as of table name) and select specific_columns from table is pointing to custom projection.
Below is suspecting reason for this behavior -
1. Vertica optimizer was not able to identify during insert statement that it's inserting to a table or directly to a projection and it did direct insertion into the super projection (which has same name as of table name)
2. Here, table is left blank and all data is with projection only.
3. When I executed select * from table_name, it eventually fired as select * from projection_name, because both "names" are same.
4. When I fired select specific_column from table, optimizer tries to identify the data from custom projection (which was blank because table was not loaded (instead projection was loaded in #3)) no record returned.
5. If my suspicion is true, why did vertica (Version 9.1) allowed me to create same projection name as of table name and if it did how can optimizer distinguish between the two while DDL operations.
Finally I dropped the custom projection, rename the super projection, reload the table and everything starts working without any issue.
Regards,
Saurabh
Comments
Can you send example code? I'm not able to create a projection that has the same name as a table in the same schema...
Are you sure your not referring to a table with the same name but in different schemas?
@ersaurabhex - Maybe you mean the projection's "base name" is the same as a table?
Example:
Note that the "projection name " is NOT the same as the table, but the "base name" is
Check out the doc page "Projection Naming" here:
https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/AdministratorsGuide/Projections/ProjectionNaming.htm
Hello Jim.. Thanks, it looks like the second case because we have k=1 with 3 node cluster.
What will happen when an insert command (with table name) will be used. Will it get into projection or table?
Please also create custom projection with ksafe = 0 and try to insert the records to reproduce the scenario we faced.
Thanks in Advance.
Regards,
Saurabh
Hi,
Technically data isn't stored in a table (Logical Schema). It is stored in the table's projections (Physical Schema)
See:
https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/ConceptsGuide/Components/LogicalSchema.htm
https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/ConceptsGuide/Components/PhysicalSchema.htm
But when you use an INSERT DML command you can only specify a table name, not a projection or view name. Vertica will handle under the hood the population of the projections of that table.
Example:
Thanks a lot Jim. This completely makes sense.
Please also help me know the reason of possibility of below scenarios
Regards,
Saurabh
For scenario #1:
A custom user created projection can be a super projection if it contains all of the columns of the anchor table. You can have 1 or more super projections per table.
Example:
I user created projection created after the anchor table already has data will not have data, even if's super projection, until its refreshed.
Example:
For #2: Currently, you cannot refresh a projection when a node is down.
Example: