Same Table and Projection Name

ersaurabhexersaurabhex Registered User

I was able to create same super projection with same name of table name. I also created custom projection on the same table with appropriate select columns and sorting on them.
This project was running for two weeks, where I tend to load the date into table using an ETL process and later query that table for our reporting use.

Recently after load completion, I found that query: select * from table_name is returning result but select specific_columns from table is not showing any result. Using explain, I found select * from table is pointing to super projection (which has same name as of table name) and select specific_columns from table is pointing to custom projection.

Below is suspecting reason for this behavior -
1. Vertica optimizer was not able to identify during insert statement that it's inserting to a table or directly to a projection and it did direct insertion into the super projection (which has same name as of table name)
2. Here, table is left blank and all data is with projection only.
3. When I executed select * from table_name, it eventually fired as select * from projection_name, because both "names" are same.
4. When I fired select specific_column from table, optimizer tries to identify the data from custom projection (which was blank because table was not loaded (instead projection was loaded in #3)) no record returned.
5. If my suspicion is true, why did vertica (Version 9.1) allowed me to create same projection name as of table name and if it did how can optimizer distinguish between the two while DDL operations.

Finally I dropped the custom projection, rename the super projection, reload the table and everything starts working without any issue.

Regards,
Saurabh

Comments

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited August 29

    Can you send example code? I'm not able to create a projection that has the same name as a table in the same schema...

    dbadmin=> SELECT version();
                  version
    ------------------------------------
     Vertica Analytic Database v9.1.1-0
    (1 row)
    
    dbadmin=> CREATE TABLE test_table (c1 INT, c2 INT);
    CREATE TABLE
    
    dbadmin=> CREATE PROJECTION test_table AS SELECT * FROM test_table;
    ROLLBACK 4213:  Object "test_table" already exists
    
    dbadmin=> CREATE SCHEMA some_schema;
    CREATE SCHEMA
    
    dbadmin=> CREATE PROJECTION some_schema.test_table AS SELECT * FROM test_table;
    ROLLBACK 8139:  Projection must be created in the same schema as its anchor table
    

    Are you sure your not referring to a table with the same name but in different schemas?

    dbadmin=> INSERT INTO test_table SELECT 1, 2;
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> SELECT projection_schema, projection_name FROM projections WHERE anchor_table_name = 'test_table';
     projection_schema | projection_name
    -------------------+------------------
     public            | test_table_super
    (1 row)
    
    dbadmin=> CREATE TABLE test_table AS SELECT * FROM test_table;
    ROLLBACK 4213:  Object "test_table" already exists
    
    dbadmin=> CREATE TABLE some_schema.test_table AS SELECT * FROM test_table;
    CREATE TABLE
    
    dbadmin=> SELECT projection_schema, projection_name FROM projections WHERE anchor_table_name = 'test_table';
     projection_schema | projection_name
    -------------------+------------------
     public            | test_table_super
     some_schema       | test_table_super
    (2 rows)
    
  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited August 29

    @ersaurabhex - Maybe you mean the projection's "base name" is the same as a table?

    Example:

    dbadmin=> CREATE TABLE dt1(c1 INT, c2 INT);
    CREATE TABLE
    
    dbadmin=> CREATE PROJECTION dt1 AS SELECT * FROM dt1 ORDER BY c1 SEGMENTED BY HASH(c1) ALL NODES;
    CREATE PROJECTION
    
    dbadmin=> SELECT projection_schema, projection_name, projection_basename, anchor_table_name FROM projections WHERE anchor_table_name = 'dt1';
     projection_schema | projection_name | projection_basename | anchor_table_name
    -------------------+-----------------+---------------------+-------------------
     public            | dt1_b0          | dt1                 | dt1
     public            | dt1_b1          | dt1                 | dt1
    (2 rows)
    

    Note that the "projection name " is NOT the same as the table, but the "base name" is :)

    Check out the doc page "Projection Naming" here:

    https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/AdministratorsGuide/Projections/ProjectionNaming.htm

  • ersaurabhexersaurabhex Registered User

    Hello Jim.. Thanks, it looks like the second case because we have k=1 with 3 node cluster.

    What will happen when an insert command (with table name) will be used. Will it get into projection or table?

    Please also create custom projection with ksafe = 0 and try to insert the records to reproduce the scenario we faced.

    Thanks in Advance.
    Regards,
    Saurabh

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited August 30

    Hi,

    Technically data isn't stored in a table (Logical Schema). It is stored in the table's projections (Physical Schema) :)

    See:
    https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/ConceptsGuide/Components/LogicalSchema.htm

    https://my.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/ConceptsGuide/Components/PhysicalSchema.htm

    But when you use an INSERT DML command you can only specify a table name, not a projection or view name. Vertica will handle under the hood the population of the projections of that table.

    Example:

    dbadmin=> SELECT designed_fault_tolerance, current_fault_tolerance FROM system;
     designed_fault_tolerance | current_fault_tolerance
    --------------------------+-------------------------
                            0 |                       0
    (1 row)
    
    dbadmin=> CREATE TABLE test (c1 INT) ORDER BY c1;
    CREATE TABLE
    
    dbadmin=> SELECT projection_name, verified_fault_tolerance FROM projections WHERE anchor_table_name = 'test';
     projection_name | verified_fault_tolerance
    -----------------+--------------------------
     test_super      |                        0
    (1 row)
    
    dbadmin=> INSERT INTO test SELECT 1;
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> INSERT INTO test_super SELECT 1;
    ERROR 4876:  Table "test_super" does not exist
    
    dbadmin=> CREATE VIEW test_vw AS SELECT * FROM test;
    CREATE VIEW
    
    dbadmin=> INSERT INTO test_vw SELECT 1;
    ERROR 2058:  INSERT into VIEW is not supported
    
  • ersaurabhexersaurabhex Registered User

    Thanks a lot Jim. This completely makes sense.
    Please also help me know the reason of possibility of below scenarios

    1. In what cases, custom projections won't have data but super projection do.
    2. In what cases, refresh() won't work for custom projections because I tried doing refresh many times but custom projection didn't get the data.
    3. Finally I dropped those custom projection, recreated them and refresh() then they started showing data. How can I find out the problem happening with #2.

    Regards,
    Saurabh

  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert
    edited August 31

    For scenario #1:

    A custom user created projection can be a super projection if it contains all of the columns of the anchor table. You can have 1 or more super projections per table.

    Example:

    dbadmin=> CREATE TABLE test_table (c1 INT, c2 INT) SEGMENTED BY HASH(c1) ALL NODES;
    CREATE TABLE
    
    dbadmin=> CREATE PROJECTION test_table_pr AS SELECT c2, c1 FROM test_table SEGMENTED BY HASH(c2) ALL NODES;
    CREATE PROJECTION
    
    dbadmin=> SELECT projection_name, projection_basename, is_super_projection FROM projections WHERE anchor_table_name = 'test_table';
     projection_name  | projection_basename | is_super_projection
    ------------------+---------------------+---------------------
     test_table_b0    | test_table          | t
     test_table_b1    | test_table          | t
     test_table_pr_b0 | test_table_pr       | t
     test_table_pr_b1 | test_table_pr       | t
    (4 rows)
    

    I user created projection created after the anchor table already has data will not have data, even if's super projection, until its refreshed.

    Example:

    dbadmin=> DROP TABLE test_table CASCADE;
    DROP TABLE
    
    dbadmin=> CREATE TABLE test_table (c1 INT, c2 INT) SEGMENTED BY HASH(c1) ALL NODES;
    CREATE TABLE
    
    dbadmin=> INSERT INTO test_table SELECT 1, 2;
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> CREATE PROJECTION test_table_pr AS SELECT c2 FROM test_table SEGMENTED BY HASH(c2) ALL NODES;
    WARNING 4468:  Projection <public.test_table_pr_b0> is not available for query processing. Execute the select start_refresh() function to copy data into this projection.
              The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
    WARNING 4468:  Projection <public.test_table_pr_b1> is not available for query processing. Execute the select start_refresh() function to copy data into this projection.
              The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
    CREATE PROJECTION
    
    dbadmin=> SELECT projection_name, projection_basename, is_super_projection, is_up_to_date FROM projections WHERE anchor_table_name = 'test_table';
     projection_name  | projection_basename | is_super_projection | is_up_to_date
    ------------------+---------------------+---------------------+---------------
     test_table_b0    | test_table          | t                   | t
     test_table_b1    | test_table          | t                   | t
     test_table_pr_b0 | test_table_pr       | f                   | f
     test_table_pr_b1 | test_table_pr       | f                   | f
    (4 rows)
    
  • Jim_KnicelyJim_Knicely Administrator, Moderator, Employee, Registered User, VerticaExpert

    For #2: Currently, you cannot refresh a projection when a node is down.

    Example:

    dbadmin=> SELECT node_name, node_state, node_address FROM nodes;
         node_name      | node_state | node_address
    --------------------+------------+---------------
     v_test_db_node0001 | UP         | 192.168.2.200
     v_test_db_node0002 | UP         | 192.168.2.201
     v_test_db_node0003 | UP         | 192.168.2.202
    (3 rows)
    
    dbadmin=> \! admintools -t kill_host -s 192.168.2.202
    *** Terminating vertica and performing host cleanup ***
            Terminating vertica processes on host '192.168.2.202'
    All signals sent successfully.
    
    dbadmin=> SELECT node_name, node_state, node_address FROM nodes;
         node_name      | node_state | node_address
    --------------------+------------+---------------
     v_test_db_node0001 | UP         | 192.168.2.200
     v_test_db_node0002 | UP         | 192.168.2.201
     v_test_db_node0003 | DOWN       | 192.168.2.202
    (3 rows)
    
    dbadmin=> SELECT refresh('test_table');
                                                                                                                                                                                                                                        refresh                                                                                                                                                                                               
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Refresh completed with the following outcomes:
    Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
    ----------------------------------------------------------------------------------------
    "public"."test_table_pr_b1": [test_table] [failed: Unable to refresh projections -- some nodes are not up] [] [1] [0]
    "public"."test_table_pr_b0": [test_table] [failed: Unable to refresh projections -- some nodes are not up] [] [1] [0]
    
    (1 row)
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file