Cannot use meta function or non-deterministic function

shashank_pshashank_p Vertica Customer

Hello,
I'm trying to create table using a select query which has ROW_NUMBER function used to remove duplicate data to load into the new table. However I'm getting below error while table creation-
SQL Error [5781] [0A000]: [Vertica]VJDBC ERROR: Cannot use meta function or non-deterministic function in SEGMENTED BY expression
The issue is occurring only when I'm trying to create the table using the select query however the standalone select query is giving the output, something strange.
Can you guys guide me what I'm missing here? I didn't find much information regarding this error, also there is no timestamp column as well.

Tagged:

Best Answer

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    Answer ✓

    @shashank_p - Can you run the following SQL commands and provide the output so that we can see the data types in the table:

    SELECT table_schema, table_name, column_name, data_type, data_type_length FROM columns WHERE table_name = 'abc' ORDER BY ordinal_position;
    SELECT table_schema, table_name, column_name, data_type, data_type_length FROM view_columns WHERE table_name = 'abc' ORDER BY ordinal_position;
    

    Also, for fun, can you try:

    CREATE TABLE Table1 AS
    SELECT *
    FROM
    (
    SELECT *,
    ROW_NUMBER() OVER ( PARTITION BY REGEXP_REPLACE(LOWERB(TRIM(col1)), '[^a-z0-9 ]', '') || ' ' ||
    REGEXP_REPLACE(LOWER(TRIM(col2)), '[^a-z0-9 ]', '')||col3 ORDER BY col4 DESC) AS rank
    FROM abc
    )A WHERE rank = 1;
    

    I changed the LOWER function to use LOWERB. LOWER is non-deterministic.

    Or try this, it's a better way of doing what you are trying to do...

    CREATE TABLE Table1 AS
    SELECT col1, col2, col3, col4, 1 rank
    FROM abc
    LIMIT 1 OVER(PARTITION BY REGEXP_REPLACE(LOWERB(TRIM(col1)), '[^a-z0-9 ]', '') || ' ' || REGEXP_REPLACE(LOWER(TRIM(col2)), '[^a-z0-9 ]', '')||col3 ORDER BY col4 DESC);
    

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited January 2021

    The error simply means that you can't use function in the segmentation clause is non-deterministic (i.e. the function may return different results each time it is called with a specific set of input values even if the database state that they access remains the same).

    Example:

    dbadmin=> CREATE TABLE no_way (c INT) SEGMENTED BY  HASH(RANDOMINT(10)) ALL NODES;
    ERROR 5781:  Cannot use meta function or non-deterministic function in SEGMENTED BY expression
    
    

    Can you post the CREATE TABLE statement?

  • shashank_pshashank_p Vertica Customer
    edited January 2021

    That is surprising, same function can return different output for same data. In that case why select statement is giving same results?
    CREATE TABLE Table1 AS
    SELECT *
    FROM
    (
    SELECT *,
    ROW_NUMBER() OVER ( PARTITION BY REGEXP_REPLACE(LOWER(TRIM(col1)), '[^a-z0-9 ]', '') || ' ' ||
    REGEXP_REPLACE(LOWER(TRIM(col2)), '[^a-z0-9 ]', '')||col3 ORDER BY col4 DESC) AS rank
    FROM abc
    )A WHERE rank = 1

    Two things to be noted here,
    1. Without create statement, select is working completely fine and same output count always.
    2. If I apply limit to above create statement then this create also works!! for example, if select is returning 100 records then if I apply limit 200 while create statement then this works.

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    Can you post the DDL for the abc table?

    SELECT export_objects('', 'abc');

  • shashank_pshashank_p Vertica Customer
    edited January 2021

    SQL Error [5297] [0A000]: [Vertica]VJDBC ERROR: Unsupported use of LIMIT/OFFSET clause
    Getting error while running the statement for export objects, I'm using dbeaver.

  • shashank_pshashank_p Vertica Customer

    Do you need column data types which are involved in above operation or anything in specific?

  • Sudhakar_BSudhakar_B Vertica Customer ✭✭

    @shashank_p ,
    What is the version of Vertica?
    I ran the CTAS you provided with sample data, and it works fine for me. Second time your error is different.
    Please note in my case the table and underlying projections are automatically created using defaults.
    I am running on VMWare version single node Vertica 9.3.1 CE.
    Trying to see what is different in your case?

    [dbadmin@localhost ~]$ vsql
    Welcome to vsql, the Vertica Analytic Database interactive terminal.
    
    Type:  \h or \? for help with vsql commands
           \g or terminate with semicolon to execute query
           \q to quit
    
    dbadmin=> select version(), sysdate;
                  version               |          sysdate           
    ------------------------------------+----------------------------
     Vertica Analytic Database v9.3.1-0 | 2021-01-30 08:53:19.050386
    (1 row)
    
    dbadmin=> drop table if exists test;
    NOTICE 4185:  Nothing was dropped
    DROP TABLE
    dbadmin=> CREATE TABLE test AS
    dbadmin-> SELECT *
    dbadmin-> FROM
    dbadmin-> (
    dbadmin(> SELECT *,
    dbadmin(> ROW_NUMBER() OVER ( PARTITION BY REGEXP_REPLACE(LOWER(TRIM(col1)), '[^a-z0-9 ]', '') || ' ' ||
    dbadmin(> REGEXP_REPLACE(LOWER(TRIM(col2)), '[^a-z0-9 ]', '')||col3 ORDER BY col4 DESC) AS rank
    dbadmin(> FROM (
    dbadmin(> select 'abcd' col1, 'werweww23' col2, 'wewiej2938u23' col3, 23 col4 from dual
    dbadmin(> union
    dbadmin(> select 'wv45' col1, 'erteb' col2, '7897ghj' col3, 2 col4 from dual
    dbadmin(> union
    dbadmin(> select 'abcd' col1, 'werweww23' col2, 'wewiej2938u23' col3, 23 col4 from dual
    dbadmin(> ) abc
    dbadmin(> )A WHERE rank = 1;
    CREATE TABLE
    dbadmin=> select export_objects('','test');
    
    export_objects                         
    ----------------------------------------------------------------------------------------
    CREATE TABLE public.test
    (
        col1 varchar(4),
        col2 varchar(9),
        col3 varchar(13),
        col4 int,
        rank int
    );
    
    
    CREATE PROJECTION public.test_super /*+basename(test),createtype(A)*/ 
    (
     col1,
     col2,
     col3,
     col4,
     rank
    )
    AS
     SELECT test.col1,
            test.col2,
            test.col3,
            test.col4,
            test.rank
     FROM public.test
     ORDER BY test.col1,
              test.col2,
              test.col3,
              test.col4,
              test.rank
    SEGMENTED BY hash(test.col1, test.col4, test.rank, test.col2, test.col3) ALL NODES OFFSET 0;
    
    
    SELECT MARK_DESIGN_KSAFE(0);
    
    (1 row)
    
    dbadmin=> select * from test;
     col1 |   col2    |     col3      | col4 | rank 
    ------+-----------+---------------+------+------
     abcd | werweww23 | wewiej2938u23 |   23 |    1
     wv45 | erteb     | 7897ghj       |    2 |    1
    (2 rows)
    
    dbadmin=> \q
    [dbadmin@localhost ~]$ 
    
  • shashank_pshashank_p Vertica Customer

    I'm using Vertica 8.0 version.

    The second error which I posted is for export_objects statement which @Jim_Knicely asked me to run. I can see you got output if you try to export objects for the particular table you created however I'm getting error for my table which I created.
    SQL Error [5297] [0A000]: [Vertica]VJDBC ERROR: Unsupported use of LIMIT/OFFSET clause - (Error when I run export_objects statement)

    It is not able to create default projection, I'm not sure why this is happening. All columns in the table are having VARCHAR as the data type.

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    @shashank_p - Can you log into the DB using vsql? Can you run this meta-command to descibe the table:

    \d abc

    This works for me on Vertica 8.0:

    dbadmin=> SELECT version();
                  version
    ------------------------------------
     Vertica Analytic Database v8.0.0-0
    (1 row)
    
    dbadmin=> CREATE TABLE jim_abc (col1 VARCHAR, col2 VARCHAR, col3 VARCHAR, col4 VARCHAR);
    CREATE TABLE
    
    dbadmin=> INSERT INTO jim_abc SELECT 'A', sysdate, 'A', 'A';
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> INSERT INTO jim_abc SELECT * FROM jim_abc;
     OUTPUT
    --------
          1
    (1 row)
    
    dbadmin=> CREATE TABLE jim_Table1 AS
    dbadmin-> SELECT *
    dbadmin-> FROM
    dbadmin-> (
    dbadmin(> SELECT *,
    dbadmin(> ROW_NUMBER() OVER ( PARTITION BY REGEXP_REPLACE(LOWER(TRIM(col1)), '[^a-z0-9 ]', '') || ' ' ||
    dbadmin(> REGEXP_REPLACE(LOWER(TRIM(col2)), '[^a-z0-9 ]', '')||col3 ORDER BY col4 DESC) AS rank
    dbadmin(> FROM jim_abc
    dbadmin(> )A WHERE rank = 1;
    CREATE TABLE
    
    dbadmin=> SELECT * FROM jim_Table1;
     col1 |            col2            | col3 | col4 | rank
    ------+----------------------------+------+------+------
     A    | 2021-01-30 13:21:14.891517 | A    | A    |    1
    (1 row)
    
  • shashank_pshashank_p Vertica Customer
    Unfortunately I don’t have access to log into database directly using vsql. The only option I have is to use dbeaver due to security constraints.
  • shashank_pshashank_p Vertica Customer
    This issue is really strange, if it was data issue then it should fail always but somehow it fails conditionally. But I can definitely say it is not able to create default projections only for this particular case.
  • Sudhakar_BSudhakar_B Vertica Customer ✭✭

    @shashank_p ,
    You can install VSQL windows or LINUX client on you local machine and connect to DB using VSQL. client on you laptop (client) will use the same credentials for VSQL or for dbeaver (or any other client tools like Squirrel, DBVisualizer etc.). No security concern there!

    @Jim_Knicely is requesting VSQL to eliminate additional dependencies like dbeaver. Some of these tools have their own issues.
    You can download VSQL client and install on you local m/c.

    Having said that, the table (relation) abc in your original post, is that a Table or a view?
    The definition of abc is important in this case!
    Best Regards

  • shashank_pshashank_p Vertica Customer
    edited February 2021

    > Thank you, your workaround worked for me!

    >

    > 1. For the existing query, by changing LOWER to LOWERB I was able to create the table.

    > Can you please explain a bit more the practical difference between LOWER and LOWERB? Is it recommended to use LOWERB always?

    >

    > 2. The modified query which you gave (saying better way of doing it) worked with LOWER function itself and this way is really cool, we don't need to bother about the extra rank column and specifying column names to eliminate rank column it in subsequent steps.

    >

    > 3. Output of Query which you asked, it is a table so attaching the result. I have synced the column names which we use in the query.

    1. Do you still want me to check with vsql client and post the result?
  • shashank_pshashank_p Vertica Customer

    'abc' here is a table.

    Attaching output of query which you asked for.

    You workaround worked! Thank you @Jim_Knicely :smile:

    I used LOWERB instead of LOWER and same query worked.

    Can you please explain the practical difference between LOWERB and LOWER? Is it recommended to use LOWERB always?

    The modified query which you gave (better way of doing as you said) worked with both LOWER & LOWERB. This modified query is so cool, I don't need to bother about the rank column.

    @Sudhakar_B & @Jim_Knicely , Now do you want me to still check with VSQL client and post you the output with original query? I'm trying to setup this client in my system.

  • Sudhakar_BSudhakar_B Vertica Customer ✭✭

    Glad this worked @shashank_p .
    Very interesting though!
    CTAS target table structure (and segmentation columns) is dependent on immutability of functions used in "underlying select".
    Especially since Rank was the last column in select and would not even have been part of segmentation columns (with default 8 columns).
    Just curious,
    What would happen if @shashank_p created the table first and then do an insert?

    Create table Table1 as select * from abc where 1=2;
    Insert into Table1 select * from abc limit 1 over...
    

    @Jim_Knicely ,
    Was this a something on 8.0 and later got fixed? OR is it data dependent? I couldn't recreate the error in 9.3.x

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited February 2021

    @shashank_p - LOWERB returns a character string with each ASCII character converted to lowercase. Multi-byte characters are skipped and not converted. Do you have Multi-byte characters in your data?

    https://www.vertica.com/docs/latest/HTML/Content/Authoring/SQLReferenceManual/Functions/String/LOWERB.htm

    If I were you, I'd use the LIMIT Analytic Function. It's typically much faster than trying to use RANK.

    https://forum.vertica.com/discussion/239345/vertica-quick-tip-the-limit-analytic-function

    I'd also consider upgrading Vertica! Version 10.1 will be out very soon (i.e. most likely today)!

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited February 2021

    @Sudhakar_B - The only function in @shashank_p's CTAS statement that could be the problem is LOWER.

    Example:

    dbadmin=> CREATE TABLE no_way (c VARCHAR) SEGMENTED BY HASH(LOWER(c)) ALL NODES;
    ERROR 5781:  Cannot use meta function or non-deterministic function in SEGMENTED BY expression
    
    

    But I have no idea why the auto projection seems to have been trying to include the LOWER function in the segmentation clause.

    I can not replicate the issue even on a Vertica 8.0 database.

    @shashank_p - If you wouldn't mind, can you post the DDL for the projection that gets created after you the version of the CTAS using LOWERB?

    SELECT export_objects('', 'Table1');

    Curious how the HASH segmentation is defined...

  • shashank_pshashank_p Vertica Customer

    @Jim_Knicely I'm not able to run export object query in DBeaver. Does that statement run only in vsql client console?

    Coming to the multi-byte, I'm not sure whether my data has multi-byte characters but as per logical understanding of the data it should not have multi-byte characters. (Since col1 and col2 are just names)

    Upgrading vertica is something which admin team is taking care, hope it will be done soon.

    @Sudhakar_B , I have also tried to create table first and then insert the data but faced same issue that's why I have approached you guys.

  • Sudhakar_BSudhakar_B Vertica Customer ✭✭

    @Jim_Knicely
    I was testing this out for a demonstrable use case for the following:

    @Jim_Knicely said:
    If I were you, I'd use the LIMIT Analytic Function. It's typically much faster than trying to use RANK.

    However, I am finding both scenario using identical explain plan. Repeated profiling of the both versions of the queries show very similar timing.
    Just wondering for which use-case the LIMIT OVER would be faster that RANK. Please see below.
    With RANK

    explain  select * from (SELECT *,
      ROW_NUMBER() OVER ( PARTITION BY REGEXP_REPLACE(LOWER(TRIM(col1)), '[^a-z0-9 ]', '') || ' ' ||
      REGEXP_REPLACE(LOWER(TRIM(col2)), '[^a-z0-9 ]', '')||col3 ORDER BY col4 DESC) AS rank
      FROM (
     select 'abcd' col1, 'werweww23' col2, 'wewiej2938u23' col3, 23 col4 from dual
      union
      select 'wv45' col1, 'erteb' col2, '7897ghj' col3, 2 col4 from dual
      union
      select 'abcd' col1, 'werweww23' col2, 'wewiej2938u23' col3, 23 col4 from dual
      ) abc
      ) A WHERE rank = 1;
    
     Access Path:
     Filter: (A.rank = 1)
     +-ANALYTICAL [Cost: 32, Rows: 3] (PATH ID: 8)
     |  Analytic Group
     |   Functions: row_number()
     |   Group Local Resegment: (((regexp_replace(lower(btrim(abc.col1)), '[^a-z0-9 ]', '', 1, 0, '') || ' ') || regexp_replace(lower(btrim(abc.col2)), '[^a-z0-9 ]', '', 1, 0, '')) || abc.col3)
     |   Group Sort: (((regexp_replace(lower(btrim(abc.col1)), '[^a-z0-9 ]', '', 1, 0, '') || ' ') || regexp_replace(lower(btrim(abc.col2)), '[^a-z0-9 ]', '', 1, 0, '')) || abc.col3) ASC, abc.col4 DESC NULLS FIRST
     |   TopK Optimized: K=1, PB Cols = 1
     |  Output Only: 1 tuples
     | +---> GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 31, Rows: 3] (PATH ID: 10)
     | |      Group By: "*SELECT* 1".col1, "*SELECT* 1".col2, "*SELECT* 1".col3, "*SELECT* 1".col4
     | | +---> UNION [Cost: 30, Rows: 3] (PATH ID: 11)
     | | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 13)
     | | | |      Projection: v_catalog.dual_p
     | | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 15)
     | | | |      Projection: v_catalog.dual_p
     | | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 17)
     | | | |      Projection: v_catalog.dual_p
    

    With LIMIT OVER

    explain  select * from (SELECT *
       FROM (
     select 'abcd' col1, 'werweww23' col2, 'wewiej2938u23' col3, 23 col4 from dual
      union
      select 'wv45' col1, 'erteb' col2, '7897ghj' col3, 2 col4 from dual
      union
      select 'abcd' col1, 'werweww23' col2, 'wewiej2938u23' col3, 23 col4 from dual
      ) abc
      ) A 
      limit 1 over ( PARTITION BY REGEXP_REPLACE(LOWER(TRIM(col1)), '[^a-z0-9 ]', '') || ' ' ||
      REGEXP_REPLACE(LOWER(TRIM(col2)), '[^a-z0-9 ]', '')||col3 ORDER BY col4 DESC) ;
    
     Access Path:
     +-SELECT  LIMIT 1 [Cost: 32, Rows: 1] (PATH ID: 6)
     | +---> ANALYTICAL [Cost: 32, Rows: 3] (PATH ID: 7)
     | |      Analytic Group
     | |       Functions: row_number()
     | |       Group Local Resegment: (((regexp_replace(lower(btrim(abc.col1)), '[^a-z0-9 ]', '', 1, 0, '') || ' ') || regexp_replace(lower(btrim(abc.col2)), '[^a-z0-9 ]', '', 1, 0, '')) || abc.col3)
     | |       Group Sort: (((regexp_replace(lower(btrim(abc.col1)), '[^a-z0-9 ]', '', 1, 0, '') || ' ') || regexp_replace(lower(btrim(abc.col2)), '[^a-z0-9 ]', '', 1, 0, '')) || abc.col3) ASC, abc.col4 DESC NULLS F
    IRST
     | |       TopK Optimized: K=1, PB Cols = 1
     | |      Output Only: 1 tuples
     | | +---> GROUPBY HASH (LOCAL RESEGMENT GROUPS) [Cost: 31, Rows: 3] (PATH ID: 9)
     | | |      Group By: "*SELECT* 1".col1, "*SELECT* 1".col2, "*SELECT* 1".col3, "*SELECT* 1".col4
     | | | +---> UNION [Cost: 30, Rows: 3] (PATH ID: 10)
     | | | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 12)
     | | | | |      Projection: v_catalog.dual_p
     | | | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 14)
     | | | | |      Projection: v_catalog.dual_p
     | | | | +---> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 16)
     | | | | |      Projection: v_catalog.dual_p
    
  • shashank_pshashank_p Vertica Customer
    @Sudhakar_B Adding to your point, @Jim_Knicely I tried using limit function however the query was not fast enough in-fact either it took same time as rank or in some cases more time.

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file