NEXTVAL() function - how to invoke it on a record level by condition?

ViatorViator Community Edition User

I'm using Vertica Analytic Database v9.2.1-20,
and trying to get next sequence value by condition on a record level, with NEXTVAL() function.
here is the example :

CREATE SEQUENCE v_seq START 1;
CREATE TABLE test_seq (id INT, flag int);
INSERT INTO test_seq VALUES (1, 0);
INSERT INTO test_seq VALUES (2, 1);
INSERT INTO test_seq VALUES (3, 1);
INSERT INTO test_seq VALUES (4, 0);
INSERT INTO test_seq VALUES (5, 0);
INSERT INTO test_seq VALUES (6, 1);
INSERT INTO test_seq VALUES (7, 1);
SELECT id, flag,
CASE
WHEN flag = 0 THEN NEXTVAL('v_seq')
ELSE CURRVAL('v_seq')
END as group_id
FROM test_seq
ORDER BY ID;

The expected results :
id flag group_id
1 0 1
2 1 1
3 1 1
4 0 2
5 0 3
6 1 3
7 1 3

The actual results :

id flag group_id
1 0 1
2 1 2
3 1 3
4 0 4
5 0 5
6 1 6
7 1 7
It looks like the NEXTVAL() function is being invoked regardless WHEN condition.
Is there any way to invoke it by condition ?
Thanks!

Tagged:

Answers

  • Jim_KnicelyJim_Knicely - Select Field - Administrator

    I would just do this:

    dbadmin=> SELECT id, flag, conditional_change_event(flag) OVER (ORDER BY id) AS grouping_id FROM test_seq ORDER BY id;
     id | flag | grouping_id
    ----+------+-------------
      1 |    0 |           0
      2 |    1 |           1
      3 |    1 |           1
      4 |    0 |           2
      5 |    0 |           2
      6 |    1 |           3
      7 |    1 |           3
    (7 rows)
    
    
  • marcothesanemarcothesane - Select Field - Administrator

    I think he wants CONDITIONAL_TRUE_EVENT() if I got that right, Jim ...

        WITH
        indata(id,flag,expected_id) AS (
                  SELECT 1,0,1
        UNION ALL SELECT 2,1,1
        UNION ALL SELECT 3,1,1
        UNION ALL SELECT 4,0,2
        UNION ALL SELECT 5,0,3
        UNION ALL SELECT 6,1,3
        UNION ALL SELECT 7,1,3
        )
        SELECT
          *
        , CONDITIONAL_TRUE_EVENT(flag=0)
          OVER(ORDER BY id) AS obtained_id
        FROM indata;
             id | flag | expected_id | obtained_id
        ----+------+-------------+-------------
          1 |    0 |           1 |           1
          2 |    1 |           1 |           1
          3 |    1 |           1 |           1
          4 |    0 |           2 |           2
          5 |    0 |           3 |           3
          6 |    1 |           3 |           3
          7 |    1 |           3 |           3
    
    
  • ViatorViator Community Edition User

    Dear Jim
    Thanks for fast response. The question raised exactly as an alternative way to avoid using methods like **conditional_change_event() **function. The matter is that "conditional_change_event" requires OVER clause , and in our real query with 10B rows the performance slows down significantly. So the question is whether it is possible to use NEXTVAL() function by condition, because it can spare us additional OVER (ORDER BY... )

  • marcothesanemarcothesane - Select Field - Administrator

    Then, try this:

    Just update with seq.nextval the rows WHERE flag=0. Then, explore the table for candidate for a PARTITION BY in the window clause, and update, in a second mass update, using LAST_VALUE(group_id IGNORE NULLS) OVER(PARTITION BY <what you found> ORDER BY id) ...

  • ViatorViator Community Edition User

    Thank you, Marcothesane
    I assume, that the update operation for 10B rows will take even more time than using CONDITIONAL_TRUE_EVENT / CONDITIONAL_CHANGE_EVENT ...

    I just want to understand that the approach with NEXTVAL is applicable in general. This behavior doesn't make sense, why NEXTVAL is invoked, when condition is FALSE ?

  • marcothesanemarcothesane - Select Field - Administrator

    I can' t answer exactly why this works how it works. It must have something to do with how sequences are cached in sessions, and would in a strange way be consistent with the fact that you can' t call seq.currval in a session before you' ve called seq.nextval.
    Looks like you need a workaround using CONDITIONAL_TRUE_EVENT() after all.

    You might have noticed that the sequence progresses differently from node to node in a multi-node cluster. So my suggestion would be consistent with that.

    Find out how the super projection of the table is segmented - a big table will be segmented.

    Run SELECT EXPORT_OBJECTS('','schema_name.tb_name' ,FALSE); and you'll see the SEGMENTED BY HASH() clause.
    Then use that in the PARTITION BY clause for the OLAP function call as I do below. That will parallelise your process. You're slow without the PARTITION BY in the OLAP call because without it, it's forced to run single-threaded. So it's really worth a try. Please let us know if it helped - I'm curious myself.:

      WITH
        indata(id,flag,expected_id) AS (
                  SELECT 1,0,1
        UNION ALL SELECT 2,1,1
        UNION ALL SELECT 3,1,1
        UNION ALL SELECT 4,0,2
        UNION ALL SELECT 5,0,3
        UNION ALL SELECT 6,1,3
        UNION ALL SELECT 7,1,3
        )
        SELECT
          *
        , CONDITIONAL_TRUE_EVENT(flag=0)
          OVER(PARTITION BY HASH(<the segmentation_columns_you_found> ) ORDER BY id) AS obtained_id
        FROM indata;
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file