Do please check if there are any available vertica function can be used for this scenario

AlokTAlokT Community Edition User

Hello,

We are using vertica-9.3.1 and EON Mode.

smartech=> SELECT * FROM mlai_test.sinarsmt_final_results_200611_200829 where uid=384;

uid
D3_decay_month
D1_decay_week
deliveryNoFilter
deliveryFilter
appActModel
sessionModel_activity
sessionModel_duration
nd_wd_pred
nd_we_pred
wd1_wd_pred
wd1_we_pred
md1_wd_pred
md1_we_pred
m2d1_wd_pred
m2d1_we_pred
wd3_wd_pred
wd3_we_pred
md3_wd_pred
md3_we_pred
m2d3_wd_pred
m2d3_we_pred
prp_wd_pred
prp_we_pred
D3_decay_month2
D3_decay_week
D1_decay_month
D1_decay_month2
-----+----------------+---------------+------------------+----------------+-------------+-----------------------+-----------------------+------------+------------+-------------+-------------+-------------+-------------+--------------+--------------+-------------+-------------+-------------+-------------+--------------+--------------+-------------+-------------+-----------------+---------------+----------------+-----------------
384
15
15
15
9
15
15
12
15
15
15
15
15
15
15
15
15
15
15
15
15
15
15
15
15
15
15
15

As per above observation we need to finally create like below which is the desired result
uid
hour
384
15

The output is 15 which has repeated most number of times in the above row

We have written one simple query which uses case statement to calculate that but its really lengthy process and time consuming

SELECT uid, D3_decay_month_0+D1_decay_week_0+deliveryNoFilter_0 as final_0, D3_decay_month_3+D1_decay_week_3+deliveryNoFilter_3 as final_3, D3_decay_month_6+D1_decay_week_6+deliveryNoFilter_6 as final_6, D3_decay_month_9+D1_decay_week_9+deliveryNoFilter_9 as final_9 FROM (SELECT uid, CASE WHEN D3_decay_month=0 THEN 1 ELSE 0 END as D3_decay_month_0, CASE WHEN D3_decay_month=3 THEN 1 ELSE 0 END as D3_decay_month_3, CASE WHEN D3_decay_month=6 THEN 1 ELSE 0 END as D3_decay_month_6, CASE WHEN D3_decay_month=9 THEN 1 ELSE 0 END as D3_decay_month_9, CASE WHEN D1_decay_week=0 THEN 1 ELSE 0 END as D1_decay_week_0, CASE WHEN D1_decay_week=3 THEN 1 ELSE 0 END as D1_decay_week_3, CASE WHEN D1_decay_week=6 THEN 1 ELSE 0 END as D1_decay_week_6, CASE WHEN D1_decay_week=9 THEN 1 ELSE 0 END as D1_decay_week_9, CASE WHEN deliveryNoFilter=0 THEN 1 ELSE 0 END as deliveryNoFilter_0, CASE WHEN deliveryNoFilter=3 THEN 1 ELSE 0 END as deliveryNoFilter_3, CASE WHEN deliveryNoFilter=6 THEN 1 ELSE 0 END as deliveryNoFilter_6, CASE WHEN deliveryNoFilter=9 THEN 1 ELSE 0 END as deliveryNoFilter_9 FROM mlai_test.sinarsmt_final_results_200611_200829) temp;

After this we just need to use greatest.

Do please check with your team if there are any available vertica function can be used for this scenario.

Answers

  • marcothesanemarcothesane - Select Field - Administrator

    The question is not quite clear to me. What I think I understand is that you have a table with an identifier , which you call uid, and then 27 integers.

    The result seems to be the integer value that occurs most often, which is 15 in your case, and you name that hour.

    If that is what you want, then I would pivot the table, and find the number with the highest number of occurrences.

    I do it with 5 of the 27 integers.

    WITH
    -- your input, first 6 columns
    indata(uid,D3_decay_month,D1_decay_week,deliveryNoFilter,deliveryFilter,appActModel) AS (
    SELECT 384,15,15,15,9,15
    )
    -- end of input, real query starts here, replace 
    -- comma below with "WITH" ..
    ,
    -- need a series of integers from 2 to 6
    -- 2 to 25 in your real scenario
    i(i) AS (
              SELECT 2
    UNION ALL SELECT 3
    UNION ALL SELECT 4
    UNION ALL SELECT 5
    UNION ALL SELECT 6
    )
    ,
    pivoted AS (
      SELECT
        uid
      , CASE i 
          WHEN 2 THEN d3_decay_month
          WHEN 3 THEN d1_decay_week
          WHEN 4 THEN deliverynofilter
          WHEN 5 THEN deliveryfilter
          WHEN 6 THEN appactmodel
        END AS val
      FROM indata CROSS JOIN i
    )
    ,
    grp AS (
      SELECT
        uid
      , val
      , COUNT(*) AS occ_count
      FROM pivoted
      GROUP BY
        uid
      , val
    )
    SELECT 
      uid
    , val AS "hour"
    FROM grp
    ORDER BY occ_count DESC
    LIMIT 1;
    -- out  uid | hour 
    -- out -----+------
    -- out  384 |   15
    

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file