We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Median in the moving window — Vertica Forum

Median in the moving window

Hello,

 

I recently started to use Vetrica and it was great so far! I was curious if there is a way to calculate MEDIAN value in the moving window. For example, if I want to compute 20 day average volume, it can be done as:

 

AVG(volume) OVER (PARTITION BY symbol ORDER BY date ROWS BETWEEN 19 PRECEDING AND CURRENT ROW) as adv20

...is there a way to compute 20 day median volume instead? From documentation it looks like that it is not possible: 'MEDIAN() does not allow the window_order_clause or window_frame_clause'., but was wondering if there is any workaround.

 

Thanks much in advance,

Konstantin

Comments

  • You are correct that median() won't work. 

     

    From the documentation:

    MEDIAN() does not allow the window-order-clause or window-frame-clause.

     

    You'd have to code something manually.  Statistically speaking, I'm not sure a rolling median would differ significantly from a rolling average, especially if you're only looking at 20 rows.  But then, I'm not a statistician.

     

  • Thanks for your response. If you think from trading perspective, 1 heavy day can easily skew your average, while median would be still in line with reality. Not sure, if it would be possible for Vertica to start supporting it, as I feel that it is a really useful feature (actually from the statistical perspective).

  • Ralf_SRalf_S Community Edition User

    Hello -
    just picking up on that,
    nth_value is a good idea but ordering the values in the window frame seems not be possible.
    I guess it comes down to writing a function, or similar. There is a postgres equivalent in this post
    I tried to translate into vertica syntax but not luck so far.
    Anyone would have advice on how to implement a rolling median?
    Thanks.

  • VValdarVValdar Vertica Employee Employee
    edited November 2022

    Hi Ralf_S,

    You can do the trick involving a complementary self-join.

    This is far from perfect I agree but still working:

      select t1.symbol, t1.date, t1.volume
           , approximate_median(t2.volume)
        from MyTable as t1
        join MyTable as t2  on t2.symbol  = t1.symbol
                           and t2.date   >= t1.date - interval '20 days'
                           and t2.date   <= t1.date
    group by t1.symbol, t1.date, t1.volume
    order by 1, 2;
    
    
      select distinct
             t1.symbol, t1.date, t1.volume
           , median(t2.volume) over(partition by t1.symbol, t1.date)
        from MyTable as t1
        join MyTable as t2  on t2.symbol  = t1.symbol
                           and t2.date   >= t1.date - interval '20 days'
                           and t2.date   <= t1.date
    order by 1, 2;
    

    Beware of the volume generated by the join.

  • Ralf_SRalf_S Community Edition User

    Hello VValdar,

    Thanks for the response - that does work for me.
    Initially, I thought solutions like self-join would blow up the data/computation too much, but it works fine with the data I have.
    Thank you!

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file