We're Moving!

The Vertica Forum is moving to a new OpenText Analytics Database (Vertica) Community.

Join us there to post discussion topics, learn about

product releases, share tips, access the blog, and much more.

Create My New Community Account Now


Time slots to dates — Vertica Forum

Time slots to dates

pestridpestrid Vertica Customer
edited August 2021 in General Discussion

Hi

I am working in the Healthcare sector. We have a Vertica Analytics database containing a lot of data. One thing is Capacity Management. Here we have beds with an ID, Status, a start and stop time for that status.

Now I want to aggregate the time each bed had the different statuses by Bed (ID), Status and Date.
The start and stop is timestamp and I can use TIMESTAMPDIFF to calculate the time between the start and stop time.

But how can I aggregate to see each Bed, Status and the number of seconds (or minutes) . If a Bed was occupied for several Days we can have a start time in the afternoon Day 1 and an end time in the morning several Days later. I want to see how much time belongs to each date in that intervall.

Thanks!
/P

Tagged:

Best Answer

  • Jim_KnicelyJim_Knicely - Select Field - Administrator
    edited September 2021 Answer ✓

    There's always a ton of ways to get to the same result. But, if I understand the initial request in the thread, is the result set below along the lines of what you want?

    verticademos=> SELECT * FROM beds;
     id | status |         start_time         |          end_time
    ----+--------+----------------------------+----------------------------
      1 |      1 | 2021-08-31 15:33:27.815366 | 2021-09-02 15:33:27.815366
      2 |      1 | 2021-08-31 15:34:06.685608 | 2021-08-31 16:34:06.685608
      2 |      2 | 2021-08-31 17:33:27.826684 | 2021-09-05 15:33:27.826684
    (3 rows)
    
    verticademos=> SELECT id, status, start_time, end_time, day,
    verticademos->        CASE
    verticademos->          WHEN LAG(day) OVER (PARTITION BY id, status ORDER BY start_time) IS NULL THEN datediff('second', start_time, day)
    verticademos->          WHEN LEAD(day) OVER (PARTITION BY id, status ORDER BY start_time) IS NULL THEN datediff('second', day, end_time)
    verticademos->          ELSE 86400
    verticademos->        END seconds
    verticademos->   FROM beds
    verticademos->  CROSS JOIN (SELECT day FROM ((SELECT MIN(start_time) tm FROM beds) UNION ALL (SELECT MAX(end_time) tm FROM beds)) foo TIMESERIES day AS '1 DAY' OVER(ORDER BY tm)) foo
    verticademos->  WHERE foo.day BETWEEN start_time AND end_time
    verticademos->  ORDER BY id, status, start_time;
     id | status |         start_time         |          end_time          |         day         | seconds
    ----+--------+----------------------------+----------------------------+---------------------+---------
      1 |      1 | 2021-08-31 15:33:27.815366 | 2021-09-02 15:33:27.815366 | 2021-09-01 00:00:00 |   30393
      1 |      1 | 2021-08-31 15:33:27.815366 | 2021-09-02 15:33:27.815366 | 2021-09-02 00:00:00 |   56007
      2 |      2 | 2021-08-31 17:33:27.826684 | 2021-09-05 15:33:27.826684 | 2021-09-01 00:00:00 |   23193
      2 |      2 | 2021-08-31 17:33:27.826684 | 2021-09-05 15:33:27.826684 | 2021-09-02 00:00:00 |   86400
      2 |      2 | 2021-08-31 17:33:27.826684 | 2021-09-05 15:33:27.826684 | 2021-09-03 00:00:00 |   86400
      2 |      2 | 2021-08-31 17:33:27.826684 | 2021-09-05 15:33:27.826684 | 2021-09-04 00:00:00 |   86400
      2 |      2 | 2021-08-31 17:33:27.826684 | 2021-09-05 15:33:27.826684 | 2021-09-05 00:00:00 |   56007
    (7 rows)
    
    

    Better formatting in image:

Answers

  • Can you post a few sample records of your data, and what you expect to see as an output?

  • pestridpestrid Vertica Customer

    Sure!

    I tried to add the image to my post, but it turned up jus as strange text... Now I think it should be viewable as attached.
    To make the input even easier I could say that the only fields needed to solve my issue is an ID column (location_id), the status and the start_dttm.
    Since we can assume that the start and end times touch each other, the end_dttm can be captured as the LAG of the start_dttm over each location_id.
    The time should always be 86400 seconds per day except for the first day, if a location_id was added during that day. Last row per location_id can be ended with yesterdays date since we will always load data at night and be one day behind. No need for real time.

    The result I want is something like

    Date, Location_ID, Status, Seconds

    Thanks!
    /P

  • pestridpestrid Vertica Customer

    Absolutely amazing! :)

    I had a feeling that a TIMESERIES function would go into play. I never used those in Vertica and it's well over 20 years since I used one in SAS...
    It looks great! I will spend an hour or two to understand the TIMESERIES function and your CASE statement, so that I can reuse without saying like an old collegue, "I don't know why I write this code, but it works".

    Thanks!

  • pestridpestrid Vertica Customer

    Adjustments? What happened to the 31st of August?

    /P

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file