Options

Performance question for a query calculating maximum people in a building

Hello everybody,

I have a query that is pretty slow and was trying to figure out how to compute the data I want.

I have a table that lists every visits of people into a building:
table visits(date_in timestamp, date_out timestamp, name varchar)

I want for each day of a month the maximum number of people into the building.
My request below:

SELECT date_slice, max(people) as measure FROM (
   SELECT TIME_SLICE(date, 86400) as date_slice, SUM(val) OVER (ORDER BY date) AS people FROM
      (  WITH subvisits AS (SELECT * from visits where date_in<'2013-12-01' and date_out > '2013-11-01')
         SELECT date_in as date, 1 as val FROM subvisits 
         UNION ALL SELECT date_out as date, -1 FROM subvisits
      ) AS in_out
   ) AS sub
GROUP BY date_slice ORDER BY date_slice;

First the subtable subvisits selects relevant data (the current month), then in_out computes a list of date with 1 for enter, -1 for exit. This data is then aggregated with the sum over date to compute the number of people inside the building. At the end, the max number for each day is computed.

The issue is that when the visits table is big, request takes very long to execute.

Is there a simpler or faster way to compute this data ?

Comments

  • Options
    Prasanta_PalPrasanta_Pal - Select Field - Employee
    Do the explain of the query
    EXPLAIN <query>;
    How big is the table visits?
  • Options
    ------------------------------
    QUERY PLAN DESCRIPTION:
    ------------------------------
    explain SELECT date_slice, max(people) as measure FROM ( SELECT TIME_SLICE(date, 86400) as date_slice, SUM(val) OVER (ORDER BY date) AS people FROM (WITH subvisits AS (SELECT * from visits where date_in<'2013-07-01' and date_out > '2013-06-01')SELECT date_in as date, 1 as val FROM subvisits UNION ALL SELECT date_out as date, -1 FROM subvisits ) AS in_out ) AS sub GROUP BY date_slice ORDER BY date_slice; Access Path:
    +-GROUPBY HASH (SORT OUTPUT) (LOCAL RESEGMENT GROUPS) [Cost: 852K, Rows: 3] (PATH ID: 2)
    | Aggregates: max(sub.people)
    | Group By: sub.date_slice
    | +---> ANALYTICAL [Cost: 838K, Rows: 34M] (PATH ID: 4)
    | | Analytic Group
    | | Functions: sum()
    | | Group Sort: in_out.date ASC NULLS LAST
    | | +---> UNION ALL [Cost: 137K, Rows: 34M] (PATH ID: 6)
    | | | +---> STORAGE ACCESS for visits [Cost: 60K, Rows: 17M] (PATH ID: 8)
    | | | | Projection: public.visits_DBD_11_rep_design_node0001
    | | | | Materialize: visits.date_in
    | | | | Filter: (visits.date_out > '2013-06-01 00:00:00+02'::timestamptz)
    | | | | Filter: (visits.date_in < '2013-07-01 00:00:00+02'::timestamptz)
    | | | +---> STORAGE ACCESS for visits [Cost: 76K, Rows: 17M] (PATH ID: 10)
    | | | | Projection: public.visits_DBD_11_rep_design_node0001
    | | | | Materialize: visits.date_out
    | | | | Filter: (visits.date_out > '2013-06-01 00:00:00+02'::timestamptz)
    | | | | Filter: (visits.date_in < '2013-07-01 00:00:00+02'::timestamptz)


    select count(*) from visits;   count 
    ----------
    16947206
    (1 row)

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file