# computing percentiles from binned data

Greetings Vertica experts,
My tables have binned data..
Ex:
Bin_Name | Bin_Count
"56" | 20000
"57" | 23344

Is there inbuilt support to compute percentiles from this? I am looking for something analogous to numpy support around this.. https://stackoverflow.com/questions/22638557/using-numpy-percentile-on-binned-data
GIven the volume of the data, I would prefer these computations be handled by Vertica as opposed to client side processing in python.

Thanks!

Analytics Noob

• - Select Field - Administrator

Hi,

Using the logic provided here:

https://stats.stackexchange.com/questions/65710/derive-percentiles-from-binned-data

Maybe this could help?

```dbadmin=> select * from binned_data;
bin_name | bin_count
----------+-----------
1 |         2
2 |         4
3 |         2
4 |         3
(4 rows)

dbadmin=> select bin_name as percentile_30th from (select bin_name, bin_count, sum(bin_count) over(order by bin_name rows between unbounded preceding and current row) bin_count_rolling, sum(bin_count) over() bin_count_total from binned_data) foo where 0.3*bin_count_total <= bin_count_rolling order by bin_name limit 1;
percentile_30th
-----------------
2
(1 row)
```

```dbadmin=> select * from binned_data;
bin_name | bin_count
----------+-----------
56 |     20000
57 |     23344
(2 rows)

dbadmin=> select bin_name as percentile_30th from (select bin_name, bin_count, sum(bin_count) over(order by bin_name rows between unbounded preceding and current row) bin_count_rolling, sum(bin_count) over() bin_count_total from binned_data) foo where 0.3*bin_count_total <= bin_count_rolling order by bin_name limit 1;
percentile_30th
-----------------
56
(1 row)
```