Pandas bin column values according to index

Question

I currently have a DataFrame that contains the age of the population and the frequency of those ages, for example:

Age is the index of the DataFrame. I would like to do some Pandas magic so that I get a DataFrame a bit like this:

           freq
 (20, 30]   308
 (30, 40]   111
 (40, 50]    85
 (50, 60]    58
 (60, 70]    63
 (70, 80]   101

Thus, the index now consists of age intervals rather than individual ages, and the frequencies are summed accordingly. How can I do this?

python pandas dataframe binning

jerry maks 17 nov. 15 at 15:35

1 answer

Alex Riley · Accepted Answer · 2015-11-17T15:44:48+0000

You can use groupby after using cut to bin the index of the DataFrame. For example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'freq': [2, 3, 5, 7, 11, 13]},
...                   index=[22, 29, 30, 31, 25, 42])

>>> df
    freq
22     2
29     3
30     5
31     7
25    11
42    13

Then:

>>> df.groupby(pd.cut(df.index, np.arange(20, 60, 10))).sum()
          freq
(20, 30]    21
(30, 40]     7
(40, 50]    13

np.arange(20, 60, 10) defines the bin edges (20, 30, 40, 50) to be used; you can adjust them according to the min / max values of the index (the ages).
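
If you would rather not hard-code the edges, one possible way is to derive them from the index itself. The following is only a sketch: it assumes integer ages and a bin width of 10, and the names width, lo, hi and bins are made up for illustration. Because cut builds right-closed intervals by default, the lowest edge has to sit strictly below the smallest age.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'freq': [2, 3, 5, 7, 11, 13]},
                  index=[22, 29, 30, 31, 25, 42])

width = 10  # assumed bin width, adjust as needed

# Lowest edge: largest multiple of width strictly below the smallest age,
# so that the smallest age still falls inside the first (lo, lo + width] bin.
lo = (int(df.index.min()) - 1) // width * width

# Highest edge: smallest multiple of width at or above the largest age.
hi = -(-int(df.index.max()) // width) * width

# Edges from lo to hi inclusive, spaced by width.
bins = np.arange(lo, hi + 1, width)

# Bin the index and sum the frequencies per interval.
# observed=False keeps every interval in the result, even an empty one.
result = df.groupby(pd.cut(df.index, bins), observed=False).sum()
print(result)
```

For the sample data this gives the same edges as np.arange(20, 60, 10), so the grouped sums match the output shown above.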
