I want to find the id of the row where each group's maximum and minimum value occurs.

Asked 1 year ago, Updated 1 year ago, 280 views

Execution environment
Windows 10 Python 3.X
pandas

This is a continuation of the linked question below.
Linked question: Pandas cannot retrieve data under certain conditions.
In that question, I used pandas to determine which "classification" each id belongs to, and used groupby to obtain the classification and numerical value for each id.

Question
I was able to get the following dfx from the above question:

dfx (shown in CSV format for readability)

id,numerical,classification
aaa,3141,type2
bbb,5926,type1
ccc,5358,type3
ddd,9793,type1
eee,2384,type3
fff,6264,type2
ggg,3383,type2
hhh,2795,type1
iii,288,type3
jjj,4197,type1
kkk,1693,type3
lll,9937,type2
mmm,5105,type2
nnn,8209,type1

To obtain the maximum and minimum values for each of the three values in the classification column, I ran the following code.

dfx_max_min = dfx.groupby('classification').agg(['max', 'min'])
print(dfx_max_min)

The printed result is shown below.
It contains the maximum and minimum for each classification, computed separately for id and numerical.

                 id      numerical
                max  min       max   min
classification
type1           nnn  bbb      9793  2795
type2           mmm  aaa      9937  3141
type3           kkk  ccc      5358   288
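
The id columns do not line up with the numerical columns because `agg(['max', 'min'])` reduces each column independently, so the id entries are simply the alphabetically largest and smallest ids in each group. A minimal sketch that reproduces this, assuming the data above is loaded with the column names `id`, `numerical`, and `classification` (the `csv_text` and `type1` names are only for illustration):

```python
import io
import pandas as pd

# Sample data from the question, with assumed column names.
csv_text = """id,numerical,classification
aaa,3141,type2
bbb,5926,type1
ccc,5358,type3
ddd,9793,type1
eee,2384,type3
fff,6264,type2
ggg,3383,type2
hhh,2795,type1
iii,288,type3
jjj,4197,type1
kkk,1693,type3
lll,9937,type2
mmm,5105,type2
nnn,8209,type1
"""
dfx = pd.read_csv(io.StringIO(csv_text))

agg = dfx.groupby('classification').agg(['max', 'min'])

# Column-wise max of 'id' within type1: the alphabetically largest id.
id_max_type1 = agg.loc['type1', ('id', 'max')]

# The id of the row that actually holds type1's largest numerical value.
type1 = dfx[dfx['classification'] == 'type1']
id_at_max_type1 = type1.loc[type1['numerical'].idxmax(), 'id']

print(id_max_type1, id_at_max_type1)
```

The two printed ids differ, which is the mismatch this question is about.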

What I actually want is a DataFrame in which the id columns hold the ids of the rows where the maximum and minimum numerical values occur.
For example, the "max" id for type1 should be ddd, because ddd is the id of the row with type1's maximum value, 9793.

Desired output

                 id      numerical
                max  min       max   min
classification
type1           ddd  hhh      9793  2795
type2           lll  aaa      9937  3141
type3           ccc  iii      5358   288

How can I write this kind of reshaping in pandas?

python pandas

2023-01-16 06:19

2 Answers


This is a somewhat brute-force approach.
Personally I find it rather ugly, so I will rewrite it if I come up with something better.

import pandas as pd

def fn(sdf):
    # sdf is the sub-DataFrame for one classification.
    # Pick the first row holding the group's max / min 'numerical' value.
    smax = sdf.loc[sdf['numerical'] == sdf['numerical'].max(), ['id', 'numerical']].iloc[0]
    smin = sdf.loc[sdf['numerical'] == sdf['numerical'].min(), ['id', 'numerical']].iloc[0]
    df = pd.concat([smax, smin])
    df.index = pd.MultiIndex.from_tuples(
        [('max', 'id'), ('max', 'numerical'), ('min', 'id'), ('min', 'numerical')])
    return df

dfx.groupby('classification').apply(fn)
#                 max            min
#                  id numerical   id numerical
# classification
# type1           ddd      9793  hhh      2795
# type2           lll      9937  aaa      3141
# type3           ccc      5358  iii       288
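
The row lookup inside `fn` can also be expressed with `idxmax` / `idxmin`, which return the row label of each group's extreme value. The following is only a sketch of that idea, assuming the column names `id`, `numerical`, and `classification`; the names `csv_text`, `imax`, `imin`, `max_rows`, and `min_rows` are for illustration.

```python
import io
import pandas as pd

# Sample data from the question, with assumed column names.
csv_text = """id,numerical,classification
aaa,3141,type2
bbb,5926,type1
ccc,5358,type3
ddd,9793,type1
eee,2384,type3
fff,6264,type2
ggg,3383,type2
hhh,2795,type1
iii,288,type3
jjj,4197,type1
kkk,1693,type3
lll,9937,type2
mmm,5105,type2
nnn,8209,type1
"""
dfx = pd.read_csv(io.StringIO(csv_text))

grouped = dfx.groupby('classification')['numerical']
imax = grouped.idxmax()  # row label of the max per classification
imin = grouped.idxmin()  # row label of the min per classification

# Pull the matching rows and re-label them by classification.
max_rows = dfx.loc[imax, ['id', 'numerical']].set_axis(imax.index)
min_rows = dfx.loc[imin, ['id', 'numerical']].set_axis(imin.index)

# Side by side, with 'max' / 'min' as the outer column level.
result = pd.concat({'max': max_rows, 'min': min_rows}, axis=1)
print(result)
```

Because each column is kept separate here, `numerical` keeps its integer dtype instead of becoming object.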


2023-01-16 15:16

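import pandas as pd

# Target column labels: ('max', 'id'), ('max', 'numerical'), ('min', 'id'), ('min', 'numerical')
midx = pd.MultiIndex.from_product([['max', 'min'], dfx.columns[:2]])
# idxmax/idxmin give the row labels of the extremes; look those rows up and flatten them
dfx_max_min = dfx.groupby('classification')['numerical'].agg(['idxmax', 'idxmin'])\
                 .apply(lambda x: dfx.loc[x, dfx.columns[:2]].stack().set_axis(midx), axis=1)\
                 .swaplevel(axis=1).sort_index(axis=1, level=0)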

print(dfx_max_min)

#                  id      numerical
#                 max  min       max   min
# classification
# type1           ddd  hhh      9793  2795
# type2           lll  aaa      9937  3141
# type3           ccc  iii      5358   288
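
Another possible route, offered only as a sketch and not taken from either answer: sort by `numerical` first, so that the first row of each group is its minimum and the last row is its maximum, then aggregate with `first` / `last` and rename the labels. The column names `id`, `numerical`, and `classification` and the name `csv_text` are assumptions for illustration.

```python
import io
import pandas as pd

# Sample data from the question, with assumed column names.
csv_text = """id,numerical,classification
aaa,3141,type2
bbb,5926,type1
ccc,5358,type3
ddd,9793,type1
eee,2384,type3
fff,6264,type2
ggg,3383,type2
hhh,2795,type1
iii,288,type3
jjj,4197,type1
kkk,1693,type3
lll,9937,type2
mmm,5105,type2
nnn,8209,type1
"""
dfx = pd.read_csv(io.StringIO(csv_text))

# After sorting ascending, 'last' is the max row and 'first' is the min row
# of each group, so id and numerical come from the same row.
result = (
    dfx.sort_values('numerical')
       .groupby('classification')[['id', 'numerical']]
       .agg(['last', 'first'])
       .rename(columns={'last': 'max', 'first': 'min'}, level=1)
)
print(result)
```

This yields the id/numerical by max/min column layout directly, without `apply`.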


2023-01-16 16:02


