I want to compare two data frames and return a column of corresponding values.

Asked 2 years ago, Updated 2 years ago, 37 views

Hello. I'm analyzing the customer data of the department store. I want to calculate the maximum purchase amount for each customer and then extract the maximum purchase amount, but I'm not sure how to code it. In other words, I'm curious about the item that the customer bought with the maximum purchase price. I would like to return gds_grp_mcla_nm for each cust_id corresponding to the maximum purchase EX of feature by comparing the original data with the feature data. I tried but it was not successful, so I would appreciate it if you could help me with

python pandas

2022-09-20 19:51

1 Answers

I would do this.

import pandas as pd
from io import StringIO

s = """cust_id,name,amount
11,aaa,33
11,bbb,233
11,ccc,21
22,aaa,33
22,bbb,1
22,ccc,312"""

df = pd.read_csv(StringIO(s))
print("-" * 10)
print(df.to_markdown())

df["cust_max_amount"] = df.groupby("cust_id")["amount"].transform("max")
print("-" * 10)
print(df.to_markdown())
# ----------
# # |    |   cust_id | name   |   amount |   cust_max_amount |
# |---:|----------:|:-------|---------:|------------------:|
# # |  0 |        11 | aaa    |       33 |               233 |
# # |  1 |        11 | bbb    |      233 |               233 |
# # |  2 |        11 | ccc    |       21 |               233 |
# # |  3 |        22 | aaa    |       33 |               312 |
# # |  4 |        22 | bbb    |        1 |               312 |
# # |  5 |        22 | ccc    |      312 |               312 |

df_flt = df[df["amount"] == df["cust_max_amount"]]
print("-" * 10)
print(df_flt.to_markdown())
# ----------
# # |    |   cust_id | name   |   amount |   cust_max_amount |
# |---:|----------:|:-------|---------:|------------------:|
# # |  1 |        11 | bbb    |      233 |               233 |
# # |  5 |        22 | ccc    |      312 |               312 |


2022-09-20 19:51

If you have any answers or tips


© 2024 OneMinuteCode. All rights reserved.