I want to match DataFrame column names against row data in Python and aggregate the result

Asked 2 years ago, Updated 2 years ago, 391 views

I am asking because there is one part of my data preprocessing that I cannot solve.

I have two tables. The first is TV program viewing data keyed by user ID (0: did not watch, 1: watched).
The second is TV commercial (CM) airing data keyed by the ID of the TV programs above; the product names and company names are dummies.

What I want to do is count how many times users have seen commercials for each product.
I picture the output as something like "product A: N times, product B: M times",
but I do not know how to get there, so I would appreciate any guidance.

Sample TV program viewing data
https://drive.google.com/file/d/1ySTpfMASAqerC-vWrMX5oAmdT3DxBHno/view?usp=sharing
Sample TV commercial airing data
https://drive.google.com/file/d/1qxGqaUQrtrMpTkMptb6slyq22fNyfvRO/view?usp=sharing

Current code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

tv = pd.read_csv('sample_tv.csv')
tv

tvcm=pd.read_csv('sample_tvcm.csv', encoding='cp932')
tvcm

Results of execution

Thank you. I ran the code you gave me:

df = (tvcm.groupby('item_name')
          .agg({
            'title_code_variable': lambda x: tv[x].sum()
          })
          .applymap(lambda x: x.sum())
          .reset_index()
          .rename(columns={
            'item_name': 'ProductName',
            'title_code_variable': 'View Count'
          }))

Running it raises:
ValueError: Must produce aggregated value

Is there a problem on my side?

python pandas

2022-09-30 22:00

1 Answer

Group by product name and take the sum.

import pandas as pd
import numpy as np

tv = pd.read_csv('sample_tv.csv')
tvcm=pd.read_csv('sample_tvcm.csv', encoding='cp932')

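# For each product, select the tv columns named by its program codes
# and sum the 0/1 viewing flags.
df = (tvcm.groupby('item_name')
          .agg({
            'title_code_variable': lambda x: tv[x].sum()
          })
          .applymap(lambda x: x.sum())
          .reset_index()
          .rename(columns={
            'item_name': 'ProductName',
            'title_code_variable': 'View Count'
          }))

# to_markdown needs the optional tabulate package.
print(df.to_markdown(index=False))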
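
A possible reason for the `ValueError: Must produce aggregated value` reported in the question is that `tv[x].sum()` returns a Series (one total per program column) rather than a single value, which some pandas versions appear to reject inside `agg`. The following is a minimal sketch of one way around that, not a verified fix for the real files: it first computes a viewer total per program, then maps those totals onto the commercial rows and sums them per product. The small in-memory tables, the `user_id` column, and the program codes `P001` to `P003` are made-up stand-ins for `sample_tv.csv` and `sample_tvcm.csv`; only `item_name` and `title_code_variable` are taken from the code above.

```python
import pandas as pd

# Hypothetical stand-in for sample_tv.csv: one row per user, one column per
# program code, 1 = watched, 0 = did not watch. Column names are assumptions.
tv = pd.DataFrame({
    'user_id': [1, 2, 3],
    'P001': [1, 0, 1],
    'P002': [0, 0, 1],
    'P003': [1, 1, 1],
})

# Hypothetical stand-in for sample_tvcm.csv: one row per commercial airing.
tvcm = pd.DataFrame({
    'title_code_variable': ['P001', 'P002', 'P003', 'P001'],
    'item_name': ['Product A', 'Product A', 'Product B', 'Product B'],
})

# Total viewers per program: sum each program column over all users.
views_per_program = tv.drop(columns='user_id').sum()

# Look up the viewer total for each airing, then sum per product.
df = (tvcm.assign(view_count=tvcm['title_code_variable'].map(views_per_program))
          .groupby('item_name', as_index=False)['view_count']
          .sum())

print(df)
```

With these stand-in tables, Product A comes out as 3 views and Product B as 5. Each airing row is counted separately, so a product advertised twice in the same program has that program's viewers counted twice.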


2022-09-30 22:00


© 2024 OneMinuteCode. All rights reserved.