I have a question about a data preprocessing step that I cannot solve.
I have two tables. The first is TV program viewing data keyed by user ID (0: did not watch, 1: watched).
The second is TV commercial (CM) placement data keyed by the ID of the TV programs above; the product names and company names are dummies.
What I want to do is count how many times users saw the commercials for each product.
I picture output such as "product A: N times, product B: M times", but
I don't know how to get there, so I would appreciate any advice.
Sample TV program viewing data
https://drive.google.com/file/d/1ySTpfMASAqerC-vWrMX5oAmdT3DxBHno/view?usp=sharing
Sample TV commercial placement data
https://drive.google.com/file/d/1qxGqaUQrtrMpTkMptb6slyq22fNyfvRO/view?usp=sharing
Code currently running
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
tv = pd.read_csv('sample_tv.csv')
tv
tvcm=pd.read_csv('sample_tvcm.csv', encoding='cp932')
tvcm
Results of execution
Thank you. I ran the code you gave me.
df = (tvcm.groupby('item_name')
      .agg({
          'title_code_variable': lambda x: tv[x].sum()
      })
      .applymap(lambda x: x.sum())
      .reset_index()
      .rename(columns={
          'item_name': 'ProductName',
          'title_code_variable': 'View Count'
      }))
ValueError: Must produce aggregated value
Is something wrong on my side?
python pandas
Group by product name and take the sum.
import pandas as pd
import numpy as np
tv = pd.read_csv('sample_tv.csv')
tvcm=pd.read_csv('sample_tvcm.csv', encoding='cp932')
df = (tvcm.groupby('item_name')
      .agg({
          'title_code_variable': lambda x: tv[x].sum()
      })
      .applymap(lambda x: x.sum())
      .reset_index()
      .rename(columns={
          'item_name': 'ProductName',
          'title_code_variable': 'View Count'
      }))
print(df.to_markdown(index=False))
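If the code above raises `ValueError: Must produce aggregated value`, a likely cause is that the lambda returns a Series (one sum per program column) instead of a single value per group, which some pandas versions reject. One possible workaround, shown here only as a sketch, is to total the views per program first and then sum those totals per product. The data below is made up for illustration, and the `user_id` column name is hypothetical. The sketch also assumes that the values in `title_code_variable` match column names in `tv`.

```python
import pandas as pd

# Toy stand-ins for sample_tv.csv and sample_tvcm.csv (values are made up).
# Assumption: every column of `tv` except the hypothetical `user_id` column is
# named after a program's title code and holds 0 (not watched) or 1 (watched).
tv = pd.DataFrame({
    "user_id": [1, 2, 3],
    "t001": [1, 0, 1],
    "t002": [0, 0, 1],
    "t003": [1, 1, 1],
})

# Assumption: each row of `tvcm` is one commercial airing, and
# `title_code_variable` holds the name of the matching column in `tv`.
tvcm = pd.DataFrame({
    "item_name": ["Product A", "Product A", "Product B", "Product B"],
    "title_code_variable": ["t001", "t003", "t002", "t002"],
})

# Number of users who watched each program: one scalar per title code.
views_per_program = tv.drop(columns="user_id").sum()

# Attach that count to every airing, then add the counts up per product.
df = (
    tvcm.assign(view_count=tvcm["title_code_variable"].map(views_per_program))
        .groupby("item_name", as_index=False)["view_count"].sum()
        .rename(columns={"item_name": "ProductName", "view_count": "View Count"})
)
print(df)
```

With this toy data, Product A totals 5 views and Product B totals 2. A title code that has no matching column in `tv` maps to a missing value and is skipped by the sum, so it may be worth checking for such codes before relying on the totals.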