In Python, how can I count the number of unique words per group when the data are organized as in the table below?

category    content
A           apple, orange
A           banana, apple, melon
B           peach, apple, orange

After combining the contents within each category and removing duplicates, I'd like to print results like this:

A: apple, orange, banana, melon (4 pieces)
B: peach, apple, orange (3 pieces)
Is there a way to write this with pandas groupby and value_counts in a single step, or is there another way to do it?
>>> import pandas as pd
>>> df = pd.DataFrame({"category": ["A", "A", "B"], "content": ["apple, orange", "banana, apple, melon", "peach, apple, orange"]})
>>> df
  category               content
0        A         apple, orange
1        A  banana, apple, melon
2        B  peach, apple, orange
>>> df["content"] = df["content"].str.split(", ")
>>> df
  category                 content
0        A         [apple, orange]
1        A  [banana, apple, melon]
2        B  [peach, apple, orange]
>>> df["content"] = df["content"].apply(set)
>>> df
  category                 content
0        A         {orange, apple}
1        A  {melon, banana, apple}
2        B  {peach, orange, apple}
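As an aside (not part of the original answer), the split and the set conversion above could also be collapsed into a single step on the raw comma-separated strings, roughly like this:

>>> df["content"] = df["content"].str.split(", ").map(set)

Series.map applies set to each list element-wise, so the result is the same column of sets.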
>>> df.groupby("category").apply(lambda x: set.union(*x.content))
category
A    {melon, banana, orange, apple}
B             {peach, orange, apple}
dtype: object
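For reference, set.union(*x.content) unpacks every set in a group into one call and returns their union. A minimal standalone illustration with group A's two sets (element order in the printed set may vary):

>>> set.union({"orange", "apple"}, {"melon", "banana", "apple"})
{'melon', 'banana', 'orange', 'apple'}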
>>> group_df = df.groupby("category").apply(lambda x: set.union(*x.content))
>>> for tup in group_df.to_frame().itertuples():
...     tup1_str = ", ".join(sorted(tup[1]))
...     n = len(tup[1])
...     print(f"{tup[0]}: {tup1_str} ({n} pieces)")
...
A: apple, banana, melon, orange (4 pieces)
B: apple, orange, peach (3 pieces)
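If you would prefer something closer to a single groupby chain, one possible alternative (a sketch, not part of the original answer; it starts again from the raw comma-separated strings and assumes pandas 0.25+ for DataFrame.explode) is to give every word its own row and let groupby collect the unique values per category:

>>> import pandas as pd
>>> df = pd.DataFrame({"category": ["A", "A", "B"], "content": ["apple, orange", "banana, apple, melon", "peach, apple, orange"]})
>>> exploded = df.assign(content=df["content"].str.split(", ")).explode("content")
>>> unique_words = exploded.groupby("category")["content"].unique()
>>> for category, words in unique_words.items():
...     words = sorted(words)
...     print(f"{category}: {', '.join(words)} ({len(words)} pieces)")
...
A: apple, banana, melon, orange (4 pieces)
B: apple, orange, peach (3 pieces)

If you only need the counts, exploded.groupby("category")["content"].nunique() returns the number of unique words per category directly.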