I want to add new columns in pandas using data from other rows.

Asked 2 years ago, Updated 2 years ago, 36 views

I have an existing DataFrame and want to add new columns whose values are filled in according to certain conditions.
The data comes from a game tournament, and each match is played as follows:
1st set: 2 vs 2 (first to win 2 games)
2nd set: 1 vs 1 (first to win 2 games)
3rd set: 1 vs 1 (you win if you destroy the opponent with 3 people remaining)

The first side to win 2 sets wins the match.

The DataFrame df already has the columns match, set, game, gamewinner, team1, and team2.

↓Current data frame

print(df) 

#     match  set  game  gamewinner  team1  team2
#  1      1    1     1           1      A      B
#  2      1    1     2           1      A      B
#  3      1    2     1           2      A      B
#  4      1    2     2           1      A      B
#  5      1    2     3           2      A      B
#  6      1    3     1           1      A      B
#  7      1    3     2           2      A      B
#  8      1    3     3           1      A      B
#  9      1    3     4           1      A      B
# 10      2    1     1           2      C      D

  ...continued 

I would like to add two new columns, setwinner and matchwinner, to this DataFrame so that every row shows who won the set and the match it belongs to.

↓What I want to achieve

print(df) 

#     match  set  game  gamewinner  team1  team2  setwinner  matchwinner
#  1      1    1     1           1      A      B          1            1
#  2      1    1     2           1      A      B          1            1
#  3      1    2     1           2      A      B          2            1
#  4      1    2     2           1      A      B          2            1
#  5      1    2     3           2      A      B          2            1
#  6      1    3     1           1      A      B          1            1
#  7      1    3     2           2      A      B          1            1
#  8      1    3     3           1      A      B          1            1
#  9      1    3     4           1      A      B          1            1
# 10      2    1     1           2      C      D          2            1

To do this I need to use information from the preceding and following rows, and I don't know how to implement that.

I managed to implement the cases that can be decided without looking at other rows, but I could not work out the rest.

# If the 1st or 2nd set goes to a 3rd game, the winner of that game wins the set.
df.loc[(df["game"] == 3) & (df["set"] != 3), "setwinner"] = df["gamewinner"]

I look forward to hearing from you.

I asked the same question at https://teratail.com/questions/192733 but got no answer, so I am posting here in the hope that someone on this board knows how to do it.

python pandas

2022-09-29 21:58

1 Answer

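After adding a column that holds the winning team's name for each game, you can use groupby and count the wins to find the team that won the most games in each set, and then the team that won the most sets in each match.

def getTeam(row):
    # gamewinner is 1 or 2; return the matching team name
    if row["gamewinner"] == 1:
        return row["team1"]
    return row["team2"]

def most_wins(names):
    # The name that occurs most often in the group
    return names.value_counts().idxmax()

df["gamewinnerteam"] = df.apply(getTeam, axis=1)
setwinner = df.groupby(["match", "set"])["gamewinnerteam"].agg(most_wins)
matchwinner = setwinner.groupby(level="match").agg(most_wins)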
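
To check the idea end to end, here is a minimal self-contained sketch on the rows of match 1 from the question. The sample values are reconstructed from the question's tables, and `most_wins` is a hypothetical helper name. Calling `max()` on a column of names would return the alphabetically last name rather than the most frequent one, so the sketch counts occurrences with `value_counts()` instead.

```python
import pandas as pd

# Rows of match 1, reconstructed from the question (illustrative sample only)
df = pd.DataFrame({
    "match":      [1, 1, 1, 1, 1, 1, 1, 1, 1],
    "set":        [1, 1, 2, 2, 2, 3, 3, 3, 3],
    "game":       [1, 2, 1, 2, 3, 1, 2, 3, 4],
    "gamewinner": [1, 1, 2, 1, 2, 1, 2, 1, 1],
    "team1":      ["A"] * 9,
    "team2":      ["B"] * 9,
})

def most_wins(names):
    # Hypothetical helper: the name that occurs most often in the group
    return names.value_counts().idxmax()

# Take team1 where gamewinner is 1, otherwise team2 (no row-wise apply needed)
df["gamewinnerteam"] = df["team1"].where(df["gamewinner"] == 1, df["team2"])

# One winner per (match, set), then one winner per match
setwinner = df.groupby(["match", "set"])["gamewinnerteam"].agg(most_wins)
matchwinner = setwinner.groupby(level="match").agg(most_wins)

print(setwinner)
print(matchwinner)
```

Because every set and match is played on a first-to-N basis, the winner always has strictly more wins than the loser, so ties should not occur on complete data.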

After that, all that remains is to assign the setwinner and matchwinner values to each row.

# Example
df["setwinnerteam"] = df.apply(lambda row: setwinner[row["match"], row["set"]], axis=1)
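
A row-wise apply can be slow on large frames. As an alternative sketch, per-set and per-match results can be attached to the rows with `join` and `map`. The small `setwinner` and `matchwinner` Series below are hand-built stand-ins with made-up values, indexed the same way a groupby result would be; the column names `setwinnerteam` and `matchwinnerteam` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "match": [1, 1, 1, 2],
    "set":   [1, 1, 2, 1],
    "game":  [1, 2, 1, 1],
})

# Stand-in results (made-up values), indexed like the groupby output
setwinner = pd.Series(
    ["A", "B", "D"],
    index=pd.MultiIndex.from_tuples(
        [(1, 1), (1, 2), (2, 1)], names=["match", "set"]
    ),
    name="setwinnerteam",
)
matchwinner = pd.Series(["A", "D"], index=pd.Index([1, 2], name="match"))

# join matches the (match, set) columns against the MultiIndex of setwinner
df = df.join(setwinner, on=["match", "set"])

# map looks up each row's match number in the index of matchwinner
df["matchwinnerteam"] = df["match"].map(matchwinner)

print(df)
```

Both calls keep the original row order and index of df, so the new columns line up with the existing ones.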

However, storing these values on every row is redundant, so depending on your purpose the setwinner and matchwinner Series on their own may be enough.
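
If the numeric 1/2 columns shown in the question are wanted rather than team names, one possible sketch uses groupby with `transform`, which returns a result aligned to the original rows. It rests on two assumptions that the question does not state outright: rows are in play order, and every set in the data was played to completion, so the winner of the last game in a set is the winner of that set.

```python
import pandas as pd

# Rows of match 1, reconstructed from the question (illustrative sample only)
df = pd.DataFrame({
    "match":      [1, 1, 1, 1, 1, 1, 1, 1, 1],
    "set":        [1, 1, 2, 2, 2, 3, 3, 3, 3],
    "game":       [1, 2, 1, 2, 3, 1, 2, 3, 4],
    "gamewinner": [1, 1, 2, 1, 2, 1, 2, 1, 1],
})

# Assumption: a set ends as soon as it is decided, so its last game
# was won by the set winner. transform broadcasts that value to every row.
df["setwinner"] = df.groupby(["match", "set"])["gamewinner"].transform("last")

# Reduce to one row per set, then pick the side that won the most sets
sets = df.drop_duplicates(["match", "set"])
match_result = sets.groupby("match")["setwinner"].agg(
    lambda s: s.value_counts().idxmax()
)
df["matchwinner"] = df["match"].map(match_result)

print(df)
```

On this sample the result agrees with the setwinner and matchwinner values the question asks for in match 1. A match whose sets are incomplete in the data would need separate handling.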


2022-09-29 21:58



© 2024 OneMinuteCode. All rights reserved.