I would like to extract rows from a DataFrame.
A1 B1 C1
A1 B1 C1
A2 B2 C2
A2 B2 C3
A3 B3 C4
A3 B3 C4
A3 B3 C5
I would like to keep all three columns of the data above and remove the duplicates.
(I want to drop the rows whose column 0 is A1, but keep the rows whose column 0 is A3.)
A2 B2 C2
A2 B2 C3
A3 B3 C4
A3 B3 C4
A3 B3 C5
However, this is what I get with df.duplicated():
A2 B2 C2
A2 B2 C3
A3 B3 C5
With df.duplicated(keep='last'), this is what I get:
A1 B1 C1
A2 B2 C2
A2 B2 C3
A3 B3 C4
A3 B3 C5
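For reference, here is a small sketch of how the keep argument changes the mask that duplicated() returns. It assumes the rows were selected with the inverted mask (df[~mask]), which the question does not state. Under that assumption, the three-row result shown for the first attempt matches keep=False rather than the default keep='first'.

```python
import pandas as pd

df = pd.DataFrame([
    ['A1', 'B1', 'C1'],
    ['A1', 'B1', 'C1'],
    ['A2', 'B2', 'C2'],
    ['A2', 'B2', 'C3'],
    ['A3', 'B3', 'C4'],
    ['A3', 'B3', 'C4'],
    ['A3', 'B3', 'C5'],
])

# duplicated() returns a boolean mask; True marks rows treated as duplicates.
mask_first = df.duplicated()             # keep='first' is the default
mask_last = df.duplicated(keep='last')   # the last occurrence is not marked
mask_all = df.duplicated(keep=False)     # every member of a duplicate set is marked

# Selecting rows where the mask is False drops the marked rows.
kept_last = df[~mask_last]
kept_none = df[~mask_all]
print(kept_last)
print(kept_none)
```

None of the three masks keeps both A3 B3 C4 rows while dropping both A1 rows, which is why a per-group decision is needed.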
How should I write it?
python
First, I decided to use groupby() so that each Ax group is judged separately.
I did not consider speed or elegance.
import pandas as pd

data = [
    ['A1', 'B1', 'C1'],
    ['A1', 'B1', 'C1'],
    ['A2', 'B2', 'C2'],
    ['A2', 'B2', 'C3'],
    ['A3', 'B3', 'C4'],
    ['A3', 'B3', 'C4'],
    ['A3', 'B3', 'C5'],
]
df = pd.DataFrame(data)

df2 = pd.DataFrame()
# Look at each group of rows that share the same value in column 0.
for i, g in df.groupby(0):
    if not g.duplicated().any():  # No duplication
        df2 = pd.concat([df2, g], ignore_index=True)
    # There are duplicates, but there are other values.
    # (~g.duplicated()).sum() counts the distinct rows in the group.
    elif (~g.duplicated()).sum() > 1:
        df2 = pd.concat([df2, g], ignore_index=True)
print(df2)
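The same per-group decision can probably be written without the explicit loop by using groupby().filter(). This is only a sketch of an alternative, not the approach used above. The helper name has_other_values is hypothetical, and the rule for single-row groups (keep them) is an assumption chosen to mirror the loop.

```python
import pandas as pd

df = pd.DataFrame([
    ['A1', 'B1', 'C1'],
    ['A1', 'B1', 'C1'],
    ['A2', 'B2', 'C2'],
    ['A2', 'B2', 'C3'],
    ['A3', 'B3', 'C4'],
    ['A3', 'B3', 'C4'],
    ['A3', 'B3', 'C5'],
])

def has_other_values(g):
    # Hypothetical helper: keep a group if it is a single row
    # or if it contains more than one distinct row.
    return len(g) == 1 or len(g.drop_duplicates()) > 1

# filter() keeps every row of each group for which the helper returns True.
result = df.groupby(0).filter(has_other_values).reset_index(drop=True)
print(result)
```

Because filter() returns the surviving groups whole, both A3 B3 C4 rows stay in the result while the A1 group is dropped entirely.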