Fill a specific column of a pandas DataFrame using two given Series

Asked 2 years ago, Updated 2 years ago, 59 views

There are two data frames and two series. I'd like to fill the "Amount" column in df2 with the values I want. The amount looked up for the fruit takes first priority, and if no amount is found for that fruit, the average amount for its country of origin should be used instead. It's easier said than done.

python machine-learning pandas

2022-09-20 13:14

2 Answers

I thought it would be fun, so I did it.

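Pandas has Series, DataFrames, and indexes, and it helps to think of an index as playing the same role as the keys of a dictionary. This question is a good example. If you simply write df[column] = series, the values are matched by index label rather than by position, a little like dict.update, except that rows whose label is missing from the series are set to NaN.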
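
As a minimal sketch of that label-based matching (the frame `df`, the series `s`, and the column names here are made up purely for illustration and are not part of the question's data):

```python
import pandas as pd

# Hypothetical toy data: the Series is in a different order and has no "c".
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=["a", "b", "c"])
s = pd.Series([20.0, 10.0], index=["b", "a"])

# Assignment aligns on the index labels, not on position.
df["from_series"] = s

print(df)
```

Row "a" receives 10.0 and row "b" receives 20.0 regardless of the order in `s`, while row "c", which has no matching label, ends up as NaN.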

Please read through the example code carefully to see how it works.

import numpy as np
import pandas as pd

# Reference data: one row per known fruit.
df1 = pd.DataFrame(
    {
        "Fruit name": ["Apple", "Tomato", "Banana", "Grape"],
        "Fruitid": [0, 1, 2, 3],
        "Amount": [600, 700, 300, 400],
        "Country of origin": ["Seoul", "Seoul", "Jeju", "Daejeon"],
        "Average amount of origin": [800, 800, 200, 500],
    }
)
# Target data: "Amount" is empty and has to be filled in.
df2 = pd.DataFrame(
    {
        "Fruit name": ["Apple", "Banana", "Geobong"],
        "Fruitid": [0, 2, 4],
        "Country of origin": ["Seoul", "Jeju", "Daejeon"],
        "Amount": [np.nan, np.nan, np.nan],
        "Required Value": [600, 300, 500],
    }
)
# s1: amount per fruit name, s2: average amount per country of origin.
s1 = pd.Series([600, 700, 300, 400], index=["Apple", "Tomato", "Banana", "Grape"])
s2 = pd.Series([800, 200, 500], index=["Seoul", "Jeju", "Daejeon"])

# Remember the original column order so it can be restored at the end.
columns_org = df2.columns

# Step 1: index by fruit name so that assigning s1 aligns on the fruit name.
df2 = df2.set_index("Fruit name")
df2["Amount"] = s1
df2 = df2.reset_index()

# Note: to_markdown() needs the optional tabulate package.
print(df2.to_markdown())
"""
|    | Fruit name | Fruitid | Country of origin | Amount | Required Value |
|---:|:---------|---------:|:---------|-------:|-----------:|
|  0 | Apple | 0 | Seoul | 600 | 600 |
|  1 | Banana | 2 | Jeju | 300 | 300 |
|  2 | Geobong | 4 | Daejeon | nan | 500 |
"""

# Step 2: index by country of origin and fill only the rows that are still NaN.
df2 = df2.set_index("Country of origin")
df2.loc[df2["Amount"].isna(), "Amount"] = s2
df2 = df2.reset_index()

print(df2.to_markdown())
"""
|    | Country of origin | Fruit name | Fruitid | Amount | Required Value |
|---:|:---------|:---------|---------:|-------:|-----------:|
|  0 | Seoul | Apple | 0 | 600 | 600 |
|  1 | Jeju | Banana | 2 | 300 | 300 |
|  2 | Daejeon | Geobong | 4 | 500 | 500 |
"""

# Step 3: restore the original column order.
df2 = df2[columns_org]
print(df2.to_markdown())
"""
|    | Fruit name | Fruitid | Country of origin | Amount | Required Value |
|---:|:---------|---------:|:---------|-------:|-----------:|
|  0 | Apple | 0 | Seoul | 600 | 600 |
|  1 | Banana | 2 | Jeju | 300 | 300 |
|  2 | Geobong | 4 | Daejeon | 500 | 500 |
"""

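For comparison, here is a sketch of another way to get the same result without changing the index, using `Series.map` for each lookup and `fillna` for the fallback. This is a swapped-in technique rather than the approach in the answer above, and the names `by_fruit`, `by_origin`, and `filled` are made up for illustration.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(
    {
        "Fruit name": ["Apple", "Banana", "Geobong"],
        "Fruitid": [0, 2, 4],
        "Country of origin": ["Seoul", "Jeju", "Daejeon"],
        "Amount": [np.nan, np.nan, np.nan],
        "Required Value": [600, 300, 500],
    }
)
s1 = pd.Series([600, 700, 300, 400], index=["Apple", "Tomato", "Banana", "Grape"])
s2 = pd.Series([800, 200, 500], index=["Seoul", "Jeju", "Daejeon"])

# First priority: look up each fruit name in s1 (NaN when the fruit is unknown).
by_fruit = df2["Fruit name"].map(s1)
# Fallback: look up each country of origin in s2.
by_origin = df2["Country of origin"].map(s2)

# Keep the fruit amount where it exists, otherwise use the origin average.
filled = df2.assign(Amount=by_fruit.fillna(by_origin))

print(filled)
```

Because `map` performs the lookup row by row, the column order of df2 is left untouched, so there is nothing to restore afterwards.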

2022-09-20 13:14


