I have two DataFrames and two Series, and I'd like to fill the Amount column in df2 with the values I want. The amount matched on the fruit id takes first priority; if the fruit id has no match, the average amount for the fruit's origin should be filled in instead. It's easier said than done.
python machine-learning pandas
I thought it would be fun, so I gave it a try.
pandas has Series, DataFrame, and Index objects, and it helps to think of them as working on the same idea as a dictionary. This question is a good example. If you simply set df[column] = series, it works roughly like dict.update: values are matched on index labels rather than on position. Unlike dict.update, though, any label missing from the series ends up as NaN.
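To see label alignment in isolation first, here is a minimal sketch. The frame, the series, and their labels are made-up toy data, not part of the question.

```python
import pandas as pd

# Hypothetical toy data, only to show label alignment.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])
s = pd.Series([10, 30], index=["c", "a"])  # different order, "b" is missing

# Assignment matches on index labels, not on position.
df["y"] = s

# The label absent from the series ("b") becomes NaN,
# which is where the dict.update analogy stops.
print(df)
```

This is why the example sets the index of df2 to the lookup column before each assignment.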
Take a close look at the example code to see how this works.
import numpy as np
import pandas as pd
df1 = pd.DataFrame(
    {
        "Fruit name": ["Apple", "Tomato", "Banana", "Grape"],
        "Fruitid": [0, 1, 2, 3],
        "Amount": [600, 700, 300, 400],
        "Country of origin": ["Seoul", "Seoul", "Jeju", "Daejeon"],
        "Average amount of origin": [800, 800, 200, 500],
    }
)
df2 = pd.DataFrame(
    {
        "Fruit name": ["Apple", "Banana", "Geobong"],
        "Fruitid": [0, 2, 4],
        "Country of origin": ["Seoul", "Jeju", "Daejeon"],
        "Amount": [np.nan, np.nan, np.nan],
        "Required Value": [600, 300, 500],
    }
)
# s1: amount per fruit name; s2: average amount per country of origin
s1 = pd.Series([600, 700, 300, 400], index=["Apple", "Tomato", "Banana", "Grape"])
s2 = pd.Series([800, 200, 500], index=["Seoul", "Jeju", "Daejeon"])
colums_org = df2.columns
# Step 1: index df2 by fruit name so the assignment aligns with s1's labels
df2 = df2.set_index("Fruit name")
df2["Amount"] = s1
df2 = df2.reset_index()
print(df2.to_markdown())
"""
| | Fruit name | Fruitid | Country of origin | Amount | Required Value |
|---:|:---------|---------:|:---------|-------:|-----------:|
| 0 | Apple | 0 | Seoul | 600 | 600 |
| 1 | Banana | 2 | Jeju | 300 | 300 |
| 2 | Geobong | 4 | Daejeon | nan | 500 |
"""
# Step 2: index by origin and fill only the rows that are still NaN from s2
df2 = df2.set_index("Country of origin")
df2.loc[df2["Amount"].isna(), "Amount"] = s2
df2 = df2.reset_index()
print(df2.to_markdown())
"""
| | Country of origin | Fruit name | Fruitid | Amount | Required Value |
|---:|:---------|:---------|---------:|-------:|-----------:|
| 0 | Seoul | Apple | 0 | 600 | 600 |
| 1 | Jeju | Banana | 2 | 300 | 300 |
| 2 | Daejeon | Geobong | 4 | 500 | 500 |
"""
df2 = df2[colums_org]
print(df2.to_markdown())
"""
| | Fruit name | Fruitid | Country of origin | Amount | Required Value |
|---:|:---------|---------:|:---------|-------:|-----------:|
| 0 | Apple | 0 | Seoul | 600 | 600 |
| 1 | Banana | 2 | Jeju | 300 | 300 |
| 2 | Geobong | 4 | Daejeon | 500 | 500 |
"""
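The same two-step fill can probably be written more compactly with Series.map and fillna, without changing the index at all. This is only a sketch of an alternative, not the method used above; it reuses s1 and s2 from the example and a trimmed-down df2 with the same column names.

```python
import numpy as np
import pandas as pd

# Trimmed-down df2 with the same column names as the example above.
df2 = pd.DataFrame(
    {
        "Fruit name": ["Apple", "Banana", "Geobong"],
        "Country of origin": ["Seoul", "Jeju", "Daejeon"],
        "Amount": [np.nan, np.nan, np.nan],
    }
)
s1 = pd.Series([600, 700, 300, 400], index=["Apple", "Tomato", "Banana", "Grape"])
s2 = pd.Series([800, 200, 500], index=["Seoul", "Jeju", "Daejeon"])

# First priority: look the amount up by fruit name (NaN when unknown).
by_fruit = df2["Fruit name"].map(s1)
# Fallback: look the average amount up by origin.
by_origin = df2["Country of origin"].map(s2)

# Keep the per-fruit amount where present, otherwise use the fallback.
df2["Amount"] = by_fruit.fillna(by_origin)
print(df2)
```

Because map looks values up by label, this assumes the index of s1 and of s2 has no duplicate labels.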
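In short, the problem is this: use the known amount for a fruit where there is one, and otherwise fall back to the average amount for its origin.
Is that right?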
If so, I'm not familiar with data frames, so I'll answer in SQL. It's not that it can't be done, but it's actually a bit tricky.
select
    total.item_name,
    total.area_name,
    ifnull(some.price, total.default_price) as possible_price -- the key part: use the price from some, or total's default if there is none
from (
    select
        items.id as item_id,
        items.name as item_name,
        areas.id as area_id,
        areas.name as area_name,
        areas.default_value as default_price
    from items, areas -- cross join: every possible item/area combination
) as total
-- attach the already known fruit + origin information to each combination
left join item_area_pricings some -- this table corresponds to df1
    on some.item_id = total.item_id and some.area_id = total.area_id -- both conditions must be met
order by total.item_id, total.area_id;
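To try the query without setting up a database server, it can be run against an in-memory SQLite database from Python. This is a rough sketch under assumptions: the table schemas are inferred from the column names in the query, the rows and area ids are made up from the question's data, and the alias `some` is renamed to `known` only to stay clear of SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed schemas, inferred from the columns the query uses.
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE areas (id INTEGER PRIMARY KEY, name TEXT, default_value INTEGER);
    CREATE TABLE item_area_pricings (item_id INTEGER, area_id INTEGER, price INTEGER);

    INSERT INTO items VALUES (0, 'Apple'), (2, 'Banana'), (4, 'Geobong');
    INSERT INTO areas VALUES (1, 'Seoul', 800), (2, 'Jeju', 200), (3, 'Daejeon', 500);
    -- Known prices only: Apple in Seoul, Banana in Jeju.
    INSERT INTO item_area_pricings VALUES (0, 1, 600), (2, 2, 300);
    """
)

query = """
SELECT total.item_name, total.area_name,
       ifnull(known.price, total.default_price) AS possible_price
FROM (
    SELECT items.id AS item_id, items.name AS item_name,
           areas.id AS area_id, areas.name AS area_name,
           areas.default_value AS default_price
    FROM items, areas
) AS total
LEFT JOIN item_area_pricings AS known
    ON known.item_id = total.item_id AND known.area_id = total.area_id
ORDER BY total.item_id, total.area_id;
"""

# One row per item/area combination: the known price where it exists,
# otherwise the area's default value.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that the cross join returns every fruit/origin combination, so to reproduce df2 exactly the result would still need filtering down to each fruit's actual origin.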
I hope it's useful.