rekfs_skd_l[5]
STD STA TYP
FLT
171 08:30 11:30 B738
172 12:40 17:40 B738
172 18:20 19:35 B738
211 08:40 10:25 B738
212 11:25 13:20 B738
.. ... ... ...
594 17:55 18:45 B738
595 19:20 20:25 B738
761 22:00 17:05 B772
791 04:20 08:20 J7Q
792 12:40 20:10 J7Q
[118 rows x 3 columns]
rost_skd_l[5]
STD STA TYP
FLT
171 08:00 11:30 B738
172 12:40 17:40 B738
172 18:20 19:35 B738
211 08:40 10:25 B738
212 11:25 13:20 B738
.. ... ... ...
594 17:55 18:45 B738
595 19:20 20:25 B738
761 22:00 17:05 B772
791 04:20 08:20 B772
792 12:40 20:10 B772
[126 rows x 3 columns]
pd.merge(rekfs_skd_l[5], rost_skd_l[5], left_index=True, right_index=True, how='outer')
STD_x STA_x TYP_x STD_y STA_y TYP_y
FLT
171 08:30 11:30 B738 08:00 11:30 B738
172 12:40 17:40 B738 12:40 17:40 B738
*172 12:40 17:40 B738 18:20 19:35 B738
172 18:20 19:35 B738 *12:40 17:40 B738
*172 18:20 19:35 B738 *18:20 19:35 B738
.. ... ... ... ... ... ...
594 17:55 18:45 B738 17:55 18:45 B738
595 19:20 20:25 B738 19:20 20:25 B738
761 22:00 17:05 B772 22:00 17:05 B772
791 04:20 08:20 J7Q 04:20 08:20 B772
792 12:40 20:10 J7Q 12:40 20:10 B772
[128 rows x 6 columns]
I want to compare the two data frames and see how they differ (times, aircraft type, FLT values that appear in only one of the two, and so on). When I merge them to see the differences, I get duplicated rows. How can I merge them without the duplication?
The rows that come out duplicated are marked with * above.
# Duplicated rows
172 12:40 17:40 B738
172 18:20 19:35 B738
In both data frames, the index FLT contains the value 172 twice.
The merge is keyed on the index, so for 172 every matching row on the left is paired with every matching row on the right, which gives 2 x 2 = 4 rows.
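The 2 x 2 effect can be reproduced with a minimal sketch. The frames below contain only the two 172 rows copied from the question, and the names `left` and `right` are placeholders, not the asker's variables.

```python
import pandas as pd

# Only the two FLT 172 rows from the question, in both frames.
left = pd.DataFrame(
    {"STD": ["12:40", "18:20"],
     "STA": ["17:40", "19:35"],
     "TYP": ["B738", "B738"]},
    index=pd.Index([172, 172], name="FLT"),
)
right = left.copy()

# Merging on a non-unique index pairs every left 172 row
# with every right 172 row.
merged = pd.merge(left, right, left_index=True, right_index=True, how="outer")
print(len(merged))  # 4 rows: 2 left rows x 2 right rows
print(merged)
```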
If you only want each 172 row paired once (call the left rows 172a and 172b and the right rows 172c and 172d, and keep just 172a+172c and 172b+172d), then before merging you need to add a second key to each data frame that numbers the repeated 172 rows, and merge on FLT together with that key.
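One way to build such a key is `groupby(...).cumcount()`, which numbers repeated FLT values 0, 1, ... in row order. The following is a sketch under assumptions, not necessarily what the answerer had in mind: the helper `add_occurrence`, the level name `occ`, and the frame names are hypothetical, the sample rows are a small subset of the question's data, and FLT 999 is a made-up row added only to show a flight present in one frame. It also assumes the repeated rows appear in the same order in both frames.

```python
import pandas as pd

rekfs = pd.DataFrame(
    {"STD": ["08:30", "12:40", "18:20", "04:20"],
     "STA": ["11:30", "17:40", "19:35", "08:20"],
     "TYP": ["B738", "B738", "B738", "J7Q"]},
    index=pd.Index([171, 172, 172, 791], name="FLT"),
)
rost = pd.DataFrame(
    {"STD": ["08:00", "12:40", "18:20", "04:20", "10:00"],
     "STA": ["11:30", "17:40", "19:35", "08:20", "11:00"],
     "TYP": ["B738", "B738", "B738", "B772", "B738"]},
    index=pd.Index([171, 172, 172, 791, 999], name="FLT"),  # 999 is made up
)

def add_occurrence(df):
    """Append an 'occ' index level that numbers repeated FLT values."""
    out = df.copy()
    out["occ"] = out.groupby(level="FLT").cumcount().to_numpy()
    return out.set_index("occ", append=True)

# Merge on (FLT, occ): the first 172 pairs with the first 172, and so on.
merged = pd.merge(
    add_occurrence(rekfs), add_occurrence(rost),
    left_index=True, right_index=True, how="outer",
    suffixes=("_rekfs", "_rost"), indicator=True,
)

# Flights present in only one frame.
only_one = merged[merged["_merge"] != "both"]

# Flights present in both frames whose time or type differs.
differs = (
    (merged["STD_rekfs"] != merged["STD_rost"])
    | (merged["STA_rekfs"] != merged["STA_rost"])
    | (merged["TYP_rekfs"] != merged["TYP_rost"])
)
changed = merged[(merged["_merge"] == "both") & differs]

print(merged)
print(only_one)
print(changed)
```

The `indicator=True` argument adds a `_merge` column with the values `left_only`, `right_only`, or `both`, which covers the "FLT in only one of the two data frames" part of the question.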