When creating DataFrames in a for loop in Python, all the separate variables end up with the same result

A question about a Python and pandas program:

The following six variables are Series indexed by a DatetimeIndex, each covering a different period and holding different data.
The numbers 1 to 3 differ in the period covered, and a and b differ in the data.

season_a1
season_b1

season_a2
season_b2

season_a3
season_b3

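from pandas import DataFrame
import numpy as np

# DataFrames that hold the results (index 0-23, one row per hour)
# Note: NumPy has no np.range; np.arange is the function that exists.
result_df1 = DataFrame(index=np.arange(0, 24))
result_df2 = DataFrame(index=np.arange(0, 24))
result_df3 = DataFrame(index=np.arange(0, 24))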


for year in range(2000, 2004):

    if str(year) in season_b1.index:
        a1 = season_a1[str(year)].index.hour.value_counts().sort_index()
        b1 = season_b1[str(year)].index.hour.value_counts().sort_index()
        result_df1['A' + str(year)] = a1
        result_df1['B' + str(year)] = b1
    # 1

    if str(year) in season_b2.index:
        a2 = season_a2[str(year)].index.hour.value_counts().sort_index()
        b2 = season_b2[str(year)].index.hour.value_counts().sort_index()
        result_df2['A' + str(year)] = a2
        result_df2['B' + str(year)] = b2
    # 2

    if str(year) in season_b3.index:
        a3 = season_a3[str(year)].index.hour.value_counts().sort_index()
        b3 = season_b3[str(year)].index.hour.value_counts().sort_index()
        result_df3['A' + str(year)] = a3
        result_df3['B' + str(year)] = b3
    # 3

result_df1.to_csv(path1)
result_df2.to_csv(path2)
result_df3.to_csv(path3)

The code simply takes the hourly counts produced by value_counts() and adds

a1, b1 to result_df1,
a2, b2 to result_df2,
a3, b3 to result_df3

as columns, then writes each DataFrame out as a CSV file.

I expect result_df1 through result_df3 to hold different results,
since each is built from different source data under a different variable name,
but every output ends up identical to result_df3.

To check, I printed the contents of result_df1 through result_df3 with print() at the points marked #1 to #3.
In the first iteration of the loop, at #2, result_df1 is already the same as result_df2.
At #3, result_df1 and result_df2 are the same as result_df3.

I suspect this is an elementary mistake, but I can't work it out.
Thank you in advance for your help.

python python3 pandas numpy

2022-09-30 17:24

1 Answer

As noted in the comments, it is hard to give an accurate answer without a minimal reproducible example. That said, consider this part of the code:

result_df1 = DataFrame(index=np.arange(0, 24))
result_df2 = DataFrame(index=np.arange(0, 24))
result_df3 = DataFrame(index=np.arange(0, 24))

I think the same symptom would appear if the actual code were written like this instead:

result_df1 = DataFrame(index=np.arange(0, 24))
result_df2 = result_df1
result_df3 = result_df1

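If the DataFrames are initialized this way, Python assigns a reference rather than a copy, so result_df1, result_df2, and result_df3 all refer to the same DataFrame object. Every column added through any of the three names lands in that one object, so all three show the values written last, which are those of result_df3.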
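The difference can be checked with a small sketch. It is not the asker's actual code: the names shared1-3 and sep1-3 and the scalar column values are made up for illustration, and it assumes the cause really is the aliasing described above. The `is` operator shows whether two names point at the same object, and `DataFrame.copy()` is one way to get an independent DataFrame.

```python
import numpy as np
import pandas as pd

# Aliased: all three names refer to ONE DataFrame object.
shared1 = pd.DataFrame(index=np.arange(0, 24))
shared2 = shared1
shared3 = shared1

shared1["A2000"] = 1
shared3["A2000"] = 3  # overwrites the column seen through every name

aliased = shared1 is shared2 and shared2 is shared3
shared1_values = shared1["A2000"].unique().tolist()

# Independent: each constructor call (or .copy()) builds a new object.
sep1 = pd.DataFrame(index=np.arange(0, 24))
sep2 = pd.DataFrame(index=np.arange(0, 24))
sep3 = sep1.copy()

sep1["A2000"] = 1
sep2["A2000"] = 2
sep3["A2000"] = 3

independent = sep1 is not sep2 and sep1 is not sep3
print(aliased, shared1_values, independent)
```

If `result_df1 is result_df2` evaluates to True in the real program, the three result variables are aliases, and creating each one with its own DataFrame(...) call or with .copy() should separate them.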

2022-09-30 17:24
