def show_df(column, new_colnames=['distinguished'], dataframe=one):
    # Count each distinct value in the column
    df = dataframe[column].value_counts().reset_index()
    new_names = new_colnames
    new_names.append('counts')
    df.columns = new_names
    # Share of each value as a percentage
    df['weight'] = df['counts'].apply(lambda x: round(x / df['counts'].sum(), 2) * 100)
    return df
While doing EDA, I wanted to check the count and the proportion of each value in a DataFrame column, so I wrote the function above.
The function itself was created without problems.
I assumed the first column name would be the same for every variable, so I gave it a default value.
However, because of the append inside the function, 'counts' is added to new_colnames again on every call.
It ends up as new_colnames = ['distinguished', 'counts', 'counts', ...] and an error occurs.
I worked around it by writing the function in a different way, but I was curious about the cause of the error and the proper fix, so I am posting this question.
My guess was that the value was not being reset inside the function.
So I tried resetting it with del new_colnames and with new_colnames = [], but neither worked.
I would appreciate it if you could tell me why this situation is happening.
python user-defined-function
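Because the default value of new_colnames is a list, which is mutable, it is created only once when the function is defined and the same list is reused on every call. It therefore remembers the result of the previous call, much like a static variable in C.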
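The behavior can be reproduced without pandas. The sketch below uses two hypothetical helper functions (append_shared and append_fresh are illustrative names, not from the question) to contrast a mutable default with the common None-sentinel idiom. Note also that an assignment such as new_names = new_colnames only binds a second name to the same list; it does not copy it.

```python
def append_shared(item, names=[]):
    # The default list is built once, when the def statement runs,
    # and the same object is reused on every call that omits `names`.
    names.append(item)
    return names


def append_fresh(item, names=None):
    # None is used as a sentinel; a new list is built on each call,
    # and a caller-supplied list is copied rather than modified.
    names = [] if names is None else list(names)
    names.append(item)
    return names


first = append_shared("counts")
second = append_shared("counts")
print(second)           # ['counts', 'counts']
print(first is second)  # True: both names refer to the one default list

print(append_fresh("counts"))  # ['counts']
print(append_fresh("counts"))  # ['counts']
```

Applied to the question's function, the same idea would mean defaulting new_colnames to None (or copying the list before appending) so that each call starts from a clean list.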
Also, value_counts has a normalize parameter, so computing the proportion is simple.
>>> counts = pd.concat([df.num.value_counts(), df.num.value_counts(normalize=True)], axis=1)
>>> counts.columns = ['counts', 'counts_norm']
>>> counts
   counts  counts_norm
9      10         0.20
4       7         0.14
1       7         0.14
7       6         0.12
5       5         0.10
3       5         0.10
8       4         0.08
6       4         0.08
2       2         0.04
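Putting both points together, one possible rewrite of the question's function might look like the sketch below. It is only an illustration under some assumptions: the label parameter is a hypothetical replacement for the list default, the DataFrame is passed explicitly instead of defaulting to a global, and the column names follow the question.

```python
import pandas as pd


def show_df(column, dataframe, label="distinguished"):
    # Absolute counts per distinct value, most frequent first
    counts = dataframe[column].value_counts()
    # Relative frequencies, aligned to the same order as `counts`
    weights = dataframe[column].value_counts(normalize=True).reindex(counts.index)
    # Build the result with explicit column names, so nothing is
    # appended to a shared list between calls
    return pd.DataFrame({
        label: counts.index,
        "counts": counts.to_numpy(),
        "weight": (weights * 100).round(2).to_numpy(),
    })


sample = pd.DataFrame({"num": [1, 1, 1, 2]})
print(show_df("num", sample))
print(show_df("num", sample))  # a second call yields the same three columns
```

Because the column names are plain strings assembled inside the function, repeated calls cannot accumulate extra 'counts' entries.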