Question about dropna() in Python decision tree code. Experts welcome!

Asked 2 years ago, Updated 2 years ago, 69 views

When building a decision tree in Python, we need to compute the weighted entropy.

I don't understand the code that computes this entropy value, so I'm asking here.

Weighted_Entropy = np.sum([(counts[i] / np.sum(counts))
                           * entropy(data.where(data[feature] == vals[i]).dropna()[class])
                           for i in range(len(vals))])

My understanding of the entropy calculation is this: for the rows where feature equals vals[i], the entropy is a sum over the classes, built from the probability of each class among those rows.

I think my understanding is right, but I don't see why dropna() is used here, and I don't understand what the [class] right after it means syntactically. If my understanding of the concept is wrong, I'd welcome a correction as well.

If there's anyone who knows this well, please explain it!

python machine-learning

2022-09-20 19:46

1 Answer

In data.where(data[feature] == vals[i]), the data.where function keeps the values in the rows where the condition is true and replaces the values in the rows where the condition is false with NaN. That's why .dropna() is used: it removes the NaN rows from the result returned by data.where(data[feature] == vals[i]). That result still has multiple columns, so the [class] at the end selects only the class column from it. (In real code the column name must be quoted, as in ['class'] below, because class is a reserved word in Python.)

Running the code below and looking at its output should make this clear.

# -*- coding: utf-8 -*-

import pandas as pd

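# Small sample table
A = [[101, 'a', 'z'], [102, 'b', 'y'], [103, 'c', 'x'], [104, 'd', 'w']]
data = pd.DataFrame(A)
data.columns = ['number', 'class', 'school']

print('=========')
# The full table
print(data)
print('=========')
# Rows where number != 101 are replaced with NaN
print(data.where(data['number'] == 101))
print('=========')
# dropna() removes the NaN rows, leaving only the matching row
print(data.where(data['number'] == 101).dropna())
print('=========')
# ['class'] keeps only the 'class' column of that result
print(data.where(data['number'] == 101).dropna()['class'])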
print('=========')
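
To connect this back to the line in the question, here is a minimal sketch of the whole weighted-entropy calculation. The entropy function, the weighted_entropy wrapper, and the toy table are assumptions made for illustration; the original post does not show how entropy is defined, so a base-2 Shannon entropy is assumed here. The last few lines also compare the where().dropna() selection with plain boolean indexing, which picks the same rows in this example.

```python
import numpy as np
import pandas as pd


def entropy(labels):
    """Shannon entropy (base 2) of a Series of class labels (assumed definition)."""
    # Relative frequency of each distinct label; only observed labels appear,
    # so every probability is greater than zero.
    probs = labels.value_counts(normalize=True).to_numpy()
    return float(-np.sum(probs * np.log2(probs)))


def weighted_entropy(data, feature, target):
    """Entropy of `target` within each value of `feature`, weighted by subset size."""
    vals, counts = np.unique(data[feature].to_numpy(), return_counts=True)
    total = 0.0
    for val, count in zip(vals, counts):
        # Same selection as in the question: where() turns non-matching rows
        # into NaN, dropna() removes them, [target] keeps the label column.
        subset = data.where(data[feature] == val).dropna()[target]
        total += (count / counts.sum()) * entropy(subset)
    return total


# Hypothetical toy data, not from the original post.
toy = pd.DataFrame({
    "weather": ["sunny", "sunny", "rainy", "rainy"],
    "class": ["yes", "no", "yes", "yes"],
})

result = weighted_entropy(toy, "weather", "class")
print(result)  # 0.5: the "sunny" half has entropy 1, the "rainy" half has entropy 0

# Boolean indexing selects the same labels without going through NaN.
mask_labels = toy.loc[toy["weather"] == "sunny", "class"]
where_labels = toy.where(toy["weather"] == "sunny").dropna()["class"]
print(mask_labels.tolist() == where_labels.tolist())
```

One thing to keep in mind: dropna() removes every row that contains any NaN, so if the original data already has missing values in other columns, the where().dropna() form would drop those rows too, while the boolean-indexing form would keep them.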


2022-09-20 19:46



© 2024 OneMinuteCode. All rights reserved.