Question about grid search and cross-validation with Python's XGBoost.


Python: Try XGBoost

This question is about the "Visualize the Learning Process" section of the page above.

The sample code from that page is:

# Dictionary for recording the learning process
evals_result = {}
print('@A', evals_result)
bst = xgb.train(xgb_params,
                dtrain,  # dtrain and evals are defined earlier on the page
                num_boost_round = 1000,  # keep increasing the number of rounds
                evals = evals,
                evals_result = evals_result,
                )

print('@B', evals_result)

If I print at @A, evals_result is, naturally, empty.
However, if I print at @B, a lot of data has been assigned to evals_result.

What is assigned to evals_result, and where?

I'm trying to follow the code line by line, because I don't understand how the page's "# Plot the learning process as a line graph" step and everything after it produces its output.

First of all, I would like an explanation of evals_result.
Thank you in advance.

python

2022-09-30 19:54

2 Answers

"I didn't assign anything to evals_result, yet by @B something has been assigned to it. First of all, I don't understand that."

Nothing was explicitly assigned to evals_result (it starts as an empty dictionary); rather, elements (the losses on the training and validation data) are added to it while xgb.train() is running.
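
To make this concrete, here is a minimal self-contained sketch; the toy data and the names dtrain/dvalid are made up for illustration and stand in for the page's dataset.

import numpy as np
import xgboost as xgb

# Toy data standing in for the page's dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
dtrain = xgb.DMatrix(X[:150], label=y[:150])
dvalid = xgb.DMatrix(X[150:], label=y[150:])

evals_result = {}   # empty before training (corresponds to @A)
bst = xgb.train({'objective': 'binary:logistic', 'eval_metric': 'logloss'},
                dtrain,
                num_boost_round=10,
                evals=[(dtrain, 'train'), (dvalid, 'eval')],
                evals_result=evals_result)

# After training (corresponds to @B): one logloss value per round
print(evals_result['train']['logloss'])
print(evals_result['eval']['logloss'])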

So xgb.train() appears to fill the caller's evals_result through the evals_result argument of train()?

Yes, that's right. Elements are added to the dictionary passed as the argument while xgb.train() is being processed. Perhaps you are not familiar with "pass by reference"; try searching for that keyword.
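
A minimal sketch of this behavior, with a made-up function fill() standing in for xgb.train(). Strictly speaking, Python passes object references, so mutations of a dict inside a function are visible to the caller.

def fill(d):
    # mutate the dictionary that was passed in
    d['train'] = {'logloss': [0.5, 0.4, 0.3]}

result = {}    # like evals_result = {}
print(result)  # {}  <- corresponds to @A
fill(result)   # xgb.train() mutates its argument the same way
print(result)  # {'train': {'logloss': [0.5, 0.4, 0.3]}}  <- corresponds to @B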

Also, if xgb.train assigns something to evals_result, then the question arises again: why is evals_result = {} needed?

Because an undefined variable cannot be passed as an argument.

Also, I don't understand what "dictionary for recording the learning process" refers to. Sorry.

If you look at the page you are referring to, it plots the losses on the training and validation data retrieved from this dictionary. From the plot you can read that beyond roughly 100 rounds, accuracy no longer improves even as training continues.
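
For reference, that plotting step presumably looks something like the following sketch, assuming evals_result was filled as in the example above; the exact metric key depends on eval_metric.

import matplotlib.pyplot as plt

train_loss = evals_result['train']['logloss']
eval_loss = evals_result['eval']['logloss']
rounds = range(1, len(train_loss) + 1)
plt.plot(rounds, train_loss, label='train')
plt.plot(rounds, eval_loss, label='eval')
plt.xlabel('boosting round')
plt.ylabel('logloss')
plt.legend()
plt.show()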

I see. But where and what was assigned to evals_result? I still don't really understand that.

As I mentioned earlier.

The reason evals_result is filled through an argument rather than being the return value of xgb.train() is probably for consistency with other similar methods (just my guess...).


2022-09-30 19:54

@Kohei TAMURA-san As a supplement to your answer (perhaps superfluous), and as a way of thinking for future learning: try to find out which API you are using and look up its specification.

The API specification is listed below.

xgboost.train(params, dtrain, num_boost_round=10, evals=(), obj=None, feval=None, maximize=False, early_stopping_rounds=None, evals_result=None, verbose_eval=True, xgb_model=None, callbacks=None)

evals_result (dict) –
This dictionary stores the evaluation results of all the items in watchlist.
Example: with a watchlist containing [(dtest, 'eval'), (dtrain, 'train')] and a parameter containing {'eval_metric': 'logloss'}, evals_result returns a dictionary of the form {'eval': {'logloss': [...]}, 'train': {'logloss': [...]}}.

In other words, the answer to the question "what is assigned to evals_result?" is: the evaluation results are stored into it by the train() method.

Combining the description above, the page referenced in the question, and the following article, the watchlist is the list of (DMatrix, name) tuples passed as the evals parameter:
Access train and evaluation error in xgboost
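
Concretely, the names given in evals become the top-level keys of evals_result (reusing the hypothetical dtrain/dvalid names from the sketch in the other answer):

evals = [(dtrain, 'train'), (dvalid, 'eval')]   # the "watchlist"
# after xgb.train(..., evals=evals, evals_result=evals_result):
# evals_result == {'train': {'logloss': [...]}, 'eval': {'logloss': [...]}}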

As for the questions in the comments, "Why is evals_result={} needed?" and "What is the dictionary for recording the learning process?": it is one way to save intermediate results, and, given the early_stopping_rounds parameter in the API specification, it is plausible that it is passed by reference so the results can be reused, instead of being returned as a fresh value on every call.

If that guess is correct, evals_result = {} simply puts it into a clean initial state.


2022-09-30 19:54


