Is Keras validation_split cross-validation?

Asked 2 years ago, Updated 2 years ago, 381 views

I'm studying Python on my own.

Given the following Keras code:

model.fit(x_train,y_train,verbose=1,validation_split=0.2,shuffle=True,epochs=20000)

I understand that training runs for 20,000 epochs, with 80% of the data used for training and 20% used for validation.

Because shuffle=True is set, I assume the validation data is selected at random, which would make this code cross-validation, specifically 5-fold (k=5) cross-validation. Is that correct?
When I looked up Keras code for k-fold cross-validation, I became unsure whether my understanding was correct, so I am asking here.

I apologize for the basic question, but I would appreciate any help. Thank you.

python deep-learning keras

2022-09-30 21:58

1 Answer

No, this is not cross-validation.

validation_split is an argument that sets aside the specified fraction of the data for validation before the data set is shuffled. The validation data is taken from the end of the data set.[1][2] Therefore, every epoch uses the same validation data, and the selection is not random.
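The slicing described above can be illustrated with a small NumPy sketch. This is not the Keras source; `tail_split` is a hypothetical helper that only mimics the documented behavior (hold out the last fraction of the samples, with no shuffling beforehand).

```python
import numpy as np


def tail_split(x, y, validation_split):
    """Hold out the last fraction of the samples, without shuffling.

    Hypothetical helper that imitates the documented behavior of
    validation_split; it is not the actual Keras implementation.
    """
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])


# Ten samples whose labels equal their position in the data set.
x = np.arange(10).reshape(10, 1)
y = np.arange(10)

(x_tr, y_tr), (x_val, y_val) = tail_split(x, y, validation_split=0.2)
print("training labels:  ", y_tr)
print("validation labels:", y_val)  # always the last two samples
```

If the data set is sorted (for example, by class), the held-out tail may not be representative, so shuffling the arrays once yourself before calling model.fit is a common precaution.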

shuffle is an argument that shuffles the training data. It is provided because shuffling the training data every epoch is known to reduce the loss.[3][4]
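To make the interaction of the two arguments concrete, here is a minimal simulation, assuming the behavior described in this answer: the validation indices are fixed once, and only the order of the training indices changes per epoch. `simulate_epochs` and its parameters are made up for illustration and are not part of Keras.

```python
import numpy as np


def simulate_epochs(n_samples, validation_split, epochs, seed=0):
    """Return (training order, validation indices) for each epoch.

    Illustrative only: imitates shuffle=True combined with
    validation_split, without training any model.
    """
    rng = np.random.default_rng(seed)
    split_at = int(n_samples * (1.0 - validation_split))
    train_idx = np.arange(split_at)            # first 80% of the data
    val_idx = np.arange(split_at, n_samples)   # last 20%, fixed for all epochs

    history = []
    for _ in range(epochs):
        order = rng.permutation(train_idx)     # reshuffled every epoch
        history.append((order, val_idx))
    return history


for epoch, (order, val_idx) in enumerate(simulate_epochs(10, 0.2, epochs=3)):
    print(f"epoch {epoch}: train order={order}, validation={val_idx}")
```

The training order differs from epoch to epoch, but the validation indices never change, which is why this setup is a single hold-out split rather than cross-validation.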

The Keras documentation describes these model.fit arguments as follows:

  • validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. [omitted] The validation data is taken from the end of the given x and y data, before shuffling.
  • shuffle: Boolean (whether to shuffle the training data before each epoch) or string (for 'batch'). [omitted]

Keras documentation (https://keras.io/ja/models/model/), last accessed 2022-01-16. Square brackets and emphasis were added by the respondent.

Keras does not provide k-fold cross-validation. As you pointed out, you need to use another library, such as scikit-learn.
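As a rough sketch of how that could look, the following uses scikit-learn's KFold to generate the folds and trains a fresh model on each one. `cross_validate` and `build_model` are hypothetical names: `build_model` is assumed to be a function you write that returns a newly compiled Keras model, so that weights are not carried over between folds.

```python
import numpy as np
from sklearn.model_selection import KFold


def cross_validate(build_model, x, y, n_splits=5, epochs=10, seed=0):
    """Run k-fold cross-validation and return one score per fold.

    build_model is a user-supplied factory (an assumption of this
    sketch) returning a new object with Keras-style fit/evaluate.
    """
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in kfold.split(x):
        model = build_model()  # fresh, untrained model for every fold
        model.fit(x[train_idx], y[train_idx], epochs=epochs, verbose=0)
        # Each sample is used for validation in exactly one fold.
        scores.append(model.evaluate(x[val_idx], y[val_idx], verbose=0))
    return scores
```

With n_splits=5, each fold trains on 80% of the data and validates on the remaining 20%, the same proportions as validation_split=0.2, but repeated five times with a different validation part each time. Averaging the returned scores (for example with np.mean, when evaluate returns a single loss value) gives the cross-validated estimate.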

[1] The relevant implementation is the definition of model.fit (GitHub). It shows that validation_split is applied before shuffle.
[2] Likewise, see the implementation of validation_split (GitHub).
[3] Examples of experiments with and without shuffling: https://qiita.com/hikobotch/items/d8ff5bebcf70083de089
[4] Explanation of why shuffling reduces the loss: https://stats.stackexchange.com/a/311318



2022-09-30 21:58



© 2024 OneMinuteCode. All rights reserved.