Understanding the dropout argument of Chainer's NStepBiLSTM

Asked 2 years ago, Updated 2 years ago, 78 views

Chainer's NStepBiLSTM takes a dropout rate argument.

Is the following reading correct?

0 → no dropout at all
1 → everything becomes 0

When I set dropout=0, training did not go well, so I wondered whether the meaning might be the other way around, which is why I am asking.

The code is below; the rate is passed as dropout=args.dropout.

L.NStepBiLSTM(n_layers=args.layers, in_size=100,
              out_size=args.hidden_size, dropout=args.dropout)

Thank you for your cooperation.

python3 chainer

2022-09-30 21:32

1 Answer

https://github.com/chainer/chainer/blob/master/chainer/functions/noise/dropout.py

The ratio argument of the dropout function in the Chainer source on GitHub (linked above) defaults to 0.5, so whether you read it as "50% is dropped" or "50% is kept", nothing contradicts you, and I often forget which one it means...

The constructor argument of the Dropout class in the same source is named dropout_ratio, so the name itself makes it easy to remember that it is the fraction to drop.
(The ratio of the dropout function is passed directly to the dropout_ratio of the Dropout class.)

In fact, the source contains these two lines:

scale=x[0].dtype.type(1./(1-self.dropout_ratio))
flag = numpy.random.rand(*x[0].shape)>=self.dropout_ratio

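Either line tells you that the value is the fraction to drop: an element is kept only when its random draw is at least dropout_ratio, and the kept elements are scaled by 1/(1 - dropout_ratio).
(Or maybe it is not that obvious? It is readable once you include the lines around them...)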
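To make the two lines above concrete, here is a minimal pure-Python sketch of the same masking logic. It is not Chainer's code: the function name dropout_sketch, the use of the standard random module instead of NumPy, and the explicit rejection of a ratio of 1 (where the scale would divide by zero) are all assumptions made for illustration.

```python
import random


def dropout_sketch(xs, dropout_ratio, rng):
    """Inverted dropout on a list of floats (illustrative, not Chainer's code)."""
    if not 0.0 <= dropout_ratio < 1.0:
        # At exactly 1 the scale below would divide by zero.
        raise ValueError("dropout_ratio must be in [0, 1)")
    # Surviving elements are scaled up so the expected value is unchanged.
    scale = 1.0 / (1.0 - dropout_ratio)
    # An element is kept when its random draw is >= dropout_ratio,
    # so dropout_ratio is the probability of being dropped.
    return [x * scale if rng.random() >= dropout_ratio else 0.0 for x in xs]


rng = random.Random(0)
xs = [1.0] * 10000

kept_all = dropout_sketch(xs, 0.0, rng)   # ratio 0: nothing is dropped
half = dropout_sketch(xs, 0.5, rng)       # ratio 0.5: about half become 0
zero_fraction = half.count(0.0) / len(half)
```

With a ratio of 0 the input comes back unchanged, and with 0.5 roughly half of the elements are zeroed while the rest are doubled, which matches the "fraction to drop" reading.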

https://github.com/chainer/chainer/blob/master/chainer/functions/connection/n_step_rnn.py

The dropout constructor argument of NStepBiLSTM goes through the constructor argument of NStepRNNBase, is then passed directly to the dropout_ratio argument of the n_step_rnn function in the Chainer source on GitHub (linked above), and from there is passed directly to the dropout function mentioned earlier.

I was just passing by, and the question seems to be about two months old, but nobody else had answered, so I did.

You wrote: "When I set dropout=0, it didn't work well, so I thought it might be the other way around."

It is not reversed. If anything, dropout=1 would throw everything away, and the network should not work at all.

The dropout rate is usually 0.5, but it is sometimes set to about 0.2 near the input and to around 0.7 in some deep layers.
(However, the n_step_rnn family seems to accept only a single setting...)

If you want a safe value without examining anything, I think 0.5 is a good choice, but it is worth trying a range of values.


2022-09-30 21:32
