Understanding How Python Implements the Batch Affine Layer (Backpropagation)

Asked 2 years ago, Updated 2 years ago, 82 views

I have a question about the implementation of the batch Affine layer in Python (reference book: Deep Learning from Scratch, pp. 150-152).


It concerns the bias in the batch Affine layer; specifically, I would like help understanding the following explanation and source code from the reference book.

Be careful with the addition of the bias. In forward propagation, the bias is added to each individual piece of data (the first, the second, ...). Therefore, in backpropagation, the backpropagated values from each piece of data must be summed into the corresponding elements of the bias.

    db = np.sum(dY, axis=0)
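
To make the setup concrete, here is a minimal sketch of the forward-pass addition (the shapes and numbers are my own example, in the spirit of the book's):

    import numpy as np

    X_dot_W = np.array([[0.0, 0.0, 0.0],
                        [10.0, 10.0, 10.0]])  # np.dot(X, W) for a batch of N=2
    B = np.array([1.0, 2.0, 3.0])             # one bias per output element

    Y = X_dot_W + B  # B is added to every row, i.e. to each piece of data
    print(Y)
    # [[ 1.  2.  3.]
    #  [11. 12. 13.]]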

I don't understand why the sum of the dY values is taken down the columns here.
"In the calculation graph of the error inverse propagation method, we understand that the ""+"" node passes the value from before to the lower node as it is."Therefore, I thought it would be natural to hand over the dY directly to the lower node.

Why does the batch version pass the sum of dY over the N rows of data to the downstream node?
Please let me know. Thank you in advance.

python deep-learning

2022-09-30 16:04

1 Answer

In the batch case, during the forward pass, b is automatically broadcast to match the batch size and added to the other term, np.dot(x, W).
As a result, what sits directly below the "+" node is not b itself but b repeated N times, and backpropagating through that repetition means summing the upstream gradient over the N rows.
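
A rough sketch of what that broadcasting does (the shapes here are my own example, not from the answer):

    import numpy as np

    N, D, M = 2, 4, 3                   # batch size, input dim, output dim (made up)
    x = np.random.randn(N, D)
    W = np.random.randn(D, M)
    b = np.array([1.0, 2.0, 3.0])       # shape (M,)

    # In np.dot(x, W) + b, b (shape (M,)) is broadcast to shape (N, M),
    # so the "+" node effectively receives b repeated N times.
    b_repeated = np.broadcast_to(b, (N, M))
    print(np.allclose(np.dot(x, W) + b, np.dot(x, W) + b_repeated))  # True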

You can regard the bias term as an input of 1 with weight b. If you view the forward pass as that multiplication, then the backpropagation db should take the same form as dW. So

    self.db = np.sum(dY, axis=0)

has the same form as the dW calculation, dW = np.dot(x.T, dY); that is, you can read it as

    self.db = np.dot(np.ones(N), dY)  # np.ones(N): a vector of 1s, N = batch size

Why not think of it that way?
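
You can verify that equivalence directly (a minimal sketch; the dY values are made up):

    import numpy as np

    N = 2
    dY = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])    # upstream gradient, shape (N, M)

    # Treating the bias input as a vector of 1s, its gradient is
    # np.dot(ones, dY), which is exactly the column-wise sum of dY.
    ones = np.ones(N)
    db_dot = np.dot(ones, dY)
    db_sum = np.sum(dY, axis=0)         # the book's implementation
    print(np.allclose(db_dot, db_sum))  # True -> both are [5. 7. 9.]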

For your reference, the source code can be found at https://github.com/oreilly-japan/deep-learning-from-scratch/blob/master/common/layers.py.


2022-09-30 16:04


