Why is reconstruction accuracy limited even when the number of hidden-layer nodes in an autoencoder is increased?


I tried to improve the reconstruction accuracy of the output by increasing the number of hidden-layer nodes in an autoencoder, but it didn't work, so I would like to ask about it.

I understand that an autoencoder with, for example, a 784->32->784 structure yields feature information compressed into 32 dimensions.

So I thought that if a 784->32->784 autoencoder can reconstruct 784 dimensions of information from 32, then a 784->784->784 network should do even better; in fact, it should be able to learn an identity mapping and restore the input almost without loss. However, the output was not as clear as I expected. (On the original MNIST it is fairly clear.)

(Image: Fashion-MNIST reconstruction results)

Of course, I understand there is little point in an autoencoder unless the middle layer has fewer nodes than the input/output layers, but I would like to know why increasing the number of middle-layer nodes does not let the network learn an identity mapping accurately.

As the Fashion-MNIST example above shows, even a network that does not reduce dimensionality fails to preserve the information well (it seems to corrupt it), so does that mean that when it is used for dimensionality reduction, the features it retains are also inaccurate?

In principle, a hidden layer of 784 or more dimensions should be enough to reproduce 784 dimensions of information, so I would appreciate any explanation of why it is not accurate, along with a better implementation and reference materials.

Addendum: here is the full code (Python 3 & Keras) used to generate the image above.

from keras.datasets import fashion_mnist
from keras.layers import Input, Dense
from keras.models import Model

import matplotlib.pyplot as plt

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype("float32")
x_test = x_test.astype("float32")
x_train /= 255.0   # scale pixel values to [0, 1]
x_test /= 255.0

input_img = Input(shape=(784,))
encoded = Dense(784, activation="relu")(input_img)
decoded = Dense(784, activation="sigmoid")(encoded)

# Keras 2 uses inputs=/outputs= (the older input=/output= keywords are deprecated)
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x_train, x_train,
                epochs=5,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test))

# Reconstruct the test images
decoded_imgs = autoencoder.predict(x_test)

# Number of images to display
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    # Show the original test image
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Show the reconstructed image
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
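
For comparison, the 784->32->784 version mentioned at the start of the question would differ only in the encoder layer. A minimal sketch, reusing the imports and data preparation above:

input_img = Input(shape=(784,))
encoded = Dense(32, activation="relu")(input_img)    # 32-dimensional bottleneck
decoded = Dense(784, activation="sigmoid")(encoded)
autoencoder = Model(inputs=input_img, outputs=decoded)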

Tags: python, machine-learning

2022-09-30 17:46

1 Answer

Sorry, I solved it myself. It turns out I had simply chosen a loss function (binary cross-entropy) that is not well suited to real-valued image data.
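
To see why binary cross-entropy is a poor fit here, a quick NumPy check: for a mid-gray target pixel, the per-pixel BCE never reaches zero even at a perfect prediction, whereas squared error does, so the loss has a nonzero floor on real-valued images.

import numpy as np

# Per-pixel binary cross-entropy for target t and prediction p
def bce(t, p):
    return -(t * np.log(p) + (1 - t) * np.log(1 - p))

t = 0.5                     # a mid-gray pixel, common in Fashion-MNIST
print(bce(t, 0.5))          # ~0.693: the minimum over p, but not zero
print((t - 0.5) ** 2)       # squared error at the same prediction: exactly 0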

Once I switched the loss function to a more appropriate one (MSLE, for example) and trained for enough epochs (at least about 50), the input could be reconstructed with high accuracy.
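
In the code from the question, only the compile and fit calls need to change; roughly:

autoencoder.compile(optimizer="adam", loss="msle")  # or "mse"
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test))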

(Image: fashion-mnist-2, reconstructions after the fix)


2022-09-30 17:46


