I tried to improve the reconstruction accuracy of an autoencoder's output by increasing the number of nodes in the middle layer, but it didn't work, so I would like to ask a question.
I understand that an autoencoder with, say, a 784->32->784 structure yields feature information compressed into 32 dimensions.
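For concreteness, the bottleneck version I have in mind looks roughly like this (a minimal sketch using the same Keras API as the full code further down; the variable names are just for illustration):

from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(784,))
code = Dense(32, activation="relu")(inp)      # 32-dimensional bottleneck
out = Dense(784, activation="sigmoid")(code)  # reconstruct the 784 pixels
bottleneck_ae = Model(inputs=inp, outputs=out)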
So I reasoned that if a 784->32->784 autoencoder can reproduce 784 dimensions of information from just 32, then a 784->784->784 network should do even better; in fact, it should be able to learn an identity mapping and restore the input almost losslessly. However, the output was not as clear as I expected (the original MNIST images are quite clear).
Of course, I understand that there is little point in using an autoencoder unless the middle layer has fewer nodes than the input/output layers, but I would like to know why increasing the number of nodes in the middle layer does not let the network accurately learn the identity mapping.
As the Fashion-MNIST example above shows, even a network that does not reduce dimensionality fails to preserve the information well (does it destroy it?). Does that mean that when such a network is used for dimensionality reduction, the features it retains are not accurate either?
In principle, 784 or more hidden dimensions should be enough to reproduce 784 dimensions of information, so I would appreciate an answer from anyone who knows why the reconstruction is inaccurate, along with a better implementation and any reference material.
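To illustrate the "in principle enough" part: if the activations were linear, a 784->784->784 network could represent the identity map exactly, for example with both weight matrices set to the identity. A toy NumPy check (my own sketch, not part of the experiment below):

import numpy as np

x = np.random.rand(5, 784).astype("float32")  # dummy batch of 5 inputs
W1 = np.eye(784, dtype="float32")             # hidden-layer weights = identity
W2 = np.eye(784, dtype="float32")             # output-layer weights = identity
reconstruction = (x @ W1) @ W2                # linear forward pass, no activation
print(np.allclose(reconstruction, x))         # True: lossless reconstruction exists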
Additional note: here is the full code (Python 3 & Keras) used to generate the image above.
from keras.datasets import fashion_mnist
from keras.layers import Input, Dense
from keras.models import Model
import matplotlib.pyplot as plt
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype("float32")
x_test = x_test.astype("float32")
x_train /= 255.0  # scale pixel values to [0, 1]
x_test /= 255.0
input_img = Input(shape=(784,))
encoded = Dense(784, activation="relu")(input_img)   # middle layer: same 784 width as the input
decoded = Dense(784, activation="sigmoid")(encoded)  # output layer reconstructs the pixels
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x_train, x_train,
                epochs=5,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test))
# Reconstruct the test images
decoded_imgs = autoencoder.predict(x_test)
# Number of images to display
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    # top row: original test image
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # bottom row: reconstructed image
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()