Learning the KNN algorithm in Python: classifying handwritten digits.
I'm in college, but this assignment is too difficult and I don't know anyone I could ask, so I'm posting here urgently. Sorry for the sudden question. I need to fill in the code for the parts marked with ###, but I'm not sure my attempts are right. I'm asking because I was frustrated that I couldn't find a correct answer anywhere. I would appreciate it if you could include an explanation as well.
This exercise uses the KNN algorithm to classify the MNIST dataset, which stores 70,000 handwritten digits, one digit per 28x28 grayscale image. Complete the empty code cells, run them, and display the results.
NumPy is required for matrix processing, pyplot for plotting, random for generating random numbers, mnist for loading the MNIST handwritten digit dataset, KNeighborsClassifier for the KNN classifier, metrics for evaluating performance, and pandas for convenient handling of matrix data.
import numpy as np # advanced math library
import matplotlib.pyplot as plt # MATLAB like plotting routines
import random # for generating random numbers
from keras.datasets import mnist # MNIST dataset is included in Keras
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics
import pandas as pd
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print("X_train shape", X_train.shape)
print("y_train shape", y_train.shape)
print("X_test shape", X_test.shape)
print("y_test shape", y_test.shape)
y_train_df = pd.DataFrame(y_train)
print(y_train_df.value_counts())
y_test_df = pd.DataFrame(y_test)
print(y_test_df.value_counts())
plt.rcParams['figure.figsize'] = (9,9) # Make the figures a bit bigger
for i in range(9):
    plt.subplot(3,3,i+1)
    num = random.randrange(len(X_train)) # randint(0, len(X_train)) includes len(X_train), which is out of range
    plt.imshow(X_train[num], cmap='gray', interpolation='none')
    plt.title("Class {}".format(y_train[num]))
plt.tight_layout()
# just a little function for pretty printing a matrix
def matprint(mat, fmt="g"):
    col_maxes = [max([len(("{:"+fmt+"}").format(x)) for x in col]) for col in mat.T]
    for x in mat:
        for i, y in enumerate(x):
            print(("{:"+str(col_maxes[i])+fmt+"}").format(y), end=" ")
        print("")
# now print!
matprint(X_train[num])
X_train = X_train.reshape(len(X_train), -1) ### one row per image: 60000 x 784 (reshape(1,-1) would merge all images into a single row)
X_test = X_test.reshape(len(X_test), -1) ### one row per image: 10000 x 784
# Each handwritten image is a two-dimensional matrix of 28 rows and 28 columns. To feed it to the KNN classifier,
# it is converted into a one-dimensional vector: the 28 rows are laid end to end
# to form a vector of 28x28=784 elements. Applying this to all 70,000 images
# gives a training input data matrix of 60,000 rows by 784 columns and
# a test input data matrix of 10,000 rows by 784 columns. (Hint: Use the reshape method)
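The flattening described in the comments above can be checked on a small stand-in array. This is only a sketch under an assumption: `images` is a hypothetical array of 5 fake 28x28 images, not the MNIST data. Passing -1 lets NumPy infer the 784 columns from the remaining dimensions.

```python
import numpy as np

# Hypothetical stand-in for X_train: 5 fake images of 28x28 pixels (not real MNIST data)
images = np.arange(5 * 28 * 28).reshape(5, 28, 28)

# Keep one row per image; -1 lets NumPy infer the 28*28 = 784 columns
flat = images.reshape(len(images), -1)
print(flat.shape)     # (5, 784)

# reshape(1, -1) would instead merge every image into one single row
one_row = images.reshape(1, -1)
print(one_row.shape)  # (1, 3920)

# Row i of flat is image i with its 28 rows laid end to end
print(np.array_equal(flat[2].reshape(28, 28), images[2]))  # True
```

A classifier that expects one sample per row therefore needs the first form; the second form would look like a single sample with 3,920 features.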
print("Size of training data matrix (row size, column size):", X_train.shape)
print("Size of test data matrix (row size, column size):", X_test.shape)
mnist
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train) # fit takes the training inputs and the training labels, not the test inputs
# Create a KNN classifier and train it with the training input data and the training labels.
# I don't know if this is right, either.
# Feed the test input data into the trained KNN classifier to predict which digit each of the
# 10,000 handwritten images represents, and store the predicted values.
# Compare the test labels with the predicted values to obtain accuracy, precision, recall, and F1 score,
# and evaluate the performance.
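The train, predict, and evaluate steps described in the comments above might look like the following sketch. To keep it small and runnable offline, it assumes scikit-learn's bundled 8x8 digits dataset as a stand-in for MNIST, and every `demo_` name is hypothetical. The `average="macro"` setting is one possible choice for multi-class precision, recall, and F1, not something the assignment specifies.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

# Assumption: the bundled 8x8 digits dataset stands in for MNIST
digits = load_digits()
demo_X = digits.images.reshape(len(digits.images), -1)  # one row of 64 pixels per image
demo_y = digits.target

demo_X_train, demo_X_test, demo_y_train, demo_y_test = train_test_split(
    demo_X, demo_y, test_size=0.25, random_state=0)

demo_knn = KNeighborsClassifier(n_neighbors=3)
demo_knn.fit(demo_X_train, demo_y_train)      # training inputs paired with training labels
demo_y_pred = demo_knn.predict(demo_X_test)   # one predicted digit per test image

# Multi-class precision/recall/F1 need an averaging mode; macro averaging is assumed here
print("accuracy :", metrics.accuracy_score(demo_y_test, demo_y_pred))
print("precision:", metrics.precision_score(demo_y_test, demo_y_pred, average="macro"))
print("recall   :", metrics.recall_score(demo_y_test, demo_y_pred, average="macro"))
print("f1 score :", metrics.f1_score(demo_y_test, demo_y_pred, average="macro"))
```

With the MNIST arrays from this exercise, the same pattern would be `knn.fit(X_train, y_train)` followed by `y_pred = knn.predict(X_test)`, provided `X_train` and `X_test` each have one 784-element row per image. Predicting 10,000 images against 60,000 training rows may take a while.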
plt.rcParams['figure.figsize'] = (9,9) # Make the figures a bit bigger
incorrect_index = []
for i in range(10000):
    if y_test[i] != y_pred[i]:
        incorrect_index.append(i)
print('Index of data predicted incorrectly among 10000 test data: {}'.format(incorrect_index))
for i in range(9):
    plt.subplot(3,3,i+1)
    num = random.randrange(len(incorrect_index)) # randint(0, len(...)) could pick a position past the end of the list
    plt.imshow(X_test[incorrect_index[num]].reshape(28,28), cmap='gray', interpolation='none')
    plt.title("Class {}".format(y_pred[incorrect_index[num]]))
plt.tight_layout()
# Randomly select 9 of the incorrectly predicted samples among the 10,000 test data,
# and display each handwritten image together with its predicted value.
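As an alternative way to collect and sample the misclassified items, the sketch below uses `np.flatnonzero` and `random.sample`. The two small arrays are hypothetical stand-ins for `y_test` and `y_pred`, made up purely for illustration.

```python
import random
import numpy as np

# Hypothetical labels and predictions standing in for y_test and y_pred
demo_true = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6])
demo_pred = np.array([7, 2, 1, 0, 9, 1, 4, 4, 5, 9, 6, 6])

# Positions where the prediction differs from the label
wrong = np.flatnonzero(demo_true != demo_pred).tolist()
print(wrong)  # [4, 7, 10]

# Pick up to 9 distinct misclassified positions; sample() never repeats an index
chosen = random.sample(wrong, k=min(9, len(wrong)))
for idx in chosen:
    print("index", idx, "label", demo_true[idx], "predicted", demo_pred[idx])
```

Unlike drawing a random position nine times, sampling without replacement avoids showing the same image twice, and the `min(9, ...)` guard covers the case where fewer than nine predictions are wrong.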