Kenneth Griffin PhD, PE - Image Classification - Fashion-MNIST Dataset - Neural Network vs. Convolutional Neural Network

Methods: Python

Data Analysis: Pandas, NumPy

Data Visualization: Matplotlib, Pydot, Graphviz

Data Modeling: Keras, TensorFlow, NN, CNN

Web Development: HTML, CSS

Image Classification with the Fashion-MNIST Dataset - Neural Network vs. Convolutional Neural Network

This exercise demonstrates image classification on the Fashion-MNIST dataset using a TensorFlow Keras neural network (NN) and a convolutional neural network (ConvNet or CNN). The code is provided in the Python Jupyter notebook below.
The Fashion-MNIST Dataset Github notes, "Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes."
The dataset is loaded from Keras in this exercise.
The neural network reached an accuracy of ~88%, versus ~91% for the convolutional neural network. The state-of-the-art Fashion-MNIST model accuracy is approximately 96.91% according to the SOTA leaderboard.

Tables and graphs can be found here: Image Classification with the Fashion-MNIST Dataset - Neural Network vs. Convolutional Neural Network Tensorflow Keras Python Jupyter Notebook.

Results

-The full dataset contains 60,000 examples in the training set and 10,000 examples in the test set. Each example is a 28x28 grayscale image classified into one of ten classes.

-The ten classes are t-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and ankle boot.
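As a small illustration of the class list above, integer labels can be mapped to class names. The order below follows the Fashion-MNIST label numbering (0 through 9). The names `CLASS_NAMES` and `label_to_name` are hypothetical helpers for this sketch and are not taken from the notebook.

```python
# Fashion-MNIST labels are integers 0-9; index i holds the name for label i.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label: int) -> str:
    """Return the class name for an integer label (hypothetical helper)."""
    if not 0 <= label < len(CLASS_NAMES):
        raise ValueError(f"label must be in 0..{len(CLASS_NAMES) - 1}, got {label}")
    return CLASS_NAMES[label]


print(label_to_name(7))
```

A lookup like this is mainly useful when labeling plots or reading a confusion matrix, since the models themselves work with the integer labels.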

-The raw data for each image consists of integer pixel values from 0 (black) to 255 (white). If the images were in color, each pixel would instead hold three values, one each for red, green, and blue (RGB).
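The pixel range described above is often rescaled before training. The sketch below assumes that step (the write-up does not say whether the notebook does this) and uses random synthetic data in place of the real images so that it runs without downloading the dataset.

```python
import numpy as np

# Synthetic stand-in for a batch of grayscale images: uint8 values in 0..255.
# A color batch would carry an extra axis of length 3 for the RGB channels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Rescale to floats in [0, 1]; an assumed, commonly used preprocessing step.
scaled = images.astype("float32") / 255.0

# Flatten each 28x28 image into a 784-value vector, as a dense network expects.
flat = scaled.reshape(len(scaled), -1)

print(images.shape, scaled.dtype, flat.shape)
```

Keeping inputs in a small, consistent range generally helps gradient-based optimizers such as Adam converge more smoothly.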

-A neural network was built in TensorFlow Keras and reached an accuracy of approximately 88%. The model has a single hidden layer of 256 ReLU neurons. Because this is a multi-class classification problem, the output layer must predict the probability that an image belongs to each of the 10 categories. A softmax activation function provides a probability (between 0 and 1) for each class, and the probabilities sum to 1.
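To make the layer description above concrete, here is a minimal NumPy sketch of one forward pass through a network of that shape: 784 inputs, 256 ReLU units, and a 10-way softmax. It is not the notebook's Keras code. The weights are random and untrained, and every name in it (`relu`, `softmax`, `forward`, `W1`, and so on) is illustrative.

```python
import numpy as np


def relu(x):
    """Rectified linear unit: negative values become zero."""
    return np.maximum(x, 0.0)


def softmax(logits):
    """Turn each row of scores into probabilities that sum to 1."""
    # Subtracting the row maximum avoids overflow and does not change the result.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)

# Untrained, randomly initialized parameters (illustrative only).
W1 = rng.normal(0.0, 0.05, size=(784, 256))
b1 = np.zeros(256)
W2 = rng.normal(0.0, 0.05, size=(256, 10))
b2 = np.zeros(10)


def forward(x):
    """Flattened images -> hidden ReLU layer -> softmax class probabilities."""
    hidden = relu(x @ W1 + b1)
    return softmax(hidden @ W2 + b2)


x = rng.random((3, 784))  # three fake flattened images with values in [0, 1)
probs = forward(x)
n_params = W1.size + b1.size + W2.size + b2.size

print(probs.shape, probs.sum(axis=1), n_params)
```

Each row of `probs` holds ten values between 0 and 1 that sum to 1, which is the property the softmax output layer provides. A network of this shape has 203,530 trainable parameters.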

-A convolutional neural network was built in TensorFlow Keras and reached an accuracy of approximately 91%. The model consists of an input layer, convolutional layer 1, a MaxPooling2D layer, convolutional layer 2, a second MaxPooling2D layer, and an output layer with a softmax activation function.
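The layer sequence above can be traced with simple shape arithmetic. The write-up does not state kernel sizes, filter counts, or padding, so the values below are assumptions chosen only for illustration: 3x3 kernels, stride 1, no padding, 2x2 pooling, and 32 then 64 filters.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one spatial axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed hyperparameters (not stated in the write-up).
KERNEL, POOL = 3, 2
FILTERS = (32, 64)

size, channels = 28, 1  # 28x28 grayscale input
shapes = [(size, size, channels)]
for filters in FILTERS:
    size = conv_out(size, KERNEL)              # convolution: stride 1, no padding
    shapes.append((size, size, filters))
    size = conv_out(size, POOL, stride=POOL)   # max pooling: non-overlapping windows
    shapes.append((size, size, filters))

height, width, depth = shapes[-1]
flattened = height * width * depth  # features fed to the softmax output layer

for shape in shapes:
    print(shape)
print("flattened features:", flattened)
```

Under these assumptions the feature map shrinks from 28x28 to 26x26, 13x13, 11x11, and finally 5x5, leaving 1,600 features for the output layer. Different kernel or padding choices would give different sizes.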

-The neural network and ConvNet both used the Adam optimizer, the sparse_categorical_crossentropy loss function (since the models' output is categorical with 10 levels), and the accuracy metric (since this is a classification problem).
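The loss and metric named above can be sketched in NumPy to show what they measure. This is a simplified illustration, not the Keras implementation. The function names, the clipping constant `eps`, and the three-class toy data are assumptions made for brevity (Fashion-MNIST has ten classes).

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean negative log-probability assigned to each true integer label."""
    probs = np.clip(probs, eps, 1.0)                 # guard against log(0)
    picked = probs[np.arange(len(y_true)), y_true]   # probability of the true class
    return float(-np.log(picked).mean())


def accuracy(y_true, probs):
    """Fraction of rows whose highest-probability class equals the true label."""
    return float((probs.argmax(axis=1) == y_true).mean())


# Toy example: integer labels and predicted probabilities for three samples.
y_true = np.array([0, 2, 1])
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
])

loss = sparse_categorical_crossentropy(y_true, probs)
acc = accuracy(y_true, probs)
print(round(loss, 4), round(acc, 4))
```

The "sparse" variant takes integer labels directly, so the labels do not need to be one-hot encoded first. In this toy batch two of the three predictions are correct, so the accuracy is 2/3.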

-The State of the Art Fashion-MNIST model accuracy is approximately 96.91% according to the SOTA leaderboard.

-Both the NN and CNN could potentially be improved upon by trying different combinations of epochs, batch_size, and other optimizers.

-These models could be used to classify other images, and the same techniques could be used to build other models.

-Convolutional neural networks (ConvNets or CNNs) were specifically designed for image classification and computer vision. More information about CNNs can be found in the DataCamp.com Introduction to Convolutional Neural Networks or the IBM Convolutional Neural Networks page.

-The IBM Convolutional Neural Networks website notes, "Convolutional neural networks use three-dimensional data for image classification and object recognition tasks. Convolutional neural networks are distinguished from other neural networks by their superior performance with image, speech, or audio signal inputs. They have three main types of layers, which are: 1) Convolutional layer, 2) Pooling layer, and 3) Fully-connected (FC) layer."
