Recognizing Digits with 99% Accuracy using CNN

Nerthiga Balakrishnan
Jun 11, 2021

This article walks through a Kaggle competition entry that used a convolutional neural network (CNN) for handwritten digit recognition.

Reading and Loading the Data

The dataset contains grayscale images of handwritten digits from 0 to 9. Every image is made up of pixels with values in the range 0 to 255, where each value indicates the brightness of the pixel: 0 is black, 255 is white, and the values in between are different shades of grey.

Importing Required Packages and Libraries
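
The code itself is not reproduced here, but a minimal sketch of the packages such a pipeline typically needs (assuming Keras on TensorFlow, with NumPy and pandas for the CSV data) might look like this:

```python
# A minimal sketch of the likely imports, assuming a Keras/TensorFlow setup.
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Conv2D, MaxPooling2D, Flatten, Dense, BatchNormalization, Dropout
)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
```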

Splitting the Data

The dataset was split into two sets, X and Y, where X held the pixel values of the images and Y held their labels. The sizes of the two sets were then checked; they should be equal.

X and Y were further split into training and test data: 1,000 images were held out as the test set and the rest were used as the training set.
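
A sketch of what this split could look like, assuming the Kaggle train.csv layout where the first column is the label and the remaining 784 columns are pixel values (the variable names are illustrative):

```python
# Load the Kaggle training CSV; the first column holds the label,
# the remaining 784 columns hold the pixel values.
train = pd.read_csv("train.csv")
Y = train["label"].values               # digit labels, 0-9
X = train.drop(columns=["label"]).values

print(len(X), len(Y))                   # the two counts should match

# Hold out 1,000 images as the test set; use the rest for training.
X_test, Y_test = X[:1000], Y[:1000]
X_train, Y_train = X[1000:], Y[1000:]
```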

Reshaping the Input

Each image is 28 pixels high and 28 pixels wide, 784 pixels in total. Since grayscale images have only one channel, the flat 784-value rows were reshaped into 28×28×1 arrays.
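
In NumPy this reshape is a one-liner; a sketch, continuing from the arrays above:

```python
# Reshape the flat 784-value rows into 28x28 images with one channel.
# (Uses X_train and X_test from the split sketched earlier.)
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)
```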

Data Augmentation

This was used to create slightly modified copies of the training images to increase the amount of training data, which reduces overfitting. Overfitting occurs when a model memorizes the training set and performs well on it but fails to generalize to new data.

Keras's ImageDataGenerator was used to perform the augmentation, as sketched below. The images were first normalized by dividing by 255. The modifications included rotation by up to 20 degrees, random zoom by up to 20%, and horizontal and vertical shifts by up to 10%.
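
A sketch of such a generator (the exact arguments in the original notebook may differ):

```python
# Normalize pixels to [0, 1] and apply random modifications on the fly.
datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # divide pixel values by 255
    rotation_range=20,        # rotate by up to 20 degrees
    zoom_range=0.2,           # random zoom by up to 20%
    width_shift_range=0.1,    # horizontal shift by up to 10%
    height_shift_range=0.1,   # vertical shift by up to 10%
)
train_flow = datagen.flow(X_train, Y_train, batch_size=64)
```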

Convolutional Neural Network (CNN)

The following properties were included in the CNN model to reduce overfitting and generalization error and to improve performance:

Batch Normalization: standardizing and normalizing the inputs of a layer so that the network trains faster and stays stable even as more layers are added. The word ‘batch’ indicates that the input is divided into batches, and normalization occurs separately on each batch.

Dropout: randomly shutting down some neurons in each training iteration, so that a slightly different model is trained each time and the network does not come to depend on any one specific neuron.
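
The post does not spell out the full architecture, but a minimal Keras sketch that combines both properties (the layer sizes here are illustrative, not the original notebook's) could look like this:

```python
# A sketch of a small CNN using BatchNormalization and Dropout.
model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    BatchNormalization(),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    BatchNormalization(),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.3),                        # randomly drop 30% of neurons
    Dense(10, activation="softmax"),     # one output per digit class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train on the augmented generator; rescale the held-out set to match.
model.fit(train_flow, epochs=30, validation_data=(X_test / 255.0, Y_test))
```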

Submitting the Results
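
For the Digit Recognizer competition, the submission is a CSV with ImageId and Label columns. A sketch of how the predictions might be written out, continuing from the model above:

```python
# Predict on the competition's unlabeled test file and write a submission CSV.
test = pd.read_csv("test.csv").values.reshape(-1, 28, 28, 1) / 255.0
predictions = model.predict(test).argmax(axis=1)
submission = pd.DataFrame({
    "ImageId": range(1, len(predictions) + 1),
    "Label": predictions,
})
submission.to_csv("submission.csv", index=False)
```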

Refer to this document for further details, and please upvote it if you find it beneficial.

Hope this article helps you and see you in the next article!
