Build a CNN for MNIST Image Classification Using Keras
In the last lesson, we explored the basics of Convolutional Neural Networks (CNNs), which are powerful tools for image processing. We learned how CNNs use filters to detect patterns like edges, textures, and shapes in images. We also discussed the roles of convolutional layers, pooling layers, and fully connected layers in building a CNN. If you missed it, I recommend revisiting Lesson 7.1 to understand the foundation before diving into this tutorial.
Now, let’s move forward and apply what we’ve learned to build a CNN for image classification using the MNIST dataset.
Use Case: Building a CNN for MNIST Image Classification
I recently worked on a project where I needed to classify handwritten digits from the MNIST dataset. The goal was to build a model that could accurately identify digits from 0 to 9. This task is a classic example of image classification, and CNNs are perfect for it.
I started by preparing the dataset, defining the CNN architecture, and training the model. Along the way, I faced challenges like overfitting and slow training times, which I solved by tweaking the model and adjusting hyperparameters. By the end, I achieved an accuracy of over 99% on the test set. Let me walk you through the steps I took to accomplish this.
Step 1: Preparing the MNIST Dataset
The MNIST dataset contains 60,000 training images and 10,000 test images of handwritten digits. Each image is a 28x28 grayscale array of pixel values, which makes the dataset easy to work with.
First, I loaded the dataset using Keras, which provides a built-in function to fetch MNIST data. Here’s how I did it:
from tensorflow.keras.datasets import mnist
# Load the dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
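Before doing any preprocessing, it is worth confirming what the loader returns. This quick sanity check is optional; the printed shapes reflect the standard MNIST split described above:
# Inspect the raw arrays: images are 28x28, labels are integers 0-9
print(x_train.shape)  # (60000, 28, 28)
print(x_test.shape)   # (10000, 28, 28)
print(y_train[:5])    # first five labels, e.g. [5 0 4 1 9]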
Next, I reshaped the data to add a channel dimension, which is required for convolutional layers. Since the images are grayscale, the channel dimension is 1.
# Reshape the data
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
Finally, I scaled the pixel values from their original [0, 255] range down to [0, 1]; keeping the inputs in a small, consistent range helps gradient-based training converge faster and more stably.
# Normalize the data
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
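One last check, again optional, confirms the preprocessing produced exactly what the convolutional layers expect: a four-dimensional float tensor with values between 0 and 1.
# Verify shape, dtype, and value range after preprocessing
print(x_train.shape)                 # (60000, 28, 28, 1)
print(x_train.dtype)                 # float32
print(x_train.min(), x_train.max())  # 0.0 1.0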
Step 2: Defining the CNN Architecture
With the data ready, I defined the CNN architecture using Keras. The model consists of convolutional layers, pooling layers, and fully connected layers.
Here’s the code I used:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
# Define the model
model = Sequential()
# Add convolutional layers
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the output and add fully connected layers
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
The first convolutional layer has 32 filters and the second has 64, each using a 3x3 kernel. After each convolutional layer, I added a max-pooling layer to halve the spatial dimensions of the feature maps. Finally, I flattened the output and added two fully connected layers: a hidden layer with 128 units and an output layer with 10 units, one per digit class. The softmax activation on the last layer turns the outputs into class probabilities.
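To see how the spatial dimensions shrink through the network and how many parameters each layer holds, you can print a layer-by-layer summary; model.summary() is part of the standard Keras API:
# Print output shapes and parameter counts per layer
model.summary()
With the default 'valid' padding used above, the feature maps go from 26x26 after the first convolution to 13x13 after pooling, then 11x11 and 5x5, so the Flatten layer emits a 1,600-dimensional vector (5 x 5 x 64) and the whole model has roughly 225,000 trainable parameters.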
Step 3: Compiling and Training the Model
Once the model was defined, I compiled it by specifying the optimizer, loss function, and metrics.
# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
I used the Adam optimizer, which is efficient and works well for most tasks. The loss function is sparse categorical crossentropy, which fits multi-class classification when the labels are plain integers, as the MNIST labels are, rather than one-hot vectors.
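As a side note, sparse categorical crossentropy is not the only option. If you prefer one-hot encoded labels, Keras provides to_categorical and the corresponding categorical_crossentropy loss. The lines below are a sketch of that alternative, not what I used in this project:
from tensorflow.keras.utils import to_categorical
# One-hot encode the integer labels, e.g. 3 -> [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_train_onehot = to_categorical(y_train, num_classes=10)
y_test_onehot = to_categorical(y_test, num_classes=10)
# With one-hot labels you would compile and train like this instead:
# model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(x_train, y_train_onehot, epochs=5, batch_size=128, validation_split=0.1)
For the rest of this tutorial I stick with the integer labels and the sparse loss.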
Next, I trained the model on the training data:
# Train the model
model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.1)
I trained the model for 5 epochs with a batch size of 128. I also set aside 10% of the training data for validation; Keras takes this validation slice from the end of the training arrays before shuffling.
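Earlier I mentioned fighting overfitting. One standard remedy, sketched below with assumed hyperparameters rather than the exact values I ended up using, is to add a Dropout layer before the output and let an EarlyStopping callback halt training once the validation loss stops improving. Note that running this sketch rebuilds and retrains the model from scratch:
from tensorflow.keras.layers import Dropout
from tensorflow.keras.callbacks import EarlyStopping
# Variant of the model with dropout in the dense head
# (the 0.5 rate is a common default, not a tuned value)
model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),  # randomly zero half the activations during training
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# Stop when validation loss hasn't improved for 2 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=2, restore_best_weights=True)
model.fit(x_train, y_train, epochs=20, batch_size=128,
          validation_split=0.1, callbacks=[early_stop])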
Step 4: Evaluating Model Performance
After training, I evaluated the model on the test set to check its performance.
# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")
The model achieved an accuracy of over 99%, which is excellent for this task.
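Beyond the aggregate accuracy, it is instructive to inspect individual predictions. model.predict returns a row of ten class probabilities per image, and taking the argmax recovers the predicted digit:
import numpy as np
# Predict class probabilities for the first five test images
probs = model.predict(x_test[:5])
predicted = np.argmax(probs, axis=1)
print("Predicted:", predicted)
print("Actual:   ", y_test[:5])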
Conclusion
In this tutorial, we walked through the steps to build a CNN for image classification using the MNIST dataset. We prepared the data, defined the CNN architecture, trained the model, and evaluated its performance. By following these steps, you can build your own CNN for similar tasks.
In the next lesson, we’ll explore data augmentation and transfer learning, which are techniques to improve model performance and handle more complex datasets. Don’t miss it!