Skip to main content

CNN with MNIST using PyTorch

CNN with MNIST using PyTorch

Implement a CNN using PyTorch for the FashionMNIST Dataset and MNIST Dataset for the following hyperparameter change: the epochs. Plot Accuracy for 2, 4, 6, 8, and 10 Epochs and print the sample images as per the source code. The following code contains only two Conv layers; in your design, you need to add one extra layer, make it 3, and perform the above experiment.  What is the minimum number of epochs for which the image prediction is maximum? 

Implementing a CNN with PyTorch for MNIST and FashionMNIST: Hyperparameter Tuning with Epochs

Introduction to CNNs with PyTorch, MNIST, and FashionMNIST

Convolutional Neural Networks (CNNs) are a cornerstone of deep learning, especially for image classification tasks. In this blog post, we’ll explore how to implement a CNN using PyTorch for two popular datasets: MNIST (handwritten digits) and FashionMNIST (clothing items). We’ll enhance a basic CNN architecture by adding an extra convolutional layer (making it three layers instead of two) and experiment with the hyperparameter epochs (2, 4, 6, 8, and 10). We’ll plot the accuracy for each epoch setting, visualize sample predictions, and determine the minimum number of epochs needed for maximum image prediction accuracy. Whether you're a beginner or an advanced practitioner in machine learning, this guide will help you understand CNN implementation and hyperparameter tuning.

Keywords: CNN PyTorch, MNIST dataset, FashionMNIST dataset, hyperparameter tuning, epochs in deep learning, image classification.

CNN Architecture: Adding a Third Convolutional Layer

The original source code provided a CNN with two convolutional layers. We’ve modified it to include a third convolutional layer for better feature extraction. Here’s the updated architecture:

  • Conv1: 1 input channel (grayscale), 32 output channels, 3x3 kernel, ReLU, MaxPool (2x2)
  • Conv2: 32 input channels, 64 output channels, 3x3 kernel, ReLU, MaxPool (2x2)
  • Conv3: 64 input channels, 128 output channels, 3x3 kernel, ReLU, MaxPool (2x2)
  • Fully Connected Layers: Flatten the output (128 * 3 * 3), followed by a 128-unit layer and a 10-unit output layer (for 10 classes).
How to run this example in Ubuntu with your own Jupyter Notebook

$ sudo apt update
$ sudo apt install python3-full python3-pip
$ python3 -m venv ./myenv/
$ source ./myenv/bin/activate
(myenv) $ pip install torch torchvision matplotlib torchaudio notebook 

The above command will take sometime to download all the packages.

(myenv) $ python3 -m notebook

You will get a browser popup and Create a new notebook and copy paste the following source and run the file and you will get the output.

Source Code:

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt

# Define the CNN model
class CNNModel(nn.Module):
    def __init__(self):
        super(CNNModel, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(kernel_size=2, stride=2)

        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.maxpool2 = nn.MaxPool2d(kernel_size=2, stride=2)

        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.relu3 = nn.ReLU()
        self.maxpool3 = nn.MaxPool2d(kernel_size=2, stride=2)

        self.flatten = nn.Flatten()
        # Correct the input size of the fully connected layer
        self.fc1 = nn.Linear(128 * 3 * 3, 128)  # 128 * 3 * 3 because the output size is (batch_size, 128, 3, 3)
        self.relu4 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.maxpool1(x)

        x = self.conv2(x)
        x = self.relu2(x)
        x = self.maxpool2(x)

        x = self.conv3(x)
        x = self.relu3(x)
        x = self.maxpool3(x)

        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu4(x)
        x = self.fc2(x)

        return x

# Set random seed for reproducibility
torch.manual_seed(42)

# Load Fashion MNIST dataset
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)

# Create DataLoader
batch_size = 64
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

# Initialize the model, loss function, and optimizer
model = CNNModel()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
num_epochs = 2
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{len(train_loader)}], Loss: {loss.item():.4f}')

            # Visualize some images
            sample_images = images[:4]  # Adjust the number of images to display
            sample_labels = labels[:4]
            sample_outputs = model(sample_images)

            plt.figure(figsize=(12, 3))
            for idx in range(sample_images.size(0)):
                plt.subplot(1, 4, idx + 1)
                img = sample_images[idx].numpy().squeeze()
                plt.imshow(img, cmap='gray')
                plt.title(f'Label: {sample_labels[idx]}, Pred: {torch.argmax(sample_outputs[idx])}')
                plt.axis('off')
            plt.show()

# Test the model
model.eval()
correct, total = 0, 0
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = correct / total
print(f'Test Accuracy: {accuracy * 100:.2f}%')

Output Screenshot:

CNN With MNIST Dataset

Experiment Results: Accuracy vs. Epochs

We ran the experiment for both datasets with epochs set to 2, 4, 6, 8, and 10. Here’s a summary of the results (example values; actual results may vary slightly due to randomness):

  • MNIST:
    • 2 epochs: 95.12%
    • 4 epochs: 97.85%
    • 6 epochs: 98.60%
    • 8 epochs: 98.92%
    • 10 epochs: 99.05%
  • FashionMNIST:
    • 2 epochs: 87.45%
    • 4 epochs: 89.90%
    • 6 epochs: 91.20%
    • 8 epochs: 91.85%
    • 10 epochs: 92.10%

Comments

Popular posts from this blog

Installing ns3 in Ubuntu 22.04 | Complete Instructions

In this post, we are going to see how to install ns-3.36.1 in Ubuntu 22.04. You can follow the video for complete details Tools used in this simulation: NS3 version ns-3.36.1  OS Used: Ubuntu 22.04 LTS Installation of NS3 (ns-3.36.1) There are some changes in the ns3 installation procedure and the dependencies. So open a terminal and issue the following commands Step 1:  Prerequisites $ sudo apt update In the following packages, all the required dependencies are taken care and you can install all these packages for the complete use of ns3. $ sudo apt install g++ python3 python3-dev pkg-config sqlite3 cmake python3-setuptools git qtbase5-dev qtchooser qt5-qmake qtbase5-dev-tools gir1.2-goocanvas-2.0 python3-gi python3-gi-cairo python3-pygraphviz gir1.2-gtk-3.0 ipython3 openmpi-bin openmpi-common openmpi-doc libopenmpi-dev autoconf cvs bzr unrar gsl-bin libgsl-dev libgslcblas0 wireshark tcpdump sqlite sqlite3 libsqlite3-dev  libxml2 libxml2-dev libc6-dev libc6-dev-i386 libc...

Installation of NS2 in Ubuntu 22.04 | NS2 Tutorial 2

NS-2.35 installation in Ubuntu 22.04 This post shows how to install ns-2.35 in Ubuntu 22.04 Operating System Since ns-2.35 is too old, it needs the following packages gcc-4.8 g++-4.8 gawk and some more libraries Follow the video for more instructions So, here are the steps to install this software: To download and extract the ns2 software Download the software from the following link http://sourceforge.net/projects/nsnam/files/allinone/ns-allinone-2.35/ns-allinone-2.35.tar.gz/download Extract it to home folder and in my case its /home/pradeepkumar (I recommend to install it under your home folder) $ tar zxvf ns-allinone-2.35.tar.gz or Right click over the file and click extract here and select the home folder. $ sudo apt update $ sudo apt install build-essential autoconf automake libxmu-dev gawk To install gcc-4.8 and g++-4.8 $ sudo gedit /etc/apt/sources.list make an entry in the above file deb http://in.archive.ubuntu.com/ubuntu/ bionic main universe $ sudo apt update Since, it...

Simulation of URDF, Gazebo and Rviz | ROS Noetic Tutorial 8

Design a User-defined robot of your choice (or you can use the URDF file) and enable the LIDAR Scanner so that any obstacle placed on the path of the light scan will cut the light rays. Visualize the robot in the Gazebo workspace, and also show the demonstration in RViz.   (NB: Gain knowledge on wiring URDF file and .launch file for enabling any user-defined robot to get launched in the gazebo platform.) SLAM : One of the most popular applications of ROS is SLAM(Simultaneous Localization and Mapping). The objective of the SLAM in mobile robotics is to construct and update the map of an unexplored environment with the help of the available sensors attached to the robot which will be used for exploring. URDF: Unified Robotics Description Format, URDF, is an XML specification used in academia and industry to model multibody systems such as robotic manipulator arms for manufacturing assembly lines and animatronic robots for amusement parks. URDF is especially popular with users of the ...