Handwritten Digit Classification with a Neural Network from Scratch
In this project, we will develop a classifier for the MNIST handwritten digit dataset using a neural network. As an added challenge, we will derive and implement the model from scratch, avoiding the high-level routines provided by libraries such as TensorFlow, PyTorch, and Keras.
Introduction
In this project, we build a classifier for handwritten digits using the MNIST dataset. We will walk through the architecture of a feedforward neural network, derive its mathematical foundations, and implement the model in Python using only NumPy.
Examples of Handwritten Digits
Below are examples of images from the MNIST dataset along with their corresponding labels.
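As a minimal sketch of how such examples can be loaded and displayed, the code below parses the raw IDX files from the standard MNIST distribution and plots the first few digits. It assumes the files train-images-idx3-ubyte and train-labels-idx1-ubyte are in the working directory and that Matplotlib is available; the paths and plotting details are illustrative, not part of the project code.

import numpy as np
import matplotlib.pyplot as plt

def load_idx(path):
    # The IDX format stores a 4-byte magic number, then big-endian 32-bit
    # dimension sizes, then the raw data as unsigned bytes.
    with open(path, "rb") as f:
        buf = f.read()
    ndim = buf[3]  # last byte of the magic number encodes the dimension count
    dims = np.frombuffer(buf, dtype=">i4", count=ndim, offset=4)
    return np.frombuffer(buf, dtype=np.uint8, offset=4 + 4 * ndim).reshape(dims)

# Filenames follow the standard MNIST distribution; adjust paths as needed.
images = load_idx("train-images-idx3-ubyte")  # shape (60000, 28, 28)
labels = load_idx("train-labels-idx1-ubyte")  # shape (60000,)

# Show the first five digits with their labels.
fig, axes = plt.subplots(1, 5, figsize=(10, 2))
for ax, img, lab in zip(axes, images, labels):
    ax.imshow(img, cmap="gray")
    ax.set_title(str(lab))
    ax.axis("off")
plt.show()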
Neural Network Architecture
We will use a simple feedforward neural network with one hidden layer. The architecture is as follows:
- Input Layer: 784 neurons (28x28 pixels)
- Hidden Layer: 128 neurons
- Output Layer: 10 neurons (digits 0-9)
The network uses the sigmoid activation function in both layers and is trained using stochastic gradient descent. In total it has 784·128 + 128 + 128·10 + 10 = 101,770 trainable parameters.
Mathematical Derivation
The output of the neural network is computed as follows:
$$\begin{align*} \mathbf{z}^{(1)} &= \mathbf{W}^{(1)} \mathbf{x} + \mathbf{b}^{(1)} \\ \mathbf{a}^{(1)} &= \sigma(\mathbf{z}^{(1)}) \\ \mathbf{z}^{(2)} &= \mathbf{W}^{(2)} \mathbf{a}^{(1)} + \mathbf{b}^{(2)} \\ \mathbf{a}^{(2)} &= \sigma(\mathbf{z}^{(2)}) \end{align*}$$
Where:
- \(\mathbf{x}\) is the input vector.
- \(\mathbf{W}^{(l)}\) and \(\mathbf{b}^{(l)}\) are the weights and biases of layer \(l\).
- \(\sigma(z)\) is the sigmoid activation function: \(\sigma(z) = \frac{1}{1 + e^{-z}}\).
The loss is the cross-entropy between the network output \(\mathbf{a}^{(2)}\) and the one-hot label \(\mathbf{y}\), and its gradients with respect to the weights and biases are computed via backpropagation, as derived below.
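For completeness, here is the backpropagation derivation this implies, assuming a one-hot label vector \(\mathbf{y}\) and the cross-entropy loss applied elementwise to the sigmoid outputs. A convenient property of this pairing is that the sigmoid derivative cancels at the output layer:

$$\begin{align*} L &= -\sum_{k} \left[ y_k \log a^{(2)}_k + (1 - y_k) \log\left(1 - a^{(2)}_k\right) \right] \\ \boldsymbol{\delta}^{(2)} &= \frac{\partial L}{\partial \mathbf{z}^{(2)}} = \mathbf{a}^{(2)} - \mathbf{y} \\ \boldsymbol{\delta}^{(1)} &= \left(\mathbf{W}^{(2)}\right)^{\top} \boldsymbol{\delta}^{(2)} \odot \sigma'(\mathbf{z}^{(1)}), \qquad \sigma'(z) = \sigma(z)\left(1 - \sigma(z)\right) \\ \frac{\partial L}{\partial \mathbf{W}^{(2)}} &= \boldsymbol{\delta}^{(2)} \left(\mathbf{a}^{(1)}\right)^{\top}, \quad \frac{\partial L}{\partial \mathbf{b}^{(2)}} = \boldsymbol{\delta}^{(2)}, \quad \frac{\partial L}{\partial \mathbf{W}^{(1)}} = \boldsymbol{\delta}^{(1)} \mathbf{x}^{\top}, \quad \frac{\partial L}{\partial \mathbf{b}^{(1)}} = \boldsymbol{\delta}^{(1)} \end{align*}$$

Stochastic gradient descent then updates each parameter by subtracting the learning rate times its gradient.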
Python Implementation
The following Python code demonstrates the implementation of the neural network described above.
import numpy as np

class NeuralNetwork:
    def __init__(self):
        # Initialize weights with small random values to break symmetry;
        # biases start at zero. Shapes follow the 784-128-10 architecture.
        self.W1 = np.random.randn(128, 784) * 0.01
        self.b1 = np.zeros((128, 1))
        self.W2 = np.random.randn(10, 128) * 0.01
        self.b2 = np.zeros((10, 1))

    def sigmoid(self, z):
        # Elementwise logistic function: sigma(z) = 1 / (1 + e^{-z})
        return 1 / (1 + np.exp(-z))

    def feedforward(self, x):
        # x is a column vector of shape (784, 1): a flattened 28x28 image.
        z1 = np.dot(self.W1, x) + self.b1
        a1 = self.sigmoid(z1)
        z2 = np.dot(self.W2, a1) + self.b2
        a2 = self.sigmoid(z2)
        return a2  # shape (10, 1): per-digit scores in (0, 1)
This class initializes the network parameters and defines the activation function and the feedforward pass. Training additionally requires a backward pass; a sketch of one stochastic gradient descent update is given below.
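The following is a minimal sketch of a single SGD update, consistent with the gradients derived earlier. It assumes the NeuralNetwork class above; the function name train_step, the learning rate, and the single-example (batch size 1) update are illustrative choices, not fixed parts of the project.

def train_step(net, x, y, lr=0.1):
    # One stochastic gradient descent step on a single example.
    # x: input column vector (784, 1), pixel values scaled to [0, 1]
    # y: one-hot label column vector (10, 1)
    # Forward pass, caching intermediate values for backpropagation.
    z1 = np.dot(net.W1, x) + net.b1
    a1 = net.sigmoid(z1)
    z2 = np.dot(net.W2, a1) + net.b2
    a2 = net.sigmoid(z2)
    # Output-layer error: sigmoid + cross-entropy gives delta2 = a2 - y.
    delta2 = a2 - y
    # Hidden-layer error: propagate back through W2 and the sigmoid derivative.
    delta1 = np.dot(net.W2.T, delta2) * a1 * (1 - a1)
    # Gradient descent parameter updates.
    net.W2 -= lr * np.dot(delta2, a1.T)
    net.b2 -= lr * delta2
    net.W1 -= lr * np.dot(delta1, x.T)
    net.b1 -= lr * delta1

# Example usage with a synthetic input (a real pipeline would feed MNIST images):
net = NeuralNetwork()
x = np.random.rand(784, 1)          # stand-in for a flattened, normalized image
y = np.zeros((10, 1)); y[3] = 1.0   # one-hot label for the digit 3
train_step(net, x, y)

In practice one would loop this update over shuffled training examples (or mini-batches) for several epochs, tracking accuracy on a held-out set.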
Conclusion
We have explored the fundamentals of neural networks by building and training a model to classify handwritten digits. The concepts and implementation presented here lay the groundwork for more advanced study of deep learning.