Ciro Santilli @cirosantilli 37

 Incoming links: PyTorch

activatedgeek/LeNet-5 Updated 2025-07-16

This repository contains a very clean minimal PyTorch implementation of LeNet-5 for MNIST.

It trains the LeNet-5 neural network on the MNIST dataset from scratch, and afterwards you can give it newly hand-written digits 0 to 9 and it will hopefully recognize the digit for you.

Ciro Santilli created a small fork of this repo at lenet adding better automation for:

extracting MNIST images as PNG
ONNX CLI inference taking any image files as input
a Python tkinter GUI that lets you draw and see inference live
running on GPU

Install on Ubuntu 24.10 with:

sudo apt install protobuf-compiler
git clone https://github.com/activatedgeek/LeNet-5
cd LeNet-5
git checkout 95b55a838f9d90536fd3b303cede12cf8b5da47f
virtualenv -p python3 .venv
. .venv/bin/activate
pip install \
  Pillow==6.2.0 \
  numpy==1.26.4 \
  onnx==1.17.0 \
  torch==2.6.0 \
  torchvision==0.21.0 \
  visdom==0.2.4 \
;

We use our own pip install because their requirements.txt uses >= instead of ==, which makes it random whether things will work or not.

On Ubuntu 22.10 it was instead:

pip install \
  Pillow==6.2.0 \
  numpy==1.24.2 \
  onnx==1.13.1 \
  torch==2.0.0 \
  torchvision==0.15.1 \
  visdom==0.2.4 \
;

Then run with:

python run.py

This script:

does a fixed 15 epochs on the training data
it then uses the trained net from memory to check accuracy with the test data
then it also produces a lenet.onnx ONNX file which contains the trained network, nice!

It throws a billion exceptions because we didn't start the Visdom server, but everything works nevertheless: we just don't get a visualization of the training.

The terminal outputs lines such as:

Train - Epoch 1, Batch: 0, Loss: 2.311587
Train - Epoch 1, Batch: 10, Loss: 2.067062
Train - Epoch 1, Batch: 20, Loss: 0.959845
...
Train - Epoch 1, Batch: 230, Loss: 0.071796
Test Avg. Loss: 0.000112, Accuracy: 0.967500
...
Train - Epoch 15, Batch: 230, Loss: 0.010040
Test Avg. Loss: 0.000038, Accuracy: 0.989300
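
Since the Visdom plots are missing in this setup, one way to still get the loss curve is to parse the terminal output. The following is a minimal sketch, assuming the output has been saved to a string or file and that the lines have exactly the shape shown above. The names parse_log, TRAIN_RE and TEST_RE are hypothetical helpers, not part of the repository:

```python
import re

# Patterns for the two kinds of lines shown in the sample output above.
TRAIN_RE = re.compile(r"Train - Epoch (\d+), Batch: (\d+), Loss: ([0-9.]+)")
TEST_RE = re.compile(r"Test Avg\. Loss: ([0-9.]+), Accuracy: ([0-9.]+)")


def parse_log(text):
    """Return (train_points, test_points) extracted from the terminal output.

    train_points: list of (epoch, batch, loss)
    test_points: list of (avg_loss, accuracy), one per epoch, in order
    """
    train_points, test_points = [], []
    for line in text.splitlines():
        m = TRAIN_RE.search(line)
        if m:
            train_points.append((int(m.group(1)), int(m.group(2)), float(m.group(3))))
            continue
        m = TEST_RE.search(line)
        if m:
            test_points.append((float(m.group(1)), float(m.group(2))))
    return train_points, test_points


# A few lines copied from the output above.
sample = """\
Train - Epoch 1, Batch: 0, Loss: 2.311587
Train - Epoch 1, Batch: 230, Loss: 0.071796
Test Avg. Loss: 0.000112, Accuracy: 0.967500
"""
train_points, test_points = parse_log(sample)
print(train_points)
print(test_points)
```

The resulting lists can then be fed to any plotting tool of choice.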

And the runtime on Ubuntu 22.10, P51 was:

real    2m10.262s
user    11m9.771s
sys     0m26.368s

One of the benefits of the ONNX output is that we can nicely visualize the neural network on Netron:

Figure 1. Netron visualization of the activatedgeek/LeNet-5 ONNX output. From this we can see the bifurcation on the computational graph as done in the code at:

output = self.c1(img)
x = self.c2_1(output)
output = self.c2_2(output)
output += x
output = self.c3(output)

This doesn't seem to conform to the original LeNet-5 however?

activatedgeek/LeNet-5 run on GPU Updated 2025-07-16

By default, the setup runs on CPU only, not GPU, as can be seen by running htop. But by the magic of PyTorch, modifying the program to run on the GPU is trivial:

cat << EOF | patch
diff --git a/run.py b/run.py
index 104d363..20072d1 100644
--- a/run.py
+++ b/run.py
@@ -24,7 +24,8 @@ data_test = MNIST('./data/mnist',
 data_train_loader = DataLoader(data_train, batch_size=256, shuffle=True, num_workers=8)
 data_test_loader = DataLoader(data_test, batch_size=1024, num_workers=8)

-net = LeNet5()
+device = 'cuda'
+net = LeNet5().to(device)
 criterion = nn.CrossEntropyLoss()
 optimizer = optim.Adam(net.parameters(), lr=2e-3)

@@ -43,6 +44,8 @@ def train(epoch):
     net.train()
     loss_list, batch_list = [], []
     for i, (images, labels) in enumerate(data_train_loader):
+        labels = labels.to(device)
+        images = images.to(device)
         optimizer.zero_grad()

         output = net(images)
@@ -71,6 +74,8 @@ def test():
     total_correct = 0
     avg_loss = 0.0
     for i, (images, labels) in enumerate(data_test_loader):
+        labels = labels.to(device)
+        images = images.to(device)
         output = net(images)
         avg_loss += criterion(output, labels).sum()
         pred = output.detach().max(1)[1]
@@ -84,7 +89,7 @@ def train_and_test(epoch):
     train(epoch)
     test()

-    dummy_input = torch.randn(1, 1, 32, 32, requires_grad=True)
+    dummy_input = torch.randn(1, 1, 32, 32, requires_grad=True).to(device)
     torch.onnx.export(net, dummy_input, "lenet.onnx")

     onnx_model = onnx.load("lenet.onnx")
EOF

This leads to a faster runtime, with less user time, as we now spend more time on the GPU than on the CPU:

real    1m27.829s
user    4m37.266s
sys     0m27.562s

CNN convolution kernels are also learnt Updated 2025-07-16

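CNN convolution kernels are not hardcoded. They are learnt and optimized via backpropagation. You just specify their size! For example, in PyTorch you would just write the following, where 1 is the number of input channels, 6 the number of output channels, and (5, 5) the kernel size: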

nn.Conv2d(1, 6, kernel_size=(5, 5))

as used for example at: activatedgeek/LeNet-5.

This can also be inferred from: stackoverflow.com/questions/55594969/how-to-visualise-filters-in-a-cnn-with-pytorch where we see that the kernels are not perfectly regular as you'd expect from something hand coded.
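
To make the "learnt, not hardcoded" point concrete without any framework, here is a minimal pure-Python sketch. It is a toy 1D analogue and not how PyTorch implements nn.Conv2d: it starts from a random 3-tap kernel and recovers a hidden target kernel by plain gradient descent on the mean squared error. The function names, the learning rate, the step count and the "edge detector" target are all assumptions chosen for illustration:

```python
import random


def correlate(signal, kernel):
    """Valid-mode 1D cross-correlation, the operation a conv layer computes."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def learn_kernel(signal, target, size=3, lr=0.1, steps=1000, seed=0):
    """Fit a kernel of the given size by gradient descent on mean squared error."""
    rng = random.Random(seed)
    # Random initialization: only the size is specified, not the values.
    kernel = [rng.uniform(-0.5, 0.5) for _ in range(size)]
    n = len(target)
    for _ in range(steps):
        out = correlate(signal, kernel)
        # d(MSE)/d(kernel[j]) = (2/n) * sum_i (out[i] - target[i]) * signal[i + j]
        grad = [
            2.0 / n * sum((out[i] - target[i]) * signal[i + j] for i in range(n))
            for j in range(size)
        ]
        kernel = [w - lr * g for w, g in zip(kernel, grad)]
    return kernel


rng = random.Random(1)
signal = [rng.uniform(-1.0, 1.0) for _ in range(64)]
true_kernel = [1.0, 0.0, -1.0]  # hypothetical kernel that we pretend not to know
target = correlate(signal, true_kernel)
learned = learn_kernel(signal, target)
print([round(w, 3) for w in learned])
```

In a real CNN the same idea applies, except that the gradient comes from the classification loss through backpropagation rather than from a known target kernel.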

Conda Updated 2025-07-16

Conda is like pip, except that it also manages shared library dependencies, including providing prebuilts.

This made Conda very popular in the deep learning community around 2020, when using Python frontends like PyTorch to configure faster precompiled backends was extremely common.

It also means that it is a full package manager and extremely overbloated and blows up all the time. People should just use Docker instead for that kind of stuff: www.reddit.com/r/learnmachinelearning/comments/kd88p8/comment/keco07k/

You also have to buy a license to use their repos if you are part of a large-enough organization: stackoverflow.com/questions/74762863/are-conda-miniconda-and-anaconda-free-to-use-and-open-source

GPT-2 implementation in PyTorch 2025-08-08

MNIST database Updated 2025-07-16

70,000 28x28 grayscale (1 byte per pixel) images of hand-written digits 0-9, i.e. 10 categories. 60k are considered training data, 10k are considered for test data.

This is THE "OG" computer vision dataset.

Playing with it is the de-facto computer vision hello world.

It was on this dataset that Yann LeCun made great progress with the LeNet model. Running LeNet on MNIST has to be the most classic computer vision thing ever. See e.g. activatedgeek/LeNet-5 for a minimal and modern PyTorch educational implementation.

But it is important to note that as of the 2010's, the benchmark had become too easy for many applications. It is perhaps fair to say that the next big dataset revolution of the same importance was with ImageNet.

The dataset could be downloaded from yann.lecun.com/exdb/mnist/ but as of March 2025 it was down and seems to have broken from time to time randomly, so Wayback Machine to the rescue:

wget \
 https://web.archive.org/web/20120828222752/http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz \
 https://web.archive.org/web/20120828182504/http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz \
 https://web.archive.org/web/20240323235739/http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz \
 https://web.archive.org/web/20240328174015/http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

but doing so is kind of pointless, as both the image and the label files use a crazy custom binary format (IDX) that packs all images, or all labels, into a single file. OMG!
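
That said, the format is simple enough to read by hand. The following is a minimal sketch, assuming the IDX layout of the MNIST files: two zero bytes, a type byte (0x08 for unsigned byte), a byte with the number of dimensions, one big-endian 32-bit size per dimension, and then the raw data. The names parse_idx and load_idx_gz are hypothetical helpers, and the example runs on a tiny synthetic buffer rather than on the real downloads:

```python
import gzip
import struct


def parse_idx(data):
    """Parse an uncompressed IDX buffer of unsigned bytes.

    Returns (dims, payload), where dims is a tuple of dimension sizes and
    payload holds the raw bytes in row-major order.
    """
    # Header: two zero bytes, type code, number of dimensions.
    zero, dtype, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or dtype != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Then one big-endian 32-bit size per dimension.
    dims = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    payload = data[4 + 4 * ndim:]
    expected = 1
    for d in dims:
        expected *= d
    if len(payload) != expected:
        raise ValueError("payload size does not match header")
    return dims, payload


def load_idx_gz(path):
    """Read a gzipped IDX file such as train-images-idx3-ubyte.gz."""
    with gzip.open(path, "rb") as f:
        return parse_idx(f.read())


# Tiny synthetic example: 2 "images" of 2x3 pixels, built in memory.
fake = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 3) + bytes(range(12))
dims, pixels = parse_idx(fake)
print(dims, list(pixels[:6]))
```

On the real training images, dims would be expected to come out as (60000, 28, 28), and image i would be the 784 bytes starting at offset i * 784 of the payload.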

OK-ish data explorer: knowyourdata-tfds.withgoogle.com/#tab=STATS&dataset=mnist

ONNX Updated 2025-07-16

The most important thing this project provides appears to be the .onnx file format, which represents ANN models, pre-trained or not.

Deep learning frameworks can then output such .onnx files for interchangeability and serialization.

Some examples:

activatedgeek/LeNet-5 produces a trained .onnx from PyTorch
MLperf v2.1 ResNet can use .onnx as a pre-trained model

The cool thing is that ONNX can then run inference in a uniform manner on a variety of devices without installing the deep learning framework used for training. It's a bit like having a kind of portable executable. Neat.

PyTorch model Updated 2025-08-08

This section lists specific models that have been implemented in PyTorch.
