activatedgeek/LeNet-5 by Ciro Santilli 40 Updated 2025-07-16
This repository contains a very clean minimal PyTorch implementation of LeNet-5 for MNIST.
It trains the LeNet-5 neural network on the MNIST dataset from scratch, and afterwards you can give it newly hand-written digits 0 to 9 and it will hopefully recognize the digit for you.
Ciro Santilli created a small fork of this repo at lenet adding better automation for:
Install on Ubuntu 24.10 with:
sudo apt install protobuf-compiler
git clone https://github.com/activatedgeek/LeNet-5
cd LeNet-5
git checkout 95b55a838f9d90536fd3b303cede12cf8b5da47f
virtualenv -p python3 .venv
. .venv/bin/activate
pip install \
  Pillow==6.2.0 \
  numpy==1.26.4 \
  onnx==1.17.0 \
  torch==2.6.0 \
  torchvision==0.21.0 \
  visdom==0.2.4 \
;
We use our own pip install because their requirements.txt uses >= instead of ==, so whether things work depends on whichever versions pip happens to resolve that day.
On Ubuntu 22.10 it was instead:
pip install \
  Pillow==6.2.0 \
  numpy==1.24.2 \
  onnx==1.13.1 \
  torch==2.0.0 \
  torchvision==0.15.1 \
  visdom==0.2.4 \
;
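To make the >= versus == point concrete, here is a minimal sketch that flags requirement lines that are not pinned to an exact version. The function name unpinned and the sample input are made up for illustration; the sample is not the repository's actual requirements.txt.

```python
import re

# Matches "name==version" with optional surrounding whitespace.
PINNED = re.compile(r"^\s*[A-Za-z0-9_.\-]+\s*==\s*[^\s=<>!~]+\s*$")


def unpinned(requirements_text):
    """Return the requirement lines that are not pinned with '=='."""
    result = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if line and not PINNED.match(line):
            result.append(line)
    return result


# Hypothetical input, NOT the repository's actual requirements.txt.
sample = """
torch>=1.0.0
visdom==0.2.4
numpy
"""
print(unpinned(sample))
```

Any line it prints is one where pip is free to pick a different version on each install.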
Then run with:
python run.py
This script:
  • does a fixed 15 epochs on the training data
  • then uses the trained net from memory to check accuracy with the test data
  • finally produces a lenet.onnx ONNX file which contains the trained network, nice!
It throws a billion exceptions because we didn't start the Visdom server, but everything works nevertheless, we just don't get a visualization of the training.
The terminal outputs lines such as:
Train - Epoch 1, Batch: 0, Loss: 2.311587
Train - Epoch 1, Batch: 10, Loss: 2.067062
Train - Epoch 1, Batch: 20, Loss: 0.959845
...
Train - Epoch 1, Batch: 230, Loss: 0.071796
Test Avg. Loss: 0.000112, Accuracy: 0.967500
...
Train - Epoch 15, Batch: 230, Loss: 0.010040
Test Avg. Loss: 0.000038, Accuracy: 0.989300
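Since the visualization is missing without Visdom, one option is to pull the per-epoch test accuracy out of the terminal output. The following is a minimal sketch that assumes the log lines have exactly the format shown above; the function name parse_accuracies is hypothetical and not part of the repository.

```python
import re

# Assumes the "Test Avg. Loss: ..., Accuracy: ..." format shown above.
TEST_LINE = re.compile(r"Test Avg\. Loss: ([0-9.]+), Accuracy: ([0-9.]+)")


def parse_accuracies(log_text):
    """Return the accuracy from each 'Test Avg. Loss' line, in order."""
    return [float(m.group(2)) for m in TEST_LINE.finditer(log_text)]


log = """\
Train - Epoch 1, Batch: 230, Loss: 0.071796
Test Avg. Loss: 0.000112, Accuracy: 0.967500
Train - Epoch 15, Batch: 230, Loss: 0.010040
Test Avg. Loss: 0.000038, Accuracy: 0.989300
"""
accuracies = parse_accuracies(log)
print(accuracies)
```

In practice you could save the output with python run.py | tee train.log and pass the file contents to the function.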
And the runtime on Ubuntu 22.10, P51 was:
real    2m10.262s
user    11m9.771s
sys     0m26.368s
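The user time being several times larger than the real time suggests that training kept multiple CPU cores busy at once. A rough sketch of that arithmetic, assuming the usual interpretation of the time output (user and sys are CPU time summed over all threads, real is wall-clock time):

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '2m10.262s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


real = to_seconds("2m10.262s")
user = to_seconds("11m9.771s")
sys_ = to_seconds("0m26.368s")

# CPU time divided by wall time approximates the average number of busy cores.
busy_cores = (user + sys_) / real
print(f"{busy_cores:.2f}")
```

This comes out at a bit over five cores busy on average during the run.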
One of the benefits of the ONNX output is that we can nicely visualize the neural network on Netron:
Figure 1. Netron visualization of the activatedgeek/LeNet-5 ONNX output.
From this we can see the bifurcation on the computational graph as done in the code at:
output = self.c1(img)
x = self.c2_1(output)
output = self.c2_2(output)
output += x
output = self.c3(output)
This doesn't seem to conform to the original LeNet-5 however?
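For comparison, the classic LeNet-5 as it is usually described has a feature extractor that is a single chain of layers, with no point where two branches are summed. The sketch below only walks the spatial sizes of that chain, assuming the commonly cited setup of a 32x32 input, unpadded 5x5 convolutions and 2x2 subsampling. It does not reproduce the repository's code, and the names conv_out and LAYERS are made up for illustration.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of an unpadded convolution or pooling layer."""
    return (size - kernel) // stride + 1


# (layer name, kernel size, stride), named after the C1/S2/... convention.
# Assumption: 32x32 input, 5x5 convolutions, 2x2 subsampling with stride 2.
LAYERS = [
    ("C1", 5, 1),
    ("S2", 2, 2),
    ("C3", 5, 1),
    ("S4", 2, 2),
    ("C5", 5, 1),
]

size = 32
sizes = {}
for name, kernel, stride in LAYERS:
    size = conv_out(size, kernel, stride)
    sizes[name] = size
    print(f"{name}: {size}x{size}")
```

Each layer consumes only the output of the previous one, which is the main structural difference from the c2_1/c2_2 split shown above.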
People will be more interested if they see how the stuff they are learning is useful.
Useful 99% of the time means you can make money with it.
Achieving novel results for science, or charitable goals (e.g. creating novel tutorials) are also equally valid. Note that those also imply you being able to make a living out of something, just that you will be getting donations and not become infinitely rich, and that is fine.
Of course, projects don't need to reach the level of a novel result. But they must at least aim at moving towards that.
This is one of the greatest challenges of education, since a huge part of the useful information is locked under enterprise or military secrecy, or even open academic incomprehensibility, making it nearly impossible for front-line educators to actually find and teach real use cases.
In degrees Celsius:
  • 25+
    • palm tree shade and coconut water. Seriously though, if there's some shade or earlier morning/later afternoon it's OK, but if it's on an open road at midday, be careful, and stop early if you start getting slightly dizzy, it only gets worse!
  • 18-25
  • 15-18:
  • 10-15:
    • dhb Classic Thermal Bib Tights 10 and under. TODO this is a bit too warm for the upper range, need something more intermediate
    • "dhb Lightweight Mesh Long Sleeve Base Layer"
    • Castelli Perfetto RoS Long Sleeve - Cycling jersey. TODO this is a bit too warm for the upper range, need something more intermediate
    • "Karrimor X Lite Run Black Headband"
    • "Nike academy hyperwarm gloves"
    • "Nevica Skuff". A bit too hot on upper range, but easy to take off.
  • 0-10:
    • dhb Merino Long Sleeve Base Layer
    • Castelli Perfetto RoS Long Sleeve - Cycling jersey
    • dhb Classic Thermal Bib Tights 10 and under
    • dhb Dorica MTB Shoe (2020-12)
    • "Karrimor X Lite Run Black Headband". Head a bit cold on lower range.
    • "dhb Neoprene Nylon Overshoes". Feet a bit cold on lower range.
    • "Extremities XDRY gloves". Hands a bit cold on lower range.
    • "Nevica Skuff"
ImageNet subset by Ciro Santilli 40 Updated 2025-07-16
Subset generators:
Unfortunately, since ImageNet is a closed standard no one can upload such pre-made subsets, forcing everybody to download the full dataset, in ImageNet1k, which is huge!
This is one of the deep tech bets that Ciro Santilli would put his money in as of 2020.
How hard could it be? You just have to learn the encoding of the neural spine/eyes/ear, add an invasive device that multiplexes it, and then the benefits could be mind blowing.
Interestingly and obviously, the initial advances in the area are happening for people that have hearing or vision difficulties. Since they already have a deficient sense, you don't lose that much by a failed attempt.
Hearing is likely to be the first since it feels the simplest. Ciro heard there are even already clinical applications there. TODO source.
A quote by Ciro's Teacher R.:
Sometimes, even if our end goals are too far from reality, the side effects of trying to reach them can have meaningful impact.
If the goals are not ambitious enough, you risk not even having useful side effects to show in the end!
By doing the prerequisites of the impossible goal you desire, maybe the next generation will be able to achieve it.
This is basically why Ciro Santilli has contributed to Stack Overflow: it happened while he was doing his overly ambitious projects and noticed that all kinds of basic prerequisites were not well explained anywhere.
This is especially effective when you use backward design, because then you will go "down the dependency graph of prerequisites" and smoothen out any particularly inefficient points that you come across.
Going into such productive procrastination is also known informally as yak shaving.
There are of course countless examples of such events:
The danger of this approach is of course spending too much time on stuff that will not be done enough times to be worth it, as highlighted by several xkcds:
Figure 1. xkcd 974: The general problem. Source.
Figure 2. xkcd 1205: Is it worth the time. Source.
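The "is it worth the time" trade-off boils down to a one-line calculation: the time you save per run, times how often you run the task, over some horizon. A minimal sketch, assuming a five-year horizon and using made-up numbers; the function name break_even_seconds is hypothetical.

```python
def break_even_seconds(seconds_saved, times_per_day, years=5):
    """Total time saved over the horizon.

    Spending more than this on the optimization is a net loss.
    """
    return seconds_saved * times_per_day * 365 * years


# Hypothetical example: shave 5 seconds off a task done 50 times a day.
budget = break_even_seconds(5, 50)
print(budget / 86400, "days")
```

With these example numbers the budget is a little over five days of work, so a week-long yak shave would already not pay for itself.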
Skills by Ciro Santilli 40 Updated 2025-07-16
Non-technical skills were moved to: Ciro Santilli's skills.
This has not been updated since 2016 after Ciro got a job, because it is too hard to put a number on any skill.
Ciro likes to interpret this as him having "a creative personality" with the tradeoff of generally not being amazing at his well defined jobs.
Ciro is a high flying bird scientist. As mentioned by Tommaso Fontana at zom.wtf/about/:
I'm what happened when you can't choose a single career path
Ciro is obsessed by that which is "quirky". This also often has a parallel with "naughty". He often fantasizes about an imaginary parallel between that feeling and Jobs and Wozniak's blue box.
Ciro's natural fight-or-flight response is to hide in a little corner, and try to solve the problem out. Then get distracted and start procrastinating. And then he tries to solve the unsolvable. Someone Ciro barely knew once told him quite correctly:
In the event of war, you would be the type that hides away and makes the bombs.
This is also perhaps why Ciro likes prison decks in Magic: The Gathering. You just sit on your corner, making yourself safer and safer, until the opponent can't do you any harm and concedes.
There are of course infinitely many videos on the "entrepreneurial mindset" online, and it is impossible to know if they are bullshit, or if everyone just feels like that, but OK, just let Ciro feel that he is specially creative, will you?
In the words of Rob Pike[ref]:
mostly building weird stuff no one uses, but occasionally getting it right, such as with UTF-8 and Go
Video 1. What Predicts Academic Ability? by Jordan B Peterson (2017). Source. Good quotes:
Creative people continuously step outside of the domain of evaluation structures
and:
If you are creative and you go off on tangents all the time, there's some probability that one of those tangents is going to be exactly what is needed at the time, and you are going to become hyper-successful as a consequence
[but the probability of that being the right time and place for the idea is extraordinarily low]
The sensible thing to tell anybody is "you shouldn't do it, your probability of success is so low, that it's better to just do something sensible".
But the problem with that, is that creative people can't do that, because they are creative. A creative person who isn't being creative, they just wither and die.
Which brings Here's to the crazy ones to mind.
Ciro also once heard a story, likely apocryphal, but which nonetheless resonated with him, that went something like this (TODO find source, Google wasn't helping, stuff that happened before website as usual):
The newly hired manager of some subsection of DuPont (or some other gigantic chemical company) came into the office, and found a chemical engineer, completely drunk in the middle of the day.
Outraged, the manager searched for his colleagues, who explained:
Ah, don't mind John (or some other name), the guy invented Teflon (or some other substance) which accounted for 20% of our revenue last year. Even if he does not do anything else in his entire career, his salary won't make any difference compared to those gains, and we take the chance that he might invent something else later.
Ciro likes this story because although he does not drink, he feels his mind works in a related way. Often, when there is something really hard that he knows needs doing, he hides and distracts himself with less important tasks, or by watching crap on YouTube, because he knows that the hard task will hurt his mind. Then one day he wakes up and says: OK, fuck it, let's do it, and does it.
Once Ciro got a performance review from a colleague that said:
If Ciro spent as much effort on his job as he does on side projects, he'd be the most amazing worker.
This is closely related to effortless effort.
Yes, low conscientiousness, give it to me.
Video 2. And I am not and never have been 'familiar' scene from The Big Short (2015). Source.
People want an authority to tell them how to value things, but they choose this authority not based on facts or results. They choose it because it seems authoritative and familiar. And I am not and never have been familiar.
blog.sbensu.com/posts/high-variance-management/ High Variance Management:
Like movies, software projects have parts that require high variance and parts that don't. For most projects, the logging system can be off-the-shelf and predictable. But core parts of the product that require novel design should be as good as they can be.
Ciro's ideal city to live in contains the following in order of decreasing importance:
Could California be Ciro's Mecca?

Pinned article: Introduction to the OurBigBook Project

Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We believe that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
We have two killer features:
  1. topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculus
    Articles of different users are sorted by upvote within each article page. This feature is a bit like:
    • a Wikipedia where each user can have their own version of each article
    • a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
    This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.
    Figure 1. Screenshot of the "Derivative" topic page. View it live at: ourbigbook.com/go/topic/derivative
  2. local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:
    This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
    Figure 2. You can publish local OurBigBook lightweight markup files to either https://OurBigBook.com or as a static website.
    Figure 3. Visual Studio Code extension installation.
    Figure 4. Visual Studio Code extension tree navigation.
    Figure 5. Web editor. You can also edit articles on the Web editor without installing anything locally.
    Video 3. Edit locally and publish demo. Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.
    Video 4. OurBigBook Visual Studio Code extension editing and navigation demo. Source.
  3. https://raw.githubusercontent.com/ourbigbook/ourbigbook-media/master/feature/x/hilbert-space-arrow.png
  4. Infinitely deep tables of contents:
    Figure 6. Dynamic article tree with infinitely deep table of contents.
    Descendant pages can also show up as toplevel e.g.: ourbigbook.com/cirosantilli/chordate-subclade
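The topics feature described above can be pictured as a simple grouping: articles by different users that share a title land in the same topic, and within a topic they are ordered by upvotes. The following is only a conceptual sketch with made-up data; it is not the OurBigBook implementation, and the function name group_by_topic and the lowercase-title key are assumptions for illustration.

```python
from collections import defaultdict


def group_by_topic(articles):
    """Group (author, title, upvotes) tuples by title, best-voted first."""
    topics = defaultdict(list)
    for author, title, upvotes in articles:
        # Assumption: titles are matched case-insensitively.
        topics[title.lower()].append((author, upvotes))
    for entries in topics.values():
        entries.sort(key=lambda entry: entry[1], reverse=True)
    return dict(topics)


# Hypothetical users and vote counts.
articles = [
    ("alice", "Derivative", 3),
    ("bob", "derivative", 7),
    ("carol", "Integral", 1),
]
topics = group_by_topic(articles)
print(topics["derivative"])
```

A reader landing on the "derivative" topic would see bob's article first because it has the most upvotes.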
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach out to us for any help or suggestions: docs.ourbigbook.com/#contact