yolov5-pip by Ciro Santilli 40 Updated 2025-07-16
OK, now we're talking, two liner and you get a window showing bounding box object detection from your webcam feed!
python -m pip install -U yolov5==7.0.9
yolov5 detect --source 0
The accuracy is crap for anything but people. But still. Well done. Tested on Ubuntu 22.10, P51.
Video 1.
fcakyon/yolov5-pip webcam object detection demo by Ciro Santilli (2023)
. Source.
Fashion MNIST by Ciro Santilli 40 Updated 2025-07-16
Same style as MNIST: 28x28 grayscale images, but with clothes rather than hand written digits.
It was designed to be much harder than MNIST, and more representative of modern applications, while still retaining the low resolution of MNIST for simplicity of training.
https://web.archive.org/web/20250511105702im_/https://github.com/zalandoresearch/fashion-mnist/raw/master/doc/img/fashion-mnist-sprite.png
Neil Fernandez by Ciro Santilli 40 Updated 2025-07-16
“Especially my father. He was doing most of it and he is a savoury, strong character. He has strong beliefs about the world and in himself, and he was helping me a lot, even when I was at university as an undergraduate.”
An only child, Arran was born in 1995 in Glasgow, where his parents were studying at the time. His father has Spanish lineage, having a great grandfather who was a sailor who moved from Spain to St Vincent in the Carribean. A son later left the islands for the UK where he married an English woman. Arran’s mother is Norwegian.
“My father was writing and my mother is an economist. They both worked from home which also made things easier,” Arran says.
A bit like what Ciro Santilli feels about himself!
One of the articles says his father has a PhD. TODO where did he work? What's his PhD on? Photo: www.topfoto.co.uk/asset/1357880/
www.thetimes.co.uk/article/the-everyday-genius-pxsq5c50kt9:
Neil, a political economist, attended state and private schools in Hampshire but was also taught for a period at home by his mother.
It’s strange because for most people maths is a real turn-off, yet maths is all about patterns and children of two or three love patterns. It just shows that schools are doing something seriously wrong.”
Homological stability is a concept in algebraic topology and representation theory that deals with the behavior of homological groups of topological spaces or algebraic structures as their dimensions or parameters vary. The basic idea is that for a sequence of spaces \(X_n\) (or groups, schemes, etc.), as \(n\) increases, the homological properties of these spaces become stable in a certain sense.
CIFAR-10 by Ciro Santilli 40 Updated 2025-07-16
60,000 tiny 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
TODO release date.
This dataset can be thought of as an intermediate between the simplicity of MNIST, and a more full blown ImageNet.
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/airplane1.png
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/automobile1.png
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/bird1.png
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/cat1.png
ImageNet subset by Ciro Santilli 40 Updated 2025-07-16
Subset generators:
Unfortunately, since ImageNet is a closed standard no one can upload such pre-made subsets, forcing everybody to download the full dataset, in ImageNet1k, which is huge!
Contains 1,281,167 images and exactly 1k categories which is why this dataset is also known as ImageNet1k: datascience.stackexchange.com/questions/47458/what-is-the-difference-between-imagenet-and-imagenet1k-how-to-download-it
www.kaggle.com/competitions/imagenet-object-localization-challenge/overview clarifies a bit further how the categories are inter-related according to WordNet relationships:
The 1000 object categories contain both internal nodes and leaf nodes of ImageNet, but do not overlap with each other.
image-net.org/challenges/LSVRC/2012/browse-synsets.php lists all 1k labels with their WordNet IDs.
n02119789: kit fox, Vulpes macrotis
n02100735: English setter
n02096294: Australian terrier
There is a bug on that page however towards the middle:
n03255030: dumbbell
href="ht:
n02102040: English springer, English springer spaniel
and there is one missing label if we ignore that dummy href= line. A thinkg of beauty!
Also the lines are not sorted by synset, if we do then the first three lines are:
n01440764: tench, Tinca tinca
n01443537: goldfish, Carassius auratus
n01484850: great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
gist.github.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57 has lines of type:
n02119789 1 kit_fox
n02100735 2 English_setter
n02110185 3 Siberian_husky
therefore numbered on the exact same order as image-net.org/challenges/LSVRC/2012/browse-synsets.php
gist.github.com/yrevar/942d3a0ac09ec9e5eb3a lists all 1k labels as a plaintext file with their benchmark IDs.
{0: 'tench, Tinca tinca',
 1: 'goldfish, Carassius auratus',
 2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
therefore numbered on sorted order of image-net.org/challenges/LSVRC/2012/browse-synsets.php
The official line numbering in-benchmark-data can be seen at LOC_synset_mapping.txt, e.g. www.kaggle.com/competitions/imagenet-object-localization-challenge/data?select=LOC_synset_mapping.txt
n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
huggingface.co/datasets/imagenet-1k also has some useful metrics on the split:
ImageNet1k download by Ciro Santilli 40 Updated 2025-07-16
To download from Kaggle, create an API token on kaggle.com, which downloads a kaggle.json file then:
mkdir -p ~/.kaggle
mv ~/down/kaggle.json ~/.kaggle
python3 -m pip install kaggle
kaggle competitions download -c imagenet-object-localization-challenge
The download speed is wildly server/limited and take A LOT of hours. Also, the tool does not seem able to pick up where you stopped last time.
Another download location appears to be: huggingface.co/datasets/imagenet-1k on Hugging Face, but you have to login due to their license terms. Once you login you have a very basic data explorer available: huggingface.co/datasets/imagenet-1k/viewer/default/train.

Pinned article: Introduction to the OurBigBook Project

Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
We have two killer features:
  1. topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculus
    Articles of different users are sorted by upvote within each article page. This feature is a bit like:
    • a Wikipedia where each user can have their own version of each article
    • a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
    This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.
    Figure 1.
    Screenshot of the "Derivative" topic page
    . View it live at: ourbigbook.com/go/topic/derivative
  2. local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:
    This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
    Figure 2.
    You can publish local OurBigBook lightweight markup files to either https://OurBigBook.com or as a static website
    .
    Figure 3.
    Visual Studio Code extension installation
    .
    Figure 4.
    Visual Studio Code extension tree navigation
    .
    Figure 5.
    Web editor
    . You can also edit articles on the Web editor without installing anything locally.
    Video 3.
    Edit locally and publish demo
    . Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.
    Video 4.
    OurBigBook Visual Studio Code extension editing and navigation demo
    . Source.
  3. https://raw.githubusercontent.com/ourbigbook/ourbigbook-media/master/feature/x/hilbert-space-arrow.png
  4. Infinitely deep tables of contents:
    Figure 6.
    Dynamic article tree with infinitely deep table of contents
    .
    Descendant pages can also show up as toplevel e.g.: ourbigbook.com/cirosantilli/chordate-subclade
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact