DeepMind Lab by Ciro Santilli 40 Updated 2025-07-16
TODO get one of the games running. Instructions: github.com/deepmind/lab/blob/master/docs/users/build.md. This may helpgithub.com/deepmind/lab/issues/242: "Complete installation script for Ubuntu 20.04".
It is interesting how much overlap some of those have with Ciro's 2D reinforcement learning games
The games are 3D, but most of them are purely flat, and the 3D is just a waste of resources.
Video 1.
Human player test of DMLab-30 Collect Good Objects task by DeepMind (2018)
Source.
Video 2.
Human player test of DMLab-30 Exploit Deferred Effects task by DeepMind (2018)
Source.
Video 3.
Human player test of DMLab-30 Select Described Object task by DeepMind (2018)
Source. Some of their games involve language instructions from the use to determine the desired task, cool concept.
Video 4.
Human player test of DMLab-30 Fixed Large Map task by DeepMind (2018)
Source. They also have some maps with more natural environments.
Robert O'Callahan by Ciro Santilli 40 Updated 2025-07-16
Creator of Mozilla rr, of which Ciro Santilli is a huge fan of!
He quit Mozilla in 2016 to try and commercialize an rr extension called Pernosco.
But as of 2022, he advertised himself as part of "Google Research", so maybe that went under, sample source: archive.ph/o9622. TODO when did he start? There's apparently an unrelated homonym: www.linkedin.com/in/rob-ocallahan/
He's apparently very religious, and very New Zelandish, twitter.com/rocallahan auto-describes:
Christian. Repatriate Kiwi.
Terry A. Davis and D. Richard Hipp come to mind. One is tempted to speculate a correlation even, the proportion amongst systems programmers feels so much higher than in other areas of programming! Maybe it is because you have to be a God to do it in the first place.
Video 1.
Robert O'Callahan interview by Toby Ho (2022)
Source.
Let's run on this Imagenet10 subset called Imagenette.
First ensure that you get the dummy test data run working as per MLperf v2.1 ResNet.
Next, in the imagenette2 directory, first let's create a 224x224 scaled version of the inputs as required by the benchmark at mlcommons.org/en/inference-datacenter-21/:
#!/usr/bin/env bash
rm -rf val224x224
mkdir -p val224x224
for syndir in val/*: do
  syn="$(dirname $syndir)"
  for img in "$syndir"/*; do
    convert "$img" -resize 224x224 "val224x224/$syn/$(basename "$img")"
  done
done
and then let's create the val_map.txt file to match the format expected by MLPerf:
#!/usr/bin/env bash
wget https://gist.githubusercontent.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57/raw/aa66dd9dbf6b56649fa3fab83659b2acbf3cbfd1/map_clsloc.txt
i=0
rm -f val_map.txt
while IFS="" read -r p || [ -n "$p" ]; do
  synset="$(printf '%s\n' "$p" | cut -d ' ' -f1)"
  if [ -d "val224x224/$synset" ]; then
    for f in "val224x224/$synset/"*; do
      echo "$f $i" >> val_map.txt
    done
  fi
  i=$((i + 1))
done < <( sort map_clsloc.txt )
then back on the mlperf directory we download our model:
wget https://zenodo.org/record/4735647/files/resnet50_v1.onnx
and finally run!
DATA_DIR=/mnt/sda3/data/imagenet/imagenette2 time ./run_local.sh onnxruntime resnet50 cpu --accuracy
which gives on P51:
TestScenario.SingleStream qps=164.06, mean=0.0267, time=23.924, acc=87.134%, queries=3925, tiles=50.0:0.0264,80.0:0.0275,90.0:0.0287,95.0:0.0306,99.0:0.0401,99.9:0.0464
where qps presumably means "querries per second". And the time results:
446.78user 33.97system 2:47.51elapsed 286%CPU (0avgtext+0avgdata 964728maxresident)k
The time=23.924 is much smaller than the time executable because of some lengthy pre-loading (TODO not sure what that means) that gets done every time:
INFO:imagenet:loaded 3925 images, cache=0, took=52.6sec
INFO:main:starting TestScenario.SingleStream
Let's try on the GPU now:
DATA_DIR=/mnt/sda3/data/imagenet/imagenette2 time ./run_local.sh onnxruntime resnet50 gpu --accuracy
which gives:
TestScenario.SingleStream qps=130.91, mean=0.0287, time=29.983, acc=90.395%, queries=3925, tiles=50.0:0.0265,80.0:0.0285,90.0:0.0405,95.0:0.0425,99.0:0.0490,99.9:0.0512
455.00user 4.96system 1:59.43elapsed 385%CPU (0avgtext+0avgdata 975080maxresident)k
TODO lower qps on GPU!
MNIST database by Ciro Santilli 40 Updated 2025-07-16
70,000 28x28 grayscale (1 byte per pixel) images of hand-written digits 0-9, i.e. 10 categories. 60k are considered training data, 10k are considered for test data.
Playing with it is the de-facto computer vision hello world.
It was on this dataset that Yann LeCun made great progress with the LeNet model. Running LeNet on MNIST has to be the most classic computer vision thing ever. See e.g. activatedgeek/LeNet-5 for a minimal and modern PyTorch educational implementation.
But it is important to note that as of the 2010's, the benchmark had become too easy for many applications. It is perhaps fair to say that the next big dataset revolution of the same importance was with ImageNet.
The dataset could be downloaded from yann.lecun.com/exdb/mnist/ but as of March 2025 it was down and seems to have broken from time to time randomly, so Wayback Machine to the rescue:
wget \
 https://web.archive.org/web/20120828222752/http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz \
 https://web.archive.org/web/20120828182504/http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz \
 https://web.archive.org/web/20240323235739/http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz \
 https://web.archive.org/web/20240328174015/http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
but doing so is kind of pointless as both files use some crazy single-file custom binary format to store all images and labels. OMG!
Figure 1.
MNIST image 1 of a '0'
.
Figure 2.
MNIST image 21 of a '0'
.
Figure 3.
MNIST image 3 of a '1'
.
Includes its own copy of sqlite3, you don't use the system one, which is good to ensure compatibility. The version is shown at: github.com/mapbox/node-sqlite3/blob/918052b538b0effe6c4a44c74a16b2749c08a0d2/deps/common-sqlite.gypi#L3 SQLite source is tracked compressed in-tree: github.com/mapbox/node-sqlite3/blob/918052b538b0effe6c4a44c74a16b2749c08a0d2/deps/sqlite-autoconf-3360000.tar.gz horrendous. This explains why it takes forever to clone that repository. People who don't believe in git submodules, there's even an official Git mirror at: github.com/sqlite/sqlite
It appears to spawn its own threads via its C extension (since JavaScript is single threaded and and SQLite is not server-based), which allows for parallel queries using multiple threads: github.com/mapbox/node-sqlite3/blob/v5.0.2/src/threading.h
As of 2021, this had slumped back a bit, as maintainers got tired. Unmerged pull requests started piling more, and better-sqlite3 Node.js package started pulling ahead a little.
Cocos2d by Ciro Santilli 40 Updated 2025-07-16
Ciro Santilli considered this as the basis for Ciro's 2D reinforcement learning games, but ultimately decided it was a bit too messy. Nice overall though.

Pinned article: Introduction to the OurBigBook Project

Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
We have two killer features:
  1. topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculus
    Articles of different users are sorted by upvote within each article page. This feature is a bit like:
    • a Wikipedia where each user can have their own version of each article
    • a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
    This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.
    Figure 1.
    Screenshot of the "Derivative" topic page
    . View it live at: ourbigbook.com/go/topic/derivative
  2. local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:
    This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
    Figure 2.
    You can publish local OurBigBook lightweight markup files to either https://OurBigBook.com or as a static website
    .
    Figure 3.
    Visual Studio Code extension installation
    .
    Figure 4.
    Visual Studio Code extension tree navigation
    .
    Figure 5.
    Web editor
    . You can also edit articles on the Web editor without installing anything locally.
    Video 3.
    Edit locally and publish demo
    . Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.
    Video 4.
    OurBigBook Visual Studio Code extension editing and navigation demo
    . Source.
  3. https://raw.githubusercontent.com/ourbigbook/ourbigbook-media/master/feature/x/hilbert-space-arrow.png
  4. Infinitely deep tables of contents:
    Figure 6.
    Dynamic article tree with infinitely deep table of contents
    .
    Descendant pages can also show up as toplevel e.g.: ourbigbook.com/cirosantilli/chordate-subclade
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact