60,000 28x28 grayscale images of hand-written digits 0-9, i.e. 10 categories.
Playing with it is the de-facto computer vision hello world.
But it is important to note that as of the 2010's, the benchmark had become too easy for many application.
The dataset can be downloaded from yann.lecun.com/exdb/mnist/:
wget \
 http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz \
 http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz \
 http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz \
 http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
but doing so is kind of pointless as both files use some crazy single-file custom binary format to store all images and labels. OMG!
Figure 1. MNIST image 1 of a '0'.
Figure 2. MNIST image 21 of a '0'.
Figure 3. MNIST image 3 of a '1'.
Same style as MNIST, but with clothes. Designed to be much harder, and more representative of modern applications, while still retaining the low resolution of MNIST for simplicity of training.