14 million images, more than 20k categories, typically denoting prominent objects in the image, either common daily objects, or a wild range of animals. About 1 million of them also have bounding boxes for the objects.
Each image appears to have a single label associated to it. Care must have been taken somehow with categories, since some images contain severl possible objects, e.g. a person and some object.
In practice however, the ILSVRC subset is more commonly used.
Official project page: www.image-net.org/
The data license is restrictive and forbids commercial usage: www.image-net.org/download.php.
The categories are all part of WordNet, which means that there are several parent/child categories such as dog vs type of dog available. ImageNet1k only appears to have leaf nodes however (i.e. no "dog" label, just specific types of dog).
Subset generators:
Unfortunately, since ImageNet is a closed standard no one can upload such pre-made subsets, forcing everybody to download the full dataset, in ImageNet1k, which is huge!
An imagenet10 subset by fast.ai.
Size of full sized image version: 1.5 GB.
Contains 1,281,167 images and exactly 1k categories which is why this dataset is also known as ImageNet1k: datascience.stackexchange.com/questions/47458/what-is-the-difference-between-imagenet-and-imagenet1k-how-to-download-it
www.kaggle.com/competitions/imagenet-object-localization-challenge/overview clarifies a bit further how the categories are inter-related according to WordNet relationships:
The 1000 object categories contain both internal nodes and leaf nodes of ImageNet, but do not overlap with each other.
image-net.org/challenges/LSVRC/2012/browse-synsets.php lists all 1k labels with their WordNet IDs.
n02119789: kit fox, Vulpes macrotis
n02100735: English setter
n02096294: Australian terrier
There is a bug on that page however towards the middle:
n03255030: dumbbell
href="ht:
n02102040: English springer, English springer spaniel
and there is one missing label if we ignore that dummy href= line. A thinkg of beauty!
Also the lines are not sorted by synset, if we do then the first three lines are:
n01440764: tench, Tinca tinca
n01443537: goldfish, Carassius auratus
n01484850: great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
gist.github.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57 has lines of type:
n02119789 1 kit_fox
n02100735 2 English_setter
n02110185 3 Siberian_husky
therefore numbered on the exact same order as image-net.org/challenges/LSVRC/2012/browse-synsets.php
gist.github.com/yrevar/942d3a0ac09ec9e5eb3a lists all 1k labels as a plaintext file with their benchmark IDs.
{0: 'tench, Tinca tinca',
 1: 'goldfish, Carassius auratus',
 2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
therefore numbered on sorted order of image-net.org/challenges/LSVRC/2012/browse-synsets.php
The official line numbering in-benchmark-data can be seen at LOC_synset_mapping.txt, e.g. www.kaggle.com/competitions/imagenet-object-localization-challenge/data?select=LOC_synset_mapping.txt
n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
huggingface.co/datasets/imagenet-1k also has some useful metrics on the split:
  • train: 1,281,167 images, 145.7 GB zipped
  • validation: 50,000 images, 6.67 GB zipped
  • test: 100,000 images, 13.5 GB zipped
The official page: www.image-net.org/challenges/LSVRC/index.php points to a download link on Kaggle: www.kaggle.com/competitions/imagenet-object-localization-challenge/data Kaggle says that the size is 167.62 GB!
To download from Kaggle, create an API token on kaggle.com, which downloads a kaggle.json file then:
mkdir -p ~/.kaggle
mv ~/down/kaggle.json ~/.kaggle
python3 -m pip install kaggle
kaggle competitions download -c imagenet-object-localization-challenge
The download speed is wildly server/limited and take A LOT of hours. Also, the tool does not seem able to pick up where you stopped last time.
Another download location appears to be: huggingface.co/datasets/imagenet-1k on Hugging Face, but you have to login due to their license terms. Once you login you have a very basic data explorer available: huggingface.co/datasets/imagenet-1k/viewer/default/train.