14 million images, more than 20k categories, typically denoting prominent objects in the image, either common daily objects, or a wide range of animals. About 1 million of them also have bounding boxes for the objects.
Each image appears to have a single label associated with it. Care must have been taken somehow with categories, since some images contain several possible objects, e.g. a person and some object.
In practice however, the ILSVRC subset is more commonly used.
Official project page: www.image-net.org/
The data license is restrictive and forbids commercial usage: www.image-net.org/download.php.
The categories are all part of WordNet, which means that there are several parent/child categories such as dog vs type of dog available. ImageNet1k only appears to have leaf nodes however (i.e. no "dog" label, just specific types of dog).
ImageNet Large Scale Visual Recognition Challenge dataset
Updated 2024-12-15
Subset of ImageNet. About 167.62 GB in size according to www.kaggle.com/competitions/imagenet-object-localization-challenge/data.
Contains 1,281,167 training images and exactly 1k categories, which is why this dataset is also known as ImageNet1k: datascience.stackexchange.com/questions/47458/what-is-the-difference-between-imagenet-and-imagenet1k-how-to-download-it
www.kaggle.com/competitions/imagenet-object-localization-challenge/overview clarifies a bit further how the categories are inter-related according to WordNet relationships:
The 1000 object categories contain both internal nodes and leaf nodes of ImageNet, but do not overlap with each other.
image-net.org/challenges/LSVRC/2012/browse-synsets.php lists all 1k labels with their WordNet IDs:
n02119789: kit fox, Vulpes macrotis
n02100735: English setter
n02096294: Australian terrier
There is a bug on that page however towards the middle:
n03255030: dumbbell
href="ht:
n02102040: English springer, English springer spaniel
and there is one missing label if we ignore that dummy href= line. A thing of beauty!
Also, the lines are not sorted by synset. If we sort them, then the first three lines are:
n01440764: tench, Tinca tinca
n01443537: goldfish, Carassius auratus
n01484850: great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
gist.github.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57 has lines of type:
n02119789 1 kit_fox
n02100735 2 English_setter
n02110185 3 Siberian_husky
and is therefore numbered in the exact same order as image-net.org/challenges/LSVRC/2012/browse-synsets.php
gist.github.com/yrevar/942d3a0ac09ec9e5eb3a lists all 1k labels as a plaintext file with their benchmark IDs:
{0: 'tench, Tinca tinca',
1: 'goldfish, Carassius auratus',
2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
and is therefore numbered in the sorted order of image-net.org/challenges/LSVRC/2012/browse-synsets.php
The official line numbering in-benchmark-data can be seen at LOC_synset_mapping.txt, e.g. www.kaggle.com/competitions/imagenet-object-localization-challenge/data?select=LOC_synset_mapping.txt:
n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
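As a rough sketch of how that file can be turned into a class index, the snippet below parses lines of the form shown above. It assumes, based on the `{0: 'tench, Tinca tinca', ...` numbering seen earlier, that the class ID is the 0-based line position. The `SAMPLE` text is just the three lines quoted here, and `load_synset_mapping` is a hypothetical helper name.

```python
# The three lines quoted above; a real run would read LOC_synset_mapping.txt.
SAMPLE = """\
n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
"""


def load_synset_mapping(lines):
    """Parse 'wnid label...' lines into parallel lists of IDs and labels."""
    wnids, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The WordNet ID is the first token; the rest of the line is the label.
        wnid, label = line.split(" ", 1)
        wnids.append(wnid)
        labels.append(label)
    return wnids, labels


wnids, labels = load_synset_mapping(SAMPLE.splitlines())

# Assumption: the class index is the 0-based position of the line in the file.
wnid_to_index = {wnid: i for i, wnid in enumerate(wnids)}

for wnid, label in zip(wnids, labels):
    print(wnid_to_index[wnid], wnid, label)
```

With the real file, the same function could be fed `open("LOC_synset_mapping.txt")` directly, since it only iterates over lines.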
huggingface.co/datasets/imagenet-1k also has some useful metrics on the splits:
- train: 1,281,167 images, 145.7 GB zipped
- validation: 50,000 images, 6.67 GB zipped
- test: 100,000 images, 13.5 GB zipped
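For a quick sense of proportions, the figures above can be added up as follows. This is plain arithmetic on the numbers listed here and nothing is downloaded; the `splits` dictionary is just a convenient way to hold them.

```python
# (image count, zipped size in GB) per split, copied from the list above.
splits = {
    "train": (1_281_167, 145.7),
    "validation": (50_000, 6.67),
    "test": (100_000, 13.5),
}

total_images = sum(count for count, _ in splits.values())
total_gb = sum(size for _, size in splits.values())

for name, (count, size) in splits.items():
    share = count / total_images
    print(f"{name}: {count:,} images ({share:.1%} of total), {size} GB zipped")

print(f"total: {total_images:,} images, {total_gb:.2f} GB zipped")
```

The zipped total is close to, but not the same as, the 167.62 GB quoted from Kaggle earlier. The two figures describe different packagings of the data, so they need not match exactly.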