{c}

This is the one used on <MLperf v2.1 ResNet>, likely one of the most popular choices out there.

2017 challenge subset:
* train: 118k images, 18GB
* validation: 5k images, 1GB
* test: 41k images, 6GB


COCO 2017

{c}
{tag=Good}
{wiki}

The most important thing this project provides appears to be the `.onnx` file format, which represents <ANN models>, pre-trained or not.

<Deep learning frameworks> can then output such `.onnx` files for interchangeability and serialization.

Some examples:
* <activatedgeek LeNet-5> produces a trained `.onnx` from <PyTorch>
* <MLperf v2.1 ResNet> can use `.onnx` as a pre-trained model

The cool thing is that <ONNX> can then run <inference> in a uniform manner on a variety of devices without installing the <deep learning framework> that was used to train the model. It's a bit like having a kind of portable executable. Neat.


ONNX

{c}

* <torchvision ResNet>
* <MLperf v2.1 ResNet> contains a pre-trained <ResNet> <ONNX> at https://zenodo.org/record/4735647/files/resnet50_v1.onnx for its inference benchmark. We've tested it at: <Run MLperf v2.1 ResNet on Imagenette>.


ResNet implementation

{c}

Let's run on this Imagenet10 subset called <Imagenette>.

First ensure that you get the dummy test data run working as per <MLperf v2.1 ResNet>.

Next, in the `imagenette2` directory, first let's create a 224x224 scaled version of the inputs as required by the benchmark at https://mlcommons.org/en/inference-datacenter-21/[]:
``
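#!/usr/bin/env bash
# Resize every validation image into val224x224/<synset>/.
rm -rf val224x224
mkdir -p val224x224
for syndir in val/*; do
  # The synset ID is the name of the class directory under val/.
  syn="$(basename "$syndir")"
  mkdir -p "val224x224/$syn"
  for img in "$syndir"/*; do
    # Without a trailing "!" ImageMagick keeps the aspect ratio and fits within 224x224.
    convert "$img" -resize 224x224 "val224x224/$syn/$(basename "$img")"
  done
done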
``
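Before moving on, it may be worth checking that the resize loop produced one output file per input file in every class directory. The following is only a sketch: the function name `check_resized` is made up for this note, and it only compares file counts, not image sizes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare per-class file counts between two trees.
# Prints one line per class whose counts differ.
# Returns non-zero if any class differs.
check_resized() {
  local src="$1" dst="$2" status=0 d syn n_src n_dst
  for d in "$src"/*/; do
    [ -d "$d" ] || continue
    syn="$(basename "$d")"
    # $(( ... )) strips the padding that some wc implementations add.
    n_src=$(( $(find "$src/$syn" -type f | wc -l) ))
    n_dst=$(( $(find "$dst/$syn" -type f 2>/dev/null | wc -l) ))
    if [ "$n_src" -ne "$n_dst" ]; then
      echo "$syn: $n_src source files, $n_dst resized"
      status=1
    fi
  done
  return "$status"
}
```

From the `imagenette2` directory this would be called as `check_resized val val224x224`, which prints nothing when all counts match.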
and then let's create the `val_map.txt` file to match the format expected by MLPerf:
``
#!/usr/bin/env bash
wget https://gist.githubusercontent.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57/raw/aa66dd9dbf6b56649fa3fab83659b2acbf3cbfd1/map_clsloc.txt
i=0
rm -f val_map.txt
while IFS="" read -r p || [ -n "$p" ]; do
  synset="$(printf '%s\n' "$p" | cut -d ' ' -f1)"
  if [ -d "val224x224/$synset" ]; then
    for f in "val224x224/$synset/"*; do
      echo "$f $i" >> val_map.txt
    done
  fi
  i=$((i + 1))
done < <( sort map_clsloc.txt )
``
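A quick way to sanity check the generated `val_map.txt` is to count how many images ended up under each label. This is just a sketch, and `label_histogram` is a name invented here, not something from MLPerf. It assumes the label is the last whitespace-separated field of each line, as in the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "label count" pairs, sorted numerically by label.
label_histogram() {
  awk '{ count[$NF]++ } END { for (l in count) print l, count[l] }' "$1" | sort -n
}
```

Running `label_histogram val_map.txt` on Imagenette should presumably print ten lines, one per class, since the dataset is a ten class subset.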
then back on the mlperf directory we download our model:
``
wget https://zenodo.org/record/4735647/files/resnet50_v1.onnx
``
and finally run!
``
DATA_DIR=/mnt/sda3/data/imagenet/imagenette2 time ./run_local.sh onnxruntime resnet50 cpu --accuracy
``
which gives on <Ciro Santilli's hardware/P51>:
``
TestScenario.SingleStream qps=164.06, mean=0.0267, time=23.924, acc=87.134%, queries=3925, tiles=50.0:0.0264,80.0:0.0275,90.0:0.0287,95.0:0.0306,99.0:0.0401,99.9:0.0464
``
where `qps` presumably means "queries per second". And the `time` results:
``
446.78user 33.97system 2:47.51elapsed 286%CPU (0avgtext+0avgdata 964728maxresident)k
``
The `time=23.924` reported by the benchmark is much smaller than the `2:47.51` elapsed reported by the `time` executable because of some lengthy pre-loading (TODO not sure what exactly it does) that gets done on every run:
``
INFO:imagenet:loaded 3925 images, cache=0, took=52.6sec
INFO:main:starting TestScenario.SingleStream
``
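The guess that `qps` means queries per second can be checked against the numbers in the result line: dividing `queries` by `time` should give back `qps`. The sketch below does that with awk. `recompute_qps` is a name made up for this note, and the parsing assumes the comma-separated `key=value` layout seen in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recompute queries / time from a result line
# and print it with two decimals.
recompute_qps() {
  printf '%s\n' "$1" | awk '{
    for (i = 1; i <= NF; i++) {
      gsub(/,$/, "", $i)     # drop the trailing comma of each field
      split($i, kv, "=")     # kv[1] is the key, kv[2] the value
      if (kv[1] == "queries") q = kv[2]
      if (kv[1] == "time") t = kv[2]
    }
    printf "%.2f\n", q / t
  }'
}

# CPU result line from above: 3925 / 23.924
recompute_qps 'TestScenario.SingleStream qps=164.06, mean=0.0267, time=23.924, acc=87.134%, queries=3925, tiles=50.0:0.0264,80.0:0.0275,90.0:0.0287,95.0:0.0306,99.0:0.0401,99.9:0.0464'
# prints 164.06
```

The result matches the reported `qps=164.06`, which is consistent with `qps` being `queries / time`, with `time` covering only the benchmark phase and not the image loading.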

Let's try on the <GPU> now:
``
DATA_DIR=/mnt/sda3/data/imagenet/imagenette2 time ./run_local.sh onnxruntime resnet50 gpu --accuracy
``
which gives:
``
TestScenario.SingleStream qps=130.91, mean=0.0287, time=29.983, acc=90.395%, queries=3925, tiles=50.0:0.0265,80.0:0.0285,90.0:0.0405,95.0:0.0425,99.0:0.0490,99.9:0.0512
455.00user 4.96system 1:59.43elapsed 385%CPU (0avgtext+0avgdata 975080maxresident)k
``
TODO: why is `qps` lower on the GPU than on the CPU?


Run MLperf v2.1 ResNet on Imagenette

{tag=ResNet}

Instructions at:
* https://github.com/mlcommons/inference/blob/v2.1/vision/classification_and_detection
* https://github.com/mlcommons/inference/blob/v2.1/vision/classification_and_detection/GettingStarted.ipynb

<Ubuntu 22.10> setup with tiny dummy manually generated <ImageNet> and run on <ONNX>:
``
sudo apt install pybind11-dev

git clone https://github.com/mlcommons/inference
cd inference
git checkout v2.1

virtualenv -p python3 .venv
. .venv/bin/activate
pip install numpy==1.24.2 pycocotools==2.0.6 onnxruntime==1.14.1 opencv-python==4.7.0.72 torch==1.13.1

cd loadgen
CFLAGS="-std=c++14" python setup.py develop
cd -

cd vision/classification_and_detection
python setup.py develop
wget -q https://zenodo.org/record/3157894/files/mobilenet_v1_1.0_224.onnx
export MODEL_DIR="$(pwd)"
export EXTRA_OPS='--time 10 --max-latency 0.2'

tools/make_fake_imagenet.sh
DATA_DIR="$(pwd)/fake_imagenet" ./run_local.sh onnxruntime mobilenet cpu --accuracy
``

Last line of output on <Ciro Santilli's hardware/P51>, which appears to contain the benchmark results:
``
TestScenario.SingleStream qps=58.85, mean=0.0138, time=0.136, acc=62.500%, queries=8, tiles=50.0:0.0129,80.0:0.0137,90.0:0.0155,95.0:0.0171,99.0:0.0184,99.9:0.0187
``
where presumably `qps` means queries per second, and is the main result we are interested in, the more the better.

Running:
``
tools/make_fake_imagenet.sh
``
produces a tiny <ImageNet subset> with 8 images under `fake_imagenet/`.

`fake_imagenet/val_map.txt` contains:
``
val/800px-Porsche_991_silver_IAA.jpg 817
val/512px-Cacatua_moluccensis_-Cincinnati_Zoo-8a.jpg 89
val/800px-Sardinian_Warbler.jpg 13
val/800px-7weeks_old.JPG 207
val/800px-20180630_Tesla_Model_S_70D_2015_midnight_blue_left_front.jpg 817
val/800px-Welsh_Springer_Spaniel.jpg 156
val/800px-Jammlich_crop.jpg 233
val/782px-Pumiforme.JPG 285
``
where the numbers are the category indices from <ImageNet1k>. At https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a see e.g.:
* 817: 'sports car, sport car',
* 89: 'sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita',
and so on, so they are coherent with the image names. By quickly looking at the script we see that it just downloads from Wikimedia and manually creates the file.
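Based on the format seen above, a hand-made `val_map.txt` can be checked mechanically before feeding it to the benchmark. The sketch below flags any line that is not a path followed by an integer label. Two things here are assumptions of this note and not taken from the MLPerf code: the helper name `check_val_map`, and the label range 0 to 999, which is inferred from <ImageNet1k> having 1000 classes with zero-based indices as in the gist above. It also assumes file names without spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report lines that are not "<path> <integer in 0..999>".
# Returns non-zero if any such line is found.
check_val_map() {
  awk '
    NF != 2 || $2 !~ /^[0-9]+$/ || $2 + 0 > 999 {
      printf "line %d: %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}
```

Usage would be something like `check_val_map fake_imagenet/val_map.txt`, which prints nothing and returns 0 when every line looks fine.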

TODO prepare and test on the actual <ImageNet> validation set, README says:
> Prepare the imagenet dataset to come.

Since that one is undocumented, let's try the <COCO dataset> instead, which uses <COCO 2017> and is also a bit smaller. Note that this is not part of MLperf anymore since v2.1, only <ImageNet> and open images are used. But still:
``
wget https://zenodo.org/record/4735652/files/ssd_mobilenet_v1_coco_2018_01_28.onnx
DATA_DIR_BASE=/mnt/data/coco
export DATA_DIR="${DATA_DIR_BASE}/val2017-300"
mkdir -p "$DATA_DIR_BASE"
cd "$DATA_DIR_BASE"
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip val2017.zip
unzip annotations_trainval2017.zip
mv annotations val2017
cd -
cd "$(git rev-parse --show-toplevel)"
python tools/upscale_coco/upscale_coco.py --inputs "$DATA_DIR_BASE" --outputs "$DATA_DIR" --size 300 300 --format png
cd -
``

Now:
``
./run_local.sh onnxruntime mobilenet cpu --accuracy
``
fails immediately with:
``
No such file or directory: '/path/to/coco/val2017-300/val_map.txt
``
The more plausible looking:
``
./run_local.sh onnxruntime mobilenet cpu --accuracy --dataset coco-300
``
first takes a while, most likely to preprocess something, which it does only once, and then fails:
``
Traceback (most recent call last):
  File "/home/ciro/git/inference/vision/classification_and_detection/python/main.py", line 596, in <module>
    main()
  File "/home/ciro/git/inference/vision/classification_and_detection/python/main.py", line 468, in main
    ds = wanted_dataset(data_path=args.dataset_path,
  File "/home/ciro/git/inference/vision/classification_and_detection/python/coco.py", line 115, in __init__
    self.label_list = np.array(self.label_list)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (5000, 2) + inhomogeneous part.
``

TODO!


Ciro Santilli @cirosantilli 40
