Contains several computer vision models, e.g. ResNet, all of them including pre-trained versions on some dataset, which is quite sweet.
Documentation: pytorch.org/vision/stable/index.html
pytorch.org/vision/0.13/models.html has a minimal runnable example adapted to python/pytorch/resnet_demo.py.
That example uses a ResNet pre-trained on the COCO dataset to do some inference, tested on Ubuntu 22.10:
This first downloads the model, which is currently 167 MB.
cd python/pytorch
wget -O resnet_demo_in.jpg https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Rooster_portrait2.jpg/400px-Rooster_portrait2.jpg
./resnet_demo.py resnet_demo_in.jpg resnet_demo_out.jpg
We know it is COCO because of the docs: pytorch.org/vision/0.13/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn_v2.html which explains that
is an alias for:
FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT
FasterRCNN_ResNet50_FPN_V2_Weights.COCO_V1
The runtime is relatively slow on P51, about 4.7s.
After it finishes, the program prints the recognized classes:
so we get the expected
['bird', 'banana']
bird
, but also the more intriguing banana
.By looking at the output image with bounding boxes, we understand where the banana came from!
Articles by others on the same topic
There are currently no matching articles.