The goal of laconic image classification is for a model to correctly classify an image using the smallest amount of information (entropy) possible. We compare four machine classification models and humans to see which can classify images with the least information. We consider four types of information reduction: crop, colour, resolution, and all three combined. You can browse and download the minimal images below.
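To make the three reduction types concrete, the following is a minimal sketch using Pillow; the function names and the exact reduction operators and parameters are illustrative assumptions, not necessarily those used in our pipeline. PNG-compressed size serves as the entropy estimate throughout.

```python
import io
from PIL import Image

def png_size(img: Image.Image) -> int:
    """Estimated entropy: size of the image in bytes after PNG compression."""
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getbuffer().nbytes

def reduce_resolution(img: Image.Image, factor: int) -> Image.Image:
    """Downsample by `factor`, then upsample back to the original dimensions."""
    w, h = img.size
    small = img.resize((max(1, w // factor), max(1, h // factor)), Image.BILINEAR)
    return small.resize((w, h), Image.NEAREST)

def reduce_colour(img: Image.Image, n_colours: int) -> Image.Image:
    """Quantise the image down to `n_colours` colours."""
    return img.quantize(colors=n_colours).convert("RGB")

def reduce_crop(img: Image.Image, box: tuple) -> Image.Image:
    """Keep only the (left, upper, right, lower) region of the image."""
    return img.crop(box)
```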
For state-of-the-art image classification models, we extract "minimal entropy positive images" (or simply "minimal images"): intuitively, the smallest images for which the model gives a correct classification. We use PNG-compressed file sizes as an estimate of entropy. We use off-the-shelf models trained on the ILSVRC 2012 training set, and compute minimal images from the corresponding test set. The information in an image is gradually reduced until any further reduction would cause an incorrect classification. To allow for non-expert human classification, we consider a simplified set of 20 classes. Human minimal images were computed in the other direction: starting from a fully degraded (void) image, the human user could choose either to enhance the image or to guess the label. If the wrong label was guessed, that image was skipped and a new (void) image was presented.
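The machine-side procedure can be sketched as a greedy search along one degradation dimension; `classify` and `degrade` below are hypothetical stand-ins for the real model and reduction pipeline, and the actual search over crop, colour, and resolution combinations is more involved.

```python
def minimal_image(img, classify, true_label, degrade):
    """
    Greedy search for a minimal image along one degradation dimension.
    `classify` maps an image to a predicted label; `degrade` maps an image
    to a strictly lower-entropy version, or None when no further reduction
    is possible.
    """
    assert classify(img) == true_label, "start from a correctly classified image"
    current = img
    while True:
        candidate = degrade(current)
        if candidate is None or classify(candidate) != true_label:
            # Any further reduction breaks the classification (or is
            # impossible), so `current` is minimal along this dimension.
            return current
        current = candidate
```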
The human minimal images shown above serve as a benchmark of the robustness of image classifiers under partial information, and as a yardstick for comparing human and machine performance on the task. The images are based on the ILSVRC 2012 test dataset. We propose the following simple challenge: using the ILSVRC 2012 training set for training, classify the human images from the above set into the 20 classes shown with the lowest possible (top-1) classification error.
The mapping of the 20 simplified classes to the original ImageNet classes is available here in TSV format.
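A short sketch of how the mapping might be used to score a submission follows; the assumption that the TSV has two columns (original ImageNet class, simplified class) is ours, so adjust to the actual file layout.

```python
import csv

def load_mapping(tsv_path: str) -> dict:
    """Load the class mapping, assuming two tab-separated columns:
    original ImageNet class, simplified class."""
    with open(tsv_path, newline="") as f:
        return {row[0]: row[1] for row in csv.reader(f, delimiter="\t")}

def top1_error(predicted, gold, mapping) -> float:
    """Top-1 error over the 20 simplified classes, given ImageNet-level
    predictions and gold simplified labels."""
    wrong = sum(mapping[p] != g for p, g in zip(predicted, gold))
    return wrong / len(gold)
```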
The pre-trained deep neural network models are available from the following locations.
We use images from the ILSVRC challenge: