Under review · 2026

From Model Uncertainty to Human Attention:
Localization-Aware Visual Cues for Scalable Annotation Review

Moussa Kassem Sbeyti¹,† · Joshua Holstein²,† · Philipp Spitzer² · Nadja Klein¹ · Gerhard Satzger²

¹Scientific Computing Center, Karlsruhe Institute of Technology  ·  ²Institute for Information Systems, Karlsruhe Institute of Technology
†Equal contribution. Either author may list their name first.

TL;DR. Surfacing per-box localization uncertainty in the labeling interface redirects annotator attention toward boxes the detector mislocalized. Across 120 participants and 1,800 trials, this yielded a label-quality gain of +0.70 percentage points in mIoU and a 7.2% speed-up, with both effects growing with image difficulty.

Abstract

High-quality labeled data is essential for training robust machine learning models, yet obtaining annotations at scale remains expensive. AI-assisted annotation has therefore become standard in large-scale labeling workflows. However, in tasks such as object detection, where each prediction carries two independent components (a class label and spatial boundaries), a model may classify an object with high confidence while mislocalizing it. Existing AI-assisted workflows offer annotators no signal about where spatial errors are most likely; without such guidance, annotators may systematically under-inspect subtly misplaced boxes.

We address this by studying the effect of visualizing spatial uncertainty via a purpose-built interface. In a controlled study with 120 participants, those receiving uncertainty cues achieve higher label quality while being faster overall. A box-level analysis confirms that the cues redirect annotator effort toward high-uncertainty predictions and away from well-localized boxes. These findings establish localization uncertainty as a lever to improve human-in-the-loop annotation.
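To make this failure mode concrete, the sketch below uses hypothetical coordinates and a hypothetical confidence score (not data from the study) to show how a box can pair high classification confidence with mediocre intersection over union (IoU), the localization measure underlying the mIoU numbers above.

def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) pixel format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

# Hypothetical example: the detector labels this box "car" with score 0.97,
# yet the box is shifted relative to the ground truth.
ground_truth = (100, 100, 200, 180)
prediction = (120, 105, 220, 185)
print(f"IoU = {iou(ground_truth, prediction):.2f}")  # IoU = 0.60: confident class, sloppy box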

Interactive viewer

Step through the 97 KITTI images used in the experiment and compare the three annotation layers reported in the paper. Toggle layers on or off, jump between difficulty bins, or use the ←/→ arrow keys to navigate.

What am I looking at?
  • Original KITTI labels are the boxes that ship with the dataset. They are sometimes imprecise, missing for occluded or distant objects, or mislabelled.
  • Detector predictions come from the probabilistic EfficientDet-D0 model. Toggle “Colour by uncertainty” to map the per-coordinate aleatoric localization uncertainty onto the box border (blue = certain, red = uncertain); a rough sketch of this mapping follows the list.
  • Re-annotated ground truth is the gold standard produced by three of the authors, each of whom independently re-labelled a third of the experimental pool. It corrects 41 erroneous labels and adds 430 missed annotations.
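As a rough illustration of how such a colour mapping can work, here is a minimal Python sketch. The function name, the averaging over the four coordinates, and the sigma_max normalization constant are assumptions for illustration, not the viewer's actual code; the per-coordinate standard deviations are assumed to come from the detector's probabilistic regression head.

def uncertainty_to_rgb(sigmas, sigma_max):
    """Map the four per-coordinate aleatoric standard deviations of one box
    (x1, y1, x2, y2) to an RGB border colour on a blue-to-red gradient:
    blue = certain, red = uncertain."""
    # Average the coordinate uncertainties and normalize to [0, 1].
    # sigma_max is an assumed dataset-level scale, not a value from the paper.
    t = min(max(sum(sigmas) / len(sigmas) / sigma_max, 0.0), 1.0)
    return (round(255 * t), 0, round(255 * (1 - t)))

print(uncertainty_to_rgb([0.5, 0.4, 0.6, 0.5], sigma_max=5.0))  # (26, 0, 230): near-blue, likely well localized
print(uncertainty_to_rgb([4.0, 3.5, 4.5, 5.0], sigma_max=5.0))  # (217, 0, 38): near-red, worth inspecting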

Paper

A preprint PDF will be linked here once available. In the meantime, please cite using the BibTeX entry below.

@unpublished{KaSbHoSpNaSa2026,
  title  = {From Model Uncertainty to Human Attention:
            Localization-Aware Visual Cues for Scalable Annotation Review},
  author = {Kassem Sbeyti, Moussa and Holstein, Joshua and Spitzer, Philipp
            and Klein, Nadja and Satzger, Gerhard},
  note   = {Manuscript under review},
  year   = {2026}
}

Data & code

The KITTI subset shipped with this repository is redistributed under the CC BY-NC-SA 3.0 license. For commercial use, please obtain the data directly from the KITTI website. See LICENSE-DATA.md in the repository for the full attribution.