Visual Recognition Models for Scenes of Partial Object Occlusion

Date

2025

Abstract

Visual recognition models, including deep neural networks such as convolutional neural networks and Vision Transformers, achieve near-human classification accuracy on general image recognition datasets. This accuracy has been shown to drop sharply, however, when images contain partial occlusion, that is, when the object to be classified is partly hidden from the camera's view. We create the Image Recognition Under Occlusion (IRUO) dataset and benchmark, a large-scale image benchmark. IRUO comprises tens of thousands of images labeled with their level of real partial occlusion, comparisons of modern general-purpose deep learning models against models that incorporate methods to address partial occlusion, and a human study with 20 participants that estimates the maximum achievable accuracy on occluded images. We find that Vision Transformer-based models outperform convolutional models on occluded images, and that existing methods for improving occlusion robustness have severe limitations on a large and diverse dataset such as IRUO. Vision Transformer models nonetheless fall short of humans when classifying images containing diffuse occlusion, that is, occlusion that is sparse and discontinuous in space. We create efficient prepended embeddings designed to filter out diffuse occlusion. These embeddings achieve better classification accuracy than base Vision Transformer models on images containing diffuse occlusion while keeping the added computational overhead relative to those models small.
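
Because the benchmark labels each image with an occlusion level, a natural way to read its results is accuracy broken down by level. The following is a minimal sketch of that bookkeeping, not the dissertation's evaluation code. The function name, the record layout, and the level names ("none", "partial", "heavy") are assumptions made for illustration, and the toy records are invented rather than drawn from IRUO.

```python
from collections import defaultdict


def accuracy_by_occlusion(records):
    """Return classification accuracy per occlusion level.

    records: iterable of (occlusion_level, true_label, predicted_label).
    The record layout is a hypothetical one chosen for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for level, truth, pred in records:
        total[level] += 1
        if truth == pred:
            correct[level] += 1
    # Every level in `total` has at least one record, so no division by zero.
    return {level: correct[level] / total[level] for level in total}


# Hypothetical toy records; these are not data from IRUO.
toy = [
    ("none", "car", "car"),
    ("none", "dog", "dog"),
    ("partial", "car", "car"),
    ("partial", "dog", "cat"),
    ("heavy", "car", "bus"),
    ("heavy", "dog", "cat"),
]

print(accuracy_by_occlusion(toy))
```

Running the same breakdown for a model and for human participants would give the kind of per-level comparison the abstract describes, under the assumption that both are scored on the same labeled images.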

Subjects

Computer engineering, Electrical engineering, Computer science, Computer vision, Machine learning, Occlusion

Citation

Kassaw, Kaleb (2025). Visual Recognition Models for Scenes of Partial Object Occlusion. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32731.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.