Visual Recognition Models for Scenes of Partial Object Occlusion

dc.contributor.advisor

Collins, Leslie M

dc.contributor.author

Kassaw, Kaleb

dc.date.accessioned

2025-07-02T19:03:34Z

dc.date.available

2025-07-02T19:03:34Z

dc.date.issued

2025

dc.department

Electrical and Computer Engineering

dc.description.abstract

Visual recognition models, including deep neural networks such as convolutional neural networks and Vision Transformers, achieve near-human classification accuracy on general image recognition datasets, but this accuracy has been shown to be greatly reduced when images contain partial occlusion, or coverage of classifiable objects from the view of a camera. We create the Image Recognition Under Occlusion (IRUO) dataset and benchmark, a large-scale image benchmark. IRUO contains tens of thousands of images with labeled levels of real partial occlusion, comparisons of both modern general-purpose deep learning models and models with methods to address partial occlusion, and a human study with 20 participants to estimate maximum possible accuracy on occluded images. We find that Vision Transformer-based models outperform convolutional models on occluded images, and that existing methods to address occlusion robustness have severe limitations on a large and diverse dataset such as IRUO. However, Vision Transformer models demonstrate limitations compared to humans when classifying images containing diffuse occlusion, or occlusion that is sparse and discontinuous in space. We create efficient prepended embeddings designed to filter out diffuse occlusion, achieving better classification accuracy than base Vision Transformer models on images containing diffuse occlusion, while minimizing computational overhead compared to these models.

dc.identifier.uri

https://hdl.handle.net/10161/32731

dc.rights.uri

https://creativecommons.org/licenses/by-nc-nd/4.0/

dc.subject

Computer engineering

dc.subject

Electrical engineering

dc.subject

Computer science

dc.subject

computer vision

dc.subject

machine learning

dc.subject

occlusion

dc.title

Visual Recognition Models for Scenes of Partial Object Occlusion

dc.type

Dissertation

duke.embargo.months

0.01

duke.embargo.release

2025-07-08

Files

Original bundle

Name:
Kassaw_duke_0066D_18441.pdf
Size:
11.93 MB
Format:
Adobe Portable Document Format