Visual Recognition Models for Scenes of Partial Object Occlusion
| dc.contributor.advisor | Collins, Leslie M | |
| dc.contributor.author | Kassaw, Kaleb | |
| dc.date.accessioned | 2025-07-02T19:03:34Z | |
| dc.date.available | 2025-07-02T19:03:34Z | |
| dc.date.issued | 2025 | |
| dc.department | Electrical and Computer Engineering | |
| dc.description.abstract | Visual recognition models, including deep neural networks such as convolutional neural networks and Vision Transformers, achieve near-human classification accuracy on general image recognition datasets, but this accuracy has been shown to be greatly reduced when images contain partial occlusion, or coverage of classifiable objects from the view of a camera. We create the Image Recognition Under Occlusion (IRUO) dataset and benchmark, a large-scale image benchmark. IRUO contains tens of thousands of images with labeled levels of real partial occlusion, comparisons of both modern general-purpose deep learning models and models with methods to address partial occlusion, and a human study with 20 participants to estimate maximum possible accuracy on occluded images. We find that Vision Transformer-based models outperform convolutional models on occluded images, and that existing methods to address occlusion robustness have severe limitations on a large and diverse dataset such as IRUO. However, Vision Transformer models demonstrate limitations compared to humans when classifying images containing diffuse occlusion, or occlusion that is sparse and discontinuous in space. We create efficient prepended embeddings designed to filter out diffuse occlusion, achieving better classification accuracy than base Vision Transformer models on images containing diffuse occlusion, while minimizing computational overhead compared to these models. | |
| dc.identifier.uri | ||
| dc.rights.uri | ||
| dc.subject | Computer engineering | |
| dc.subject | Electrical engineering | |
| dc.subject | Computer science | |
| dc.subject | computer vision | |
| dc.subject | machine learning | |
| dc.subject | occlusion | |
| dc.title | Visual Recognition Models for Scenes of Partial Object Occlusion | |
| dc.type | Dissertation | |
| duke.embargo.months | 0.01 | |
| duke.embargo.release | 2025-07-08 |
Files
Original bundle
- Name: Kassaw_duke_0066D_18441.pdf
- Size: 11.93 MB
- Format: Adobe Portable Document Format