Synthetic Imagery Data Generation for Training and Testing Deep Learning Models of Object Recognition

A well-known challenge of modern high-capacity recognition models, such as deep neural networks, is their need for large quantities of real-world data for training and testing. One potential solution to this problem is synthetic data, either collected from a virtual world or generated by neural network models. The use of synthetic data to train or test recognition models has grown rapidly in recent years and has proven effective for a variety of tasks. My research investigates the utility of synthetic data for training and testing deep learning models using three classes of design strategies: i) image stylization, ii) domain randomization, and iii) meta learning.

My first set of work focuses on synthetic data generated by image stylization methods for testing autonomous driving perception systems. Such systems require robust testing under many abnormal conditions (e.g., severe weather) to ensure safety. This validation requires large amounts of testing data collected under abnormal conditions, i.e., conditions that rarely occur, which is expensive to obtain in practice. After years of on-road testing, the companies developing autonomous driving systems have collected large quantities of images under normal driving conditions; however, the available images obtained under abnormal conditions remain insufficient for accurate validation of driving algorithms. In my work, generative adversarial networks are trained to stylize existing real-world images to reflect abnormal conditions. Novel adversarial training techniques are proposed to reduce the sim-to-real gap and ensure imagery fidelity. The stylized synthetic images are applied to testing object detection models for autonomous vehicles under abnormal conditions, e.g., heavy fog and low light.
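The dissertation stylizes images with trained GANs; as a much simpler, purely illustrative sketch of the underlying idea of restyling a clear-weather image to reflect fog, the standard atmospheric scattering model I = J·t + A·(1 − t) can be applied to a clean image given a depth map (this toy model is my own simplification, not the author's GAN-based method):

```python
import numpy as np

def fogify(image, depth, beta=1.0, airlight=0.9):
    """Toy fog stylization via the atmospheric scattering model:
    I = J * t + A * (1 - t), with transmission t = exp(-beta * depth).

    image: HxWx3 float array in [0, 1] (the clear-weather image J)
    depth: HxW scene depth (arbitrary units)
    beta:  fog density; larger beta -> heavier fog
    """
    t = np.exp(-beta * depth)[..., None]       # per-pixel transmission
    return image * t + airlight * (1.0 - t)    # blend toward the airlight color

# Heavier fog (larger beta) pushes every pixel toward the airlight value.
img = np.full((2, 2, 3), 0.2)
depth = np.ones((2, 2))
light_fog = fogify(img, depth, beta=0.1)
heavy_fog = fogify(img, depth, beta=3.0)
```

A GAN-based stylizer learns this kind of mapping from data instead of assuming a physical model, which is what allows it to capture subtler abnormal-condition effects.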
The experimental results demonstrate that generative adversarial networks can generate high-fidelity synthetic images that share important characteristics with real images collected under abnormal conditions. The proposed data synthesis method can benefit automotive companies by reducing the cost and time of the necessary testing under abnormal conditions.

In my second set of work, a domain randomization method is proposed for building detection in overhead imagery, where the available real-world data is extremely scarce. The visual characteristics of overhead imagery vary tremendously with imaging conditions, environmental conditions, and geographic location. Training a robust recognition model requires a large set of training data in the target domain; however, such data is extremely expensive to obtain in practice, due to the cost of the imagery and of its manual pixel-wise labeling. In my work, an efficient framework is proposed for automatically generating large quantities of synthetic overhead imagery from virtual world simulators with randomized content and design parameters. Borrowing the key hypothesis of domain randomization, it is assumed that the simulation design space is large enough that the target domain is likely to be a subset of it. The synthetic images are used to augment the training of building detection models on overhead imagery. The experimental results show that virtual world simulators can generate synthetic images with important characteristics of real images and benefit the training of building detection models, confirming the domain randomization hypothesis. To continue this line of research, my third effort investigates meta learning methods for a more efficient data synthesis approach.
Since domain randomization methods typically incur a long training period and lead to overly conservative models, I propose a meta-learning-based method, called Neural-Adjoint Meta-Simulation (NAMS), that learns to accurately generate synthetic data in the target domain from virtual world simulators. Fast inference of the design parameters of target data is learned with a novel Neural-Adjoint technique that makes the optimization differentiable. Synthetic data in the target domain can then be efficiently generated from the inferred design parameters. I show in several synthetic and real-world experiments that, using NAMS, synthetic images with the target design can be rapidly and accurately generated for training building detection algorithms. The proposed method is amortized: after an upfront computational cost at the learning stage, the parameters of new target data can be inferred efficiently. To my knowledge, this is the first work in the data synthesis area to emphasize the amortization property, which matters in real applications because the number of unique target domains is typically large.
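The Neural-Adjoint idea of making the parameter inference differentiable can be sketched in miniature: train (or, here, simply assume) a differentiable surrogate of the simulator, then run gradient descent on the design parameters to match target observations. The toy below replaces the learned neural surrogate with a fixed linear map so the gradient is analytic; it illustrates the inference loop only, not the actual NAMS architecture:

```python
import numpy as np

# Toy stand-in for a learned differentiable surrogate of the simulator:
# maps design parameters z to observed image "features" f(z) = A z.
# In NAMS this surrogate would be a trained neural network.
A = np.array([[2.0, 0.5],
              [-1.0, 1.5]])

def forward(z):
    return A @ z

def infer_design(target, steps=500, lr=0.05):
    """Neural-adjoint-style inference: descend the squared feature error
    through the differentiable surrogate to recover design parameters."""
    z = np.zeros(2)
    for _ in range(steps):
        residual = forward(z) - target   # gradient of the loss w.r.t. f(z)
        z -= lr * (A.T @ residual)       # "backprop" through the surrogate
    return z

z_true = np.array([0.7, -0.3])           # hidden target design parameters
z_hat = infer_design(forward(z_true))    # recovered from target features
```

The amortization the abstract describes corresponds to the one-time cost of training the surrogate; afterwards, each new target domain only requires this cheap gradient-descent inference.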





Yu, Handi (2022). Synthetic Imagery Data Generation for Training and Testing Deep Learning Models of Object Recognition. Dissertation, Duke University. Retrieved from


Duke's student scholarship is made available to the public under a Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) license.