Synthetic Data for Object Detection: Improving Robustness and Revealing Vulnerabilities

May 2026

Authors:

Xiao Fang

Abstract:

Synthetic data has emerged as a promising solution to the growing challenges of data acquisition in object detection. Modern detectors rely heavily on large-scale annotated datasets, yet collecting real-world data with high-quality labels is often costly, labor-intensive, and impractical in diverse or rare scenarios. By enabling controllable generation with automatic annotations, synthetic data provides a scalable alternative. In this thesis, we investigate its two complementary roles: improving robustness under domain shifts and revealing fundamental vulnerabilities of object detectors.

First, we propose a synthetic data generation framework based on diffusion models to bridge distribution gaps between source and target domains in aerial imagery. By synthesizing high-quality images and corresponding annotations through cross-attention-guided labeling and multi-stage knowledge transfer, our approach significantly improves detection robustness in unseen environments, outperforming supervised learning on source domain data, weakly supervised and unsupervised domain adaptation methods, open-set object detectors, and vision large language models.

Second, we explore the adversarial potential of synthetic data via a controllable image-editing framework for realistic camouflage attacks. By formulating camouflaged adversarial example generation as a conditional image-editing problem, we design image-level and scene-level strategies that produce stealthy, physically plausible camouflages while effectively degrading detector performance. Extensive experiments demonstrate strong attack effectiveness, improved human-perceived stealthiness, and transferability both to black-box models and to the physical world.

Together, these complementary perspectives highlight the dual utility of synthetic data for object detection: as a powerful tool for improving robustness under domain shifts and as a principled lens for uncovering model vulnerabilities.

Notes:

@mastersthesis{Fang-2026-88275,
author = {Xiao Fang},
title = {Synthetic Data for Object Detection: Improving Robustness and Revealing Vulnerabilities},
year = {2026},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-26-37},
keywords = {Synthetic data, Object detection},
}