Synthetic Data: A Scalable Way to Train Perception Systems (GTC 2020)

June 12, 2020

Check out this talk from Danny Lange, Unity’s VP of AI and Machine Learning, delivered at GTC 2020.

What’s covered in this talk

Visual understanding is a key component of a growing number of automated systems. This visual understanding extends well beyond simple object recognition and into complex areas like semantic segmentation, image classification, and object detection. With one glance at an image, humans can effortlessly imagine the world beyond the pixels. This is a tremendously difficult task for today’s vision systems, because it requires higher-order cognition of the world around them.

Machine-learned models in industrial applications have to perform well under a variety of realistic conditions such as poor lighting, optical occlusions, and object deformities. For example, a trash-picking robot is expected to identify objects that may be found in various spatial orientations and may have undergone random deformations. We have state-of-the-art deep learning techniques (supervised and reinforcement learning) to solve some of these challenges.

However, these deep learning techniques rely on expensive, hand-annotated data collected from the real world for model training. Rather than building large-scale data collection systems or making learning more efficient, we can take advantage of game engines to generate synthetic data, which is a flexible and affordable alternative to real-world data.

For purposes of illustration, we will take the task of identifying and labeling multiple grocery items in an image. We will apply domain randomization techniques and demonstrate how you can use Unity Simulation, a scalable cloud simulation platform, to run many instances of Unity in parallel and generate varied synthetic datasets of these grocery items to train an R-CNN model. We will show how this machine learning model trained on synthetic data will outperform the model trained on real-world data. We will go a step further by exploring the impact that dataset scale has on model performance and present the economics of scaling dataset generation on Unity Simulation across many instances in the cloud.
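
To make the idea of domain randomization more concrete, here is a minimal Python sketch of how randomized scene descriptions could be sampled and split across parallel workers. It is not the Unity Simulation API and not the code used in the talk: the item names, parameter ranges, and the `sample_scene` and `shard_seeds` helpers are all hypothetical and chosen only for illustration. A renderer such as a Unity instance would consume each scene description and emit an image plus its labels.

```python
import random
from dataclasses import dataclass

# Hypothetical item names, used only for illustration.
GROCERY_ITEMS = ["cereal_box", "soup_can", "milk_carton", "pasta_bag"]


@dataclass
class ItemPlacement:
    name: str
    position: tuple      # (x, y) in normalized image-plane units, 0..1
    rotation_deg: tuple  # three Euler angles in degrees
    scale: float


@dataclass
class SceneConfig:
    seed: int
    light_intensity: float
    light_hue: float
    background_id: int
    items: list


def sample_scene(seed, max_items=5, num_backgrounds=20):
    """Sample one randomized scene description from a seed.

    The ranges below are assumptions, not values from the talk.
    Using a per-scene seed makes every scene reproducible.
    """
    rng = random.Random(seed)
    n_items = rng.randint(1, max_items)
    items = [
        ItemPlacement(
            name=rng.choice(GROCERY_ITEMS),
            position=(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)),
            rotation_deg=tuple(rng.uniform(0.0, 360.0) for _ in range(3)),
            scale=rng.uniform(0.5, 1.5),
        )
        for _ in range(n_items)
    ]
    return SceneConfig(
        seed=seed,
        light_intensity=rng.uniform(0.2, 2.0),
        light_hue=rng.uniform(0.0, 1.0),
        background_id=rng.randrange(num_backgrounds),
        items=items,
    )


def shard_seeds(total_scenes, num_instances):
    """Assign scene seeds to instances round-robin, one list per instance."""
    return [list(range(i, total_scenes, num_instances))
            for i in range(num_instances)]


if __name__ == "__main__":
    # Each parallel instance would render only the scenes in its own shard.
    shards = shard_seeds(total_scenes=10, num_instances=3)
    for instance_id, seeds in enumerate(shards):
        scenes = [sample_scene(s) for s in seeds]
        print(instance_id, [len(scene.items) for scene in scenes])
```

Because every scene is derived from its own seed, instances need no coordination beyond knowing which seeds they own, and a larger dataset is simply a larger seed range spread over more instances.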
