Austin Xu

Title: HandsOff: Labeled Dataset Generation with No Additional Human Annotations

Time: Friday, April 14th, 3:00 PM
Location: CSIP library (room 5126), 5th floor, Centergy One building

Bio: Austin Xu is a fourth-year ECE PhD student advised by Dr. Mark Davenport. His work focuses on human data elicitation, specifically how and when humans should be queried to provide feedback, from both empirical and theoretical perspectives. He was an Applied Scientist Intern at Amazon during summer 2022 and will be joining Duolingo as an AI Research Intern for summer 2023. Prior to starting his PhD at Georgia Tech, he received his BSE in Electrical Engineering from the University of Michigan.

Abstract: Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels after being trained on fewer than 50 pre-existing labeled images. Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation. We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes. Our method achieves state-of-the-art performance in semantic segmentation, keypoint detection, and depth estimation compared to prior dataset generation approaches and transfer learning baselines. We additionally showcase its ability to address broad challenges in model development that stem from fixed, hand-annotated datasets, such as the long-tail problem in semantic segmentation.