SATYAM: Democratizing Groundtruth for Machine Vision
The democratization of machine learning (ML) has led to ML-based machine vision systems for autonomous
driving, traffic monitoring, and video surveillance. However, true democratization cannot be achieved without
greatly simplifying the process of collecting groundtruth for training and testing these systems. This groundtruth
collection is necessary to ensure good performance under varying conditions. In this paper, we present the
design and evaluation of Satyam, a first-of-its-kind system that enables a layperson to launch groundtruth
collection tasks for machine vision with minimal effort. Satyam leverages a crowdtasking platform, Amazon
Mechanical Turk, and automates several challenging aspects of groundtruth collection: creating and launching
custom web-UI tasks for obtaining the desired groundtruth, controlling result quality in the face of spammers
and untrained workers, adapting prices to match task complexity, filtering spammers and workers with poor
performance, and processing worker payments. We validate Satyam using several popular benchmark vision data
sets and demonstrate that groundtruth obtained by Satyam is comparable to that obtained from trained experts
and yields matching ML performance when used for training.