Algorithmic foundations of neural architecture search

Neural architecture search (NAS), the problem of selecting which neural model to use for a given learning problem, is a promising direction for automating and democratizing machine learning. Early NAS methods achieved impressive results on canonical image classification and language modeling problems, yet these methods were algorithmically complex and computationally very expensive. More recent heuristics relying on weight-sharing and gradient-based optimization are drastically more computationally efficient while also achieving state-of-the-art performance. However, these heuristics are also complex and poorly understood, and they have recently come under scrutiny because of inconsistent results on new benchmarks and their poor performance as a surrogate for fully trained models.

In this talk, we introduce the NAS problem and then present our work studying recent NAS heuristics from first principles. We first perform an extensive ablation study to identify the necessary components of leading NAS methods. We next introduce our geometry-aware framework called GAEA, which exploits the underlying structure of the weight-sharing NAS optimization problem to quickly find high-performance architectures. This leads to simple yet novel algorithms that enjoy faster convergence guarantees than existing gradient-based methods and achieve state-of-the-art accuracy on a wide range of leading NAS benchmarks.

Together, our theory and experiments demonstrate a principled way to co-design optimizers and continuous parameterizations of discrete NAS search spaces.
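
To make the idea of pairing an optimizer with a continuous parameterization concrete, here is a minimal sketch, not the method presented in the talk. It assumes that the choice among candidate operations on one edge is relaxed to mixture weights on the probability simplex, and it uses an exponentiated-gradient (entropic mirror descent) step as one well-known example of an update that respects that geometry. The function names, the fixed per-operation losses, and the learning rate are all hypothetical and chosen only for illustration.

```python
import math


def exponentiated_gradient_step(theta, grad, lr):
    """One multiplicative update on the simplex.

    Each weight is scaled by exp(-lr * gradient) and the result is
    renormalized, so theta stays a valid probability vector.
    """
    unnormalized = [t * math.exp(-lr * g) for t, g in zip(theta, grad)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def discretize(theta):
    """Recover a discrete choice: the index of the largest mixture weight."""
    return max(range(len(theta)), key=lambda i: theta[i])


def run_toy_search(op_losses, steps=50, lr=0.5):
    """Toy search over candidate operations with fixed (hypothetical) losses.

    The relaxed objective is the mixture loss sum_i theta_i * op_losses[i],
    whose gradient with respect to theta is simply op_losses.
    """
    n = len(op_losses)
    theta = [1.0 / n] * n  # start from the uniform mixture
    for _ in range(steps):
        theta = exponentiated_gradient_step(theta, op_losses, lr)
    return theta


if __name__ == "__main__":
    # Hypothetical validation losses for three candidate operations.
    theta = run_toy_search([0.9, 0.2, 0.5])
    print("mixture weights:", theta)
    print("selected operation index:", discretize(theta))
```

In this toy setting the weights concentrate on the operation with the lowest loss, and the final architecture is read off by taking the largest weight. A real weight-sharing search would instead compute the gradient from a shared network's validation loss; that part is omitted here.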

Speaker Bios

Ameet Talwalkar is an assistant professor in the machine learning department at Carnegie Mellon University and is also co-founder and chief scientist at Determined AI. His interests are in the field of statistical machine learning. His current work is motivated by the goal of democratizing machine learning and focuses on topics related to scalability, automation, fairness, and interpretability of learning algorithms and systems.