Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization
- Hadrien Hendrikx
- Lin Xiao
- Sébastien Bubeck
- Francis Bach
- Laurent Massoulié
MSR-TR-2020-5 | Published by Microsoft Research
We consider the setting of distributed empirical risk minimization where multiple machines compute the gradients in parallel and a centralized server updates the model parameters. In order to reduce the number of communications required to reach a given accuracy, we propose a preconditioned accelerated gradient method where the preconditioning is done by solving a local optimization problem over a subsampled dataset at the server. The convergence rate of the method depends on the square root of the relative condition number between the global and local loss functions. We estimate the relative condition number for linear prediction models by studying uniform concentration of the Hessians over a bounded domain, which allows us to derive improved convergence rates for existing preconditioned gradient methods and our accelerated method. Experiments on real-world datasets illustrate the benefits of acceleration in the ill-conditioned regime.
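To make the preconditioning idea concrete, here is a minimal sketch of the basic (non-accelerated) preconditioned gradient step on a synthetic ridge regression problem. It is an illustration under stated assumptions, not the method or code from the report: the function names, the synthetic data, and the parameter values (`mu`, `eta`, `n_local`) are all hypothetical. For a quadratic loss, the server's local optimization problem, minimizing $\langle \nabla F(x_t), x\rangle + \frac{1}{\eta} D_\phi(x, x_t)$ with $\phi$ the regularized loss on the server's subsample, has a closed form: a linear solve with the subsampled Hessian.

```python
import numpy as np


def make_ridge_data(n=2000, d=10, noise=0.1, seed=0):
    """Synthetic i.i.d. regression data; feature scales run from 1.0 down to 0.1."""
    rng = np.random.default_rng(seed)
    scales = np.linspace(1.0, 0.1, d)
    A = rng.standard_normal((n, d)) * scales
    b = A @ rng.standard_normal(d) + noise * rng.standard_normal(n)
    return A, b


def distributed_gradient(shards, lam, x):
    """Each worker computes the gradient on its shard; the server sums and normalizes."""
    n_total = sum(len(b_k) for _, b_k in shards)
    data_grad = sum(A_k.T @ (A_k @ x - b_k) for A_k, b_k in shards) / n_total
    return data_grad + lam * x


def preconditioned_gd(A, b, lam=1e-3, n_workers=4, n_local=500,
                      mu=1e-2, eta=1.0, rounds=50):
    """Gradient descent preconditioned by the Hessian of a server-side subsample.

    Assumption: the rows are i.i.d., so the first n_local rows stand in for a
    uniform subsample held by the server.
    """
    n, d = A.shape
    shards = list(zip(np.array_split(A, n_workers), np.array_split(b, n_workers)))
    A_loc = A[:n_local]
    # Hessian of the local regularized loss plus an extra mu * I safety term.
    P = A_loc.T @ A_loc / n_local + (lam + mu) * np.eye(d)
    x = np.zeros(d)
    for _ in range(rounds):
        g = distributed_gradient(shards, lam, x)  # one communication round
        x = x - eta * np.linalg.solve(P, g)       # closed-form local subproblem
    return x


def ridge_solution(A, b, lam=1e-3):
    """Exact minimizer of the global ridge objective, for comparison only."""
    n, d = A.shape
    return np.linalg.solve(A.T @ A / n + lam * np.eye(d), A.T @ b / n)
```

Comparing `preconditioned_gd(A, b)` against `ridge_solution(A, b)` shows how close the iterate gets in a fixed number of communication rounds. The extra `mu` term guards against the subsampled Hessian underestimating the global one in some direction; a larger subsample allows a smaller `mu` and hence a better-conditioned iteration.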
Statistical Preconditioning for Distributed Optimization | JRC Workshop 2021
Artificial Intelligence (AI) | 20 May 2021
Speaker: Hadrien Hendrikx, INRIA (collaboration with Francis Bach, Laurent Massoulié, INRIA and Sébastien Bubeck, Microsoft)
This virtual event brought together the PhD students and postdocs working on collaborative research engagements with Microsoft via the Swiss Joint Research Center, Mixed Reality & AI Zurich Lab, Mixed Reality & AI Cambridge Lab, and Inria Joint Center, along with their academic and Microsoft supervisors and the wider research community. The event continued the tradition of the annual Swiss JRC Workshops. PhD students and postdocs presented project updates and discussed their research with their supervisors and other attendees. In addition, Microsoft speakers provided updates on relevant Microsoft projects and initiatives. There were four sessions organized by research theme: Computer Vision, Systems, and…