Powering Multi-Task Federated Learning with Competitive GPU Resource Sharing
- Yongbo Yu,
- Fuxun Yu,
- Zirui Xu,
- Di Wang,
- Minjia Zhang,
- Ang Li,
- Shawn Bray,
- Chenchen Liu,
- Xiang Chen
Federated learning (FL) has been applied to train models for many different tasks, which poses new computation challenges, especially when a single device must train multiple tasks concurrently. In this paper, we first profile the FL multi-task training process at the operator level to identify the problems that arise in FL multi-task training. Second, we propose a Competitive GPU Resource Sharing method that efficiently partitions GPU resources among concurrent models to improve training efficiency. Third, to address the imbalanced-data problem in multi-device FL training, we partition GPU resources according to the workload of each model. Experiments show that our method achieves a 2.1× speedup.
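The abstract does not describe the partitioning mechanism itself. Below is a minimal sketch of workload-proportional GPU partitioning, assuming NVIDIA MPS is running and using its `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` environment variable as the SM-partitioning knob; the function names (`train_task`, `partition_by_workload`), the stand-in model, and the workload numbers are all hypothetical illustrations, not the paper's implementation.

```python
import os
import multiprocessing as mp

def train_task(task_id: int, thread_pct: int) -> None:
    # Hypothetical per-task training worker. With the NVIDIA MPS daemon
    # running, CUDA_MPS_ACTIVE_THREAD_PERCENTAGE caps the fraction of SMs
    # this process's CUDA context may occupy. It must be set before the
    # CUDA context is created, hence the deferred torch import.
    os.environ["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(thread_pct)
    import torch

    model = torch.nn.Linear(1024, 1024).cuda()  # stand-in for a real FL model
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(100):  # stand-in training loop
        x = torch.randn(64, 1024, device="cuda")
        loss = model(x).square().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

def partition_by_workload(workloads: list[float]) -> list[int]:
    # Allocate SM shares proportional to each task's estimated workload
    # (e.g., per-round FLOPs or a measured step time).
    total = sum(workloads)
    return [max(1, round(100 * w / total)) for w in workloads]

if __name__ == "__main__":
    mp.set_start_method("spawn")
    # Example: three FL tasks with imbalanced (hypothetical) workloads.
    workloads = [3.0, 1.0, 2.0]
    shares = partition_by_workload(workloads)  # -> [50, 17, 33]
    procs = [mp.Process(target=train_task, args=(i, s))
             for i, s in enumerate(shares)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

One process per task keeps each model's CUDA context separate, so the MPS cap applies per task; a streams-based design within one process would be an alternative if context overhead matters.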