Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer
Jinghan Yao, Sam Ade Jacobs, Masahiro Tanaka, Olatunji Ruwase, A. Shafi, H. Subramoni, Dhabaleswar K. Panda
August 2024
Xinyu Lian, Sam Ade Jacobs, Lev Kurilenko, Masahiro Tanaka, Stas Bekman, Olatunji Ruwase, Minjia Zhang
June 2024
Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, L. Song, Samyam Rajbhandari, Yuxiong He
October 2023