Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers
Rya Sanovar, Srikant Bharadwaj, Renee St. Amant, Victor Ruehle, Saravan Rajmohan
May 2024
Srikant Bharadwaj, Shomit Das, K. Mazumdar, Bradford M. Beckmann, Stephen Kosonocky
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems | March 2024
Srikant Bharadwaj, Guilherme Cox, Tushar Krishna, Abhishek Bhattacharjee
2018 International Symposium on Microarchitecture | October 2018