Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models
- Tianyi Tang,
- Wenyang Luo,
- Haoyang Huang,
- Dongdong Zhang,
- Xiaolei Wang,
- Xin Zhao,
- Furu Wei,
- Ji-Rong Wen
ACL 2024
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora. It remains a challenging problem to explain the underlying mechanisms by which LLMs process multilingual texts. In this paper, we delve into the composition of Transformer architectures in LLMs to pinpoint language-specific regions. Specifically, we propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs. Based on LAPE, we conduct comprehensive experiments on several representative LLMs, such as LLaMA-2, BLOOM, and Mistral. Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons, primarily situated in the models' top and bottom layers. Furthermore, we showcase the feasibility of "steering" the output language of LLMs by selectively activating or deactivating language-specific neurons. Our research provides important evidence for understanding and exploring the multilingual capabilities of LLMs.
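The abstract does not spell out the LAPE formula, but the underlying idea, scoring each neuron by the entropy of its activation probabilities across languages and treating low-entropy neurons as language-specific, can be sketched as follows. This is a minimal illustration, assuming per-language activation probabilities have already been estimated (e.g., the fraction of tokens in each language's corpus on which a neuron's activation is positive); the function name and the 1% selection cutoff are hypothetical and not the paper's exact procedure.

```python
import torch


def language_activation_probability_entropy(act_probs: torch.Tensor) -> torch.Tensor:
    """Sketch of a LAPE-style score (hypothetical implementation).

    act_probs: tensor of shape (num_neurons, num_languages) whose entry
    (i, j) is the empirical probability that neuron i is activated
    (e.g., has a positive activation) on a corpus of language j.
    """
    # Normalize each neuron's activation probabilities over languages
    # so that they form a distribution.
    dist = act_probs / act_probs.sum(dim=-1, keepdim=True).clamp_min(1e-12)
    # Entropy over languages: a low value means the neuron fires
    # predominantly for one (or a few) languages.
    return -(dist * dist.clamp_min(1e-12).log()).sum(dim=-1)


if __name__ == "__main__":
    # Hypothetical numbers: 4096 neurons in one layer, 6 languages.
    probs = torch.rand(4096, 6)
    lape = language_activation_probability_entropy(probs)
    # Keep the lowest-entropy 1% as candidate language-specific neurons
    # (the threshold here is purely illustrative).
    k = int(0.01 * probs.shape[0])
    candidate_ids = torch.topk(-lape, k).indices
```

Under these assumptions, the selected low-entropy neurons could then be activated or deactivated at inference time (for example, by zeroing or amplifying their activations with forward hooks) to probe the "steering" effect described in the abstract.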