News & features
In the news | VentureBeat
Microsoft’s Differential Transformer cancels attention noise in LLMs
Improving LLMs’ ability to retrieve in-prompt information can impact important applications such as retrieval-augmented generation (RAG) and in-context learning (ICL). Microsoft Research and Tsinghua University researchers have introduced Differential Transformer (Diff Transformer), a new LLM architecture that amplifies attention to relevant context while…
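To make the "canceling attention noise" idea concrete, here is a minimal sketch of the differential attention mechanism described in the Diff Transformer paper: two softmax attention maps are computed from separate query/key projections, and one is subtracted from the other (scaled by a learnable scalar λ) so that noise common to both maps cancels out. The function and tensor names below are illustrative, not taken from the authors' released code.

```python
import torch
import torch.nn.functional as F

def diff_attention(q1, k1, q2, k2, v, lam):
    """Differential attention sketch: the difference of two softmax maps.

    q1/k1 and q2/k2 come from two separate query/key projections of the
    same input; subtracting the second map (scaled by the learnable
    scalar `lam`) cancels attention noise shared by both, sharpening
    attention on relevant context. Shapes: (batch, seq, d_head) for the
    queries/keys, (batch, seq, d_v) for the values.
    """
    d = q1.size(-1)
    a1 = F.softmax(q1 @ k1.transpose(-2, -1) / d**0.5, dim=-1)
    a2 = F.softmax(q2 @ k2.transpose(-2, -1) / d**0.5, dim=-1)
    return (a1 - lam * a2) @ v
```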