
Deep Learning Group

Downloads

ReinMax

September 2023

Bridging Discrete and Backpropagation: Straight-Through and Beyond. Guided by our findings, we propose a novel method called ReinMax, which integrates Heun’s Method, a second-order numerical method for solving ODEs, to approximate the gradient. ReinMax achieves second-order accuracy without requiring…

Github

imodelsX

November 2022

Scikit-learn-friendly library to interpret and prompt-engineer text datasets using large language models.

Github

Admin-Torch

April 2022

Here, we provide a plug-and-play implementation of Admin, which stabilizes previously diverged Transformer training and achieves better performance without introducing additional hyper-parameters. The design of Admin is half-precision friendly and can be reparameterized into the original Transformer.

Github

LiST (Lite Self-Training)

October 2021

We present a new method, LiST, for efficient fine-tuning of large pre-trained language models (PLMs) in few-shot learning settings. LiST significantly improves over recent methods that adopt prompt fine-tuning by using two key techniques. The first one is the use of…

Github

Stochastic Mixture-of-Experts

October 2021

This PyTorch package implements the method from the paper “Taming Sparsely Activated Transformer with Stochastic Experts”.

Github

Focal Transformer

August 2021

This is the codebase for our recently released paper “Focal Self-attention for Local-Global Interactions in Vision Transformers”. The paper develops a new sparse self-attention mechanism, called focal self-attention, for more effective and efficient vision transformers. The goal is to release the…

Github

Microsoft KaggleDBQA Dataset: Realistic Evaluation of Text-to-SQL Parsers

July 2021

Microsoft KaggleDBQA is a cross-domain, complex evaluation dataset of real Web databases, with domain-specific data types, original formatting, and unrestricted questions. It also provides database documentation, which contains rich in-domain knowledge. The nature of obscure and abbreviated column/table names…

Download

Efficient Self-Supervised Vision Transformers (EsViT)

July 2021

This is a research project exploring self-supervised learning (SSL) for computer vision. It aims to learn general-purpose image features from raw pixels without relying on manual supervision, and the learned networks serve as the backbone for various downstream tasks.…

Github

SOLOIST

June 2021

This repository showcases building task-oriented bots at scale with a handful of examples by fine-tuning a pretrained model using the SOLOIST framework. It contains the dataset, source code, and pre-trained model for the following paper: SOLOIST: Building Task Bots at Scale with Transfer…

Github