Deep Learning Group

Downloads

ReinMax

September 2023

Bridging Discrete and Backpropagation: Straight-Through and Beyond. Guided by our findings, we propose a novel method called ReinMax, which integrates Heun’s method, a second-order numerical method for solving ODEs, to approximate the gradient. ReinMax achieves second-order accuracy without requiring…

Github

imodelsX

November 2022

Scikit-learn-friendly library to interpret and prompt-engineer text datasets using large language models.

Github

Admin-Torch

April 2022

Here, we provide a plug-and-play implementation of Admin, which stabilizes previously diverged Transformer training and achieves better performance without introducing additional hyper-parameters. The design of Admin is half-precision friendly and can be reparameterized into the original Transformer.

Github

LiST (Lite Self-Training)

October 2021

We present LiST, a new method for efficient fine-tuning of large pre-trained language models (PLMs) in few-shot learning settings. LiST significantly improves over recent methods that adopt prompt fine-tuning by using two key techniques. The first one is the use of…

Github

Stochastic Mixture-of-Experts

October 2021

This PyTorch package implements the paper “Taming Sparsely Activated Transformer with Stochastic Experts”.

Github

Focal Transformer

August 2021

This is the codebase for our recently released paper “Focal Self-attention for Local-Global Interactions in Vision Transformers”. It develops a new sparse self-attention mechanism, called focal self-attention, for more effective and efficient vision transformers. The goal is to release the…

Github

Microsoft KaggleDBQA Dataset: Realistic Evaluation of Text-to-SQL Parsers

July 2021

Microsoft KaggleDBQA is a cross-domain and complex evaluation dataset of real Web databases, with domain-specific data types, original formatting, and unrestricted questions. It also provides database documentation, which contains rich in-domain knowledge. The nature of obscure and abbreviated column/table names…

Download

Efficient Self-Supervised Vision Transformers (EsViT)

July 2021

This is a research project exploring self-supervised learning (SSL) for computer vision. It aims to learn general-purpose image features from raw pixels without relying on manual supervision, and the learned networks serve as the backbone of various downstream tasks.…

Github

SOLOIST

June 2021

This repository showcases building task-oriented bots at scale with a handful of examples by fine-tuning a pretrained model using the SOLOIST framework, and contains the dataset, source code, and pre-trained model for the following paper: SOLOIST: Building Task Bots at Scale with Transfer…

Github