Downloads
ReinMax
September 2023
Bridging Discrete and Backpropagation: Straight-Through and Beyond. Guided by our findings, we propose a novel method called ReinMax, which integrates Heun’s method, a second-order numerical method for solving ODEs, to approximate the gradient. ReinMax achieves second-order accuracy without requiring…
imodelsX
November 2022
Scikit-learn-friendly library to interpret and prompt-engineer text datasets using large language models.
Admin-Torch
April 2022
Here, we provide a plug-and-play implementation of Admin, which stabilizes previously diverged Transformer training and achieves better performance without introducing additional hyper-parameters. The design of Admin is half-precision friendly and can be reparameterized into the original Transformer.
LiST (Lite Self-Training)
October 2021
We present a new method, LiST, for efficient fine-tuning of large pre-trained language models (PLMs) in few-shot learning settings. LiST significantly improves over recent methods that adopt prompt fine-tuning using two key techniques. The first one is the use of…
Stochastic Mixture-of-Experts
October 2021
This PyTorch package implements the paper “Taming Sparsely Activated Transformer with Stochastic Experts”.
Focal Transformer
August 2021
This is a codebase for our recently released paper “Focal Self-attention for Local-Global Interactions in Vision Transformers”. The paper develops a new sparse self-attention mechanism, called focal self-attention, for more effective and efficient vision transformers. The goal is to release the…
Microsoft KaggleDBQA Dataset: Realistic Evaluation of Text-to-SQL Parsers
July 2021
Microsoft KaggleDBQA is a cross-domain and complex evaluation dataset of real Web databases, with domain-specific data types, original formatting, and unrestricted questions. It also provides database documentation, which contains rich in-domain knowledge. The nature of obscure and abbreviated column/table names…
Efficient Self-Supervised Vision Transformers (EsViT)
July 2021
This is a research project exploring self-supervised learning (SSL) for computer vision. It aims to learn general-purpose image features from raw pixels without relying on manual supervision, and the learned networks serve as the backbone of various downstream tasks.…
SOLOIST
June 2021
This repository showcases building task-oriented bots at scale with a handful of examples by fine-tuning a pretrained model using the SOLOIST framework, and contains the dataset, source code, and pre-trained model for the following paper: SOLOIST: Building Task Bots at Scale with Transfer…