Downloads
ReinMax
September 2023
Bridging Discrete and Backpropagation: Straight-Through and Beyond. Guided by our findings, we propose a novel method called ReinMax, which integrates Heun’s Method, a second-order numerical method for solving ODEs, to approximate the gradient. ReinMax achieves second-order accuracy without requiring…
imodelsX
November 2022
Scikit-learn-friendly library to interpret and prompt-engineer text datasets using large language models.
Admin-Torch
April 2022
Here, we provide a plug-and-play implementation of Admin, which stabilizes previously diverged Transformer training and achieves better performance without introducing additional hyper-parameters. The design of Admin is half-precision friendly and can be reparameterized into the original Transformer.
LiST (Lite Self-Training)
October 2021
We present LiST, a new method for efficient fine-tuning of large pre-trained language models (PLMs) in few-shot learning settings. LiST significantly improves over recent methods that adopt prompt fine-tuning using two key techniques. The first one is the use of…
Stochastic Mixture-of-Experts
October 2021
This PyTorch package implements the paper “Taming Sparsely Activated Transformer with Stochastic Experts”.
Focal Transformer
August 2021
This is a codebase for our recently released paper “Focal Self-attention for Local-Global Interactions in Vision Transformers”. It develops a new sparse self-attention mechanism, called focal self-attention, for more effective and efficient vision transformers. The goal is to release the…
Microsoft KaggleDBQA Dataset: Realistic Evaluation of Text-to-SQL Parsers
July 2021
Microsoft KaggleDBQA is a cross-domain and complex evaluation dataset of real Web databases, with domain-specific data types, original formatting, and unrestricted questions. It also provides database documentation, which contains rich in-domain knowledge. The nature of obscure and abbreviated column/table names…
Efficient Self-Supervised Vision Transformers (EsViT)
July 2021
This is a research project exploring self-supervised learning (SSL) for computer vision. It aims to learn general-purpose image features from raw pixels without relying on manual supervision, and the learned networks serve as the backbone of various downstream tasks.…
SOLOIST
June 2021
This repository showcases building task-oriented bots at scale with a handful of examples by fine-tuning a pretrained model using the SOLOIST framework, and contains the dataset, source code, and pre-trained model for the following paper: SOLOIST: Building Task Bots at Scale with Transfer…