Microsoft Research Blog
| Han Hu and Baining Guo
Early last year, our research team from the Visual Computing Group introduced Swin Transformer, a Transformer-based general-purpose computer vision architecture that for the first time beat convolutional neural networks on the important…
Unlocking new dimensions in image-generation research with Manifold Matching via Metric Learning
| Mengyu Dai and Junwon Park
Generative image models offer a unique value by creating new images. Such images can be sharp super-resolution versions of existing images or even realistic-looking synthetic photographs. Generative Adversarial Networks (GANs) and their variants have demonstrated pioneering success with the framework…
| Yale Song
The natural association between visual observations and their corresponding sounds provides a powerful self-supervision signal for learning video representations, which makes the ever-growing amount of online video an attractive data source for self-supervised learning. However, online…
| Daniela Massiceti, Cecily Morrison, Katja Hofmann, and Ed Cutrell
Object recognition systems have made spectacular advances in recent years, but they rely on training datasets with thousands of high-quality, labelled examples per object category. Learning new objects from only a few examples could open the door to many new…