Research at Microsoft 2021: Collaborating for real-world change


Four images highlighting: PIVOT, Ashley Llorens, farming, Race and Technology series

Over the past 30 years, Microsoft Research has undergone a shift in how it approaches innovation, broadening its mission to include not only advancing the state of computing but also using technology to tackle some of the world’s most pressing challenges. That evolution has never been more prominent than it was during this past year.

Recent events underscore the urgent need to address planet-scale problems. Fundamental advancements in science and technology have a crucial role to play in addressing ongoing societal challenges such as climate change, healthcare equity and access, supply chain logistics, sustainability, security and privacy, and the digital divide. Microsoft Research is increasing focus on these areas and others to help accelerate transformational change and build trust in technology as it evolves. However, these challenges are too large for any single organization to meet alone. They require broader and more diverse coalitions across the global science and technology community, including businesses, scholars, governments, nongovernmental organizations, and local communities.

This year, Microsoft Research hosted the first-ever Microsoft Research Summit, a virtual event that embodied our aspiration to catalyze collaboration and innovation across traditional boundaries. The summit brought together experts from around the world—a mix of speakers from Microsoft and external organizations—to critically examine the way technology can increase understanding and further drive advancement; support creativity and achievement; build a resilient, sustainable society; and open healthcare advances to all while maintaining ethical practices that put people first.

This post explores just some of the work that’s been done this year by Microsoft Research, alongside its partners and collaborators, to drive real-world impact in critical areas, and our aspirations for further impact in the years to come.

Leading the way for real-world impact

Advancing human knowledge and foundational technologies

Fundamental insights into technology and computing can inspire breakthroughs and new computing paradigms while helping to drive scientific discovery forward. In his plenary talk at Research Summit, Peter Lee, Corporate Vice President, Microsoft Research & Incubations, cited “The Usefulness of Useless Knowledge,” an essay published in Harper’s Magazine in 1939 by pioneering educator Abraham Flexner. Among other things, the essay stresses the role that curiosity and exploration play in game-changing technological leaps. It’s at this root of invention and innovation, Flexner argues, where patience and belief in shared knowledge are key.

LAMBDA, one of this year’s first big announcements, shows how the Microsoft research community can make significant contributions to products and customers when given the time and freedom to follow their curiosities. In this case, the product was Microsoft Excel—a program that has benefitted from the efforts of research teams over time. The feature, which resulted from collaboration between members of the Calc Intelligence and Excel teams, gives users the ability to define custom worksheet functions in Excel’s formula language, making the program Turing-complete, that is, allowing any computation to be written in the Excel formula language.

Podcast: Advancing Excel as a programming language with Andy Gordon and Simon Peyton Jones

To make networks in data centers more scalable to future needs, researchers in Optics for the Cloud explored how optical circuit switches could replace resource-heavy electrical switches at a network’s core. They demonstrated the system’s potential to switch between wavelengths at nanosecond speeds—a necessity for supporting low-latency networks at the scale required—using a microcomb and semiconductor optical amplifiers.

A research team moved the bar forward for DNA storage, introducing a proof-of-concept molecular controller in the form of a tiny DNA storage writing mechanism on a chip. The chip demonstrated the ability to pack DNA-synthesis spots three orders of magnitude more tightly than before, resulting in much higher DNA writing throughput than current systems.

AI at Scale continued to gain momentum in 2021. With exponential growth this year, large artificial intelligence (AI) models trained using deep learning are one example of fundamental science where applications in the real world are becoming more ubiquitous. Microsoft Research teams were recognized for advancing the state of the art and developing new multilingual capabilities to build more inclusive language technologies using AI as well as pushing the boundaries of natural language processing (NLP) and computer vision.

In June 2021, Microsoft Research’s LReasoner system set a new standard for logical reasoning ability among pretrained language models. It reached the top of the official leaderboard for ReClor, a dataset built using questions from the LSAT and GMAT, two standardized admissions tests.

Microsoft Turing’s T-ULRv5 achieved breakthrough performance on the XTREME leaderboard in September. A few weeks later, Microsoft Turing’s model T-NLRv5 reached the top of the SuperGLUE and GLUE leaderboards. Ultimately, these benchmarks and respective leaderboards help to measure progress toward creating AI that better understands language and better converses with people within and across language boundaries. To understand how quickly these advances are happening, one need only look to Megatron-Turing NLG, the language generation model with 530 billion parameters trained to convergence—a collaboration between Microsoft and NVIDIA.

Trend of sizes of state-of-the-art NLP models over time

To train the Megatron-Turing model, DeepSpeed and NVIDIA Megatron-LM paired up to create an efficient and scalable 3D-parallel system harnessing data parallelism, pipeline parallelism, and tensor slicing–based parallelism. Beyond this achievement, the DeepSpeed optimization library added a number of features and tools this year, including DeepSpeed Inference, its first foray into improving inference latency and cost using multiple graphics processing units (GPUs). The team also introduced DeepSpeed MoE, supporting five types of parallelism and training 8x larger models when compared with existing systems. ZeRO-Infinity allowed for scaling of large model training from one to thousands of GPUs, furthering the team’s effort to democratize model training for everyone.
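To make the idea of a 3D-parallel layout concrete, here is a minimal sketch of how a flat GPU rank could be assigned a position along the data, pipeline, and tensor axes. The function names, the axis ordering (tensor index varying fastest), and the example sizes are assumptions chosen for illustration; they are not taken from DeepSpeed or Megatron-LM.

```python
from typing import NamedTuple


class Coord(NamedTuple):
    data: int    # which model replica (data parallelism)
    pipe: int    # which pipeline stage (pipeline parallelism)
    tensor: int  # which shard of a layer (tensor-slicing parallelism)


def rank_to_coord(rank: int, dp: int, pp: int, tp: int) -> Coord:
    """Map a flat rank to 3D coordinates.

    Assumed layout: tensor index varies fastest, then pipeline, then data.
    """
    if not 0 <= rank < dp * pp * tp:
        raise ValueError("rank is outside the dp * pp * tp grid")
    return Coord(
        data=rank // (tp * pp),
        pipe=(rank // tp) % pp,
        tensor=rank % tp,
    )


def coord_to_rank(coord: Coord, pp: int, tp: int) -> int:
    """Inverse of rank_to_coord under the same assumed layout."""
    return (coord.data * pp + coord.pipe) * tp + coord.tensor


# Hypothetical 24-GPU job: 2 replicas x 3 pipeline stages x 4 tensor shards.
# Rank 13 lands on replica 1, stage 0, shard 1.
print(rank_to_coord(13, dp=2, pp=3, tp=4))
```

The product of the three degrees must equal the number of GPUs, which is why the three forms of parallelism can be scaled somewhat independently of one another.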

These large AI models are impressive in their own right, but it’s what they’re able to do to support people and democratize innovation that makes them especially valuable. Advances in language technologies resulted in the expansion of Microsoft translation and spelling-correction technologies into over 100 languages, breaking down language barriers in products like Microsoft Bing and Microsoft Translator. The Microsoft Turing team also introduced Turing Bletchley, a 2.5-billion-parameter Universal Image Language Representation model (T-UILR) that can perform image-language tasks in 94 languages.

Meanwhile, researchers from Microsoft Research Asia worked on bridging the gap between natural language and computer vision modeling. In October, members of the Visual Computing group won the best paper award at the International Conference on Computer Vision (ICCV) 2021 for their paper on the Swin Transformer. This vision transformer surpasses the state of the art with its high performance, flexibility, and linear complexity, making it compatible with a broad range of vision tasks. With this work, the research team hopes to inspire additional research in this area that will ultimately enable joint modeling between the computer vision and language domains. Research teams in the same lab examined the potential for transformers to find success beyond language and vision, demonstrating that the neural network architecture can be applied to graph representation learning. With their standard transformer architecture Graphormer, the researchers achieved state-of-the-art performance in the KDD Cup 2021 graph-level prediction track and topped popular graph-level prediction leaderboards.

Amplifying human creativity and achievement

People are multidimensional, pursuing goals and tasks across different areas of their lives, under a variety of circumstances. Microsoft researchers are dedicated to not only helping individuals accomplish more in their personal, professional, and creative lives, but also to helping them feel more confident doing so.

Over the past year and a half, researchers and product teams throughout Microsoft have responded swiftly to workplace challenges and opportunities arising from the pandemic. Supporting organizations in executing hybrid work models, they explored technology as an intermediary between people who are physically in the room and those who are not, with some researchers presenting their findings in a hybrid meeting prototype during Research Summit. Researchers investigated remote and hybrid work from a variety of angles—from longitudinal studies on multitasking behavior to workplace communication insights gleaned using network machine learning—to understand where technology needs to grow to help people thrive under these fluid working conditions. Previous and ongoing work in this area is captured by the New Future of Work Initiative and the annual Work Trend Index.

As ML techniques and approaches advance, so does the potential for applications to empower individuals in the workplace and beyond. Research teams are leveraging few-shot learning to help support AI that is truly more customizable to the individual with the ORBIT dataset and benchmark. The dataset and benchmark are inspired by a real-world application for people who are blind or have low vision called teachable object recognizers. The dataset strives to reflect the variance within object types and input quality that recognition systems will encounter in the day-to-day, while the benchmark challenges models to identify objects for single users from a few, high-variation examples.

Earlier in the year, at the CHI 2021 Conference on Human Factors in Computing Systems, researchers presented tools and learnings guided by a changing definition of accessibility, one that focuses on helping individuals rise above limitations imposed by a world built to accommodate the majority to realize their full capabilities. Also, members of the Enable Group explored the continuing evolution of Soundscape, an app that uses 3D spatial audio to elevate users’ perception of an environment they’re navigating.

Technology has the ability to empower not only at an individual level but also at a systemic level, emboldening people with tools and support to pursue and achieve large-scale positive change in their communities and society as a whole.

Microsoft Research India’s Center for Societal Impact through Cloud and Artificial Intelligence (SCAI) was established to extend the lab’s research and technologies to create impact across domains such as healthcare and sustainability. SCAI collaborates with social enterprises to augment their ability to make a difference, as has been the case with Respirer Living Sciences. This year saw SCAI and the climate science startup integrate Microsoft Research India’s Dependable IoT solution with Respirer’s PM2.5 sensors for monitoring kits that are providing real-time air-quality data that is easier to access and understand. Project Amplify, a collaboration between the India lab, Microsoft for Startups, and Accenture, serves as another channel for SCAI to share its resources, helping startups committed to addressing societal and sustainability challenges. In its first year (2020–2021), the initiative supported work in aquaculture, agriculture, and mental health.

Meanwhile, the Research for Industry (RFI) initiative connects researchers and industry partners to help a variety of domains, from retail and financial services to energy and entertainment, operate in a dynamic world. The value of such collaboration can be seen in the work already being done in agriculture, where individual farmers are able to preview a suite of technologies developed to bring together low-bandwidth wireless technology, micro-climate prediction using deep learning, and data analysis in Microsoft Azure to improve crop returns and sustainability.

Fostering a resilient and sustainable society

In May 2021, Microsoft Research introduced a new societal resilience research agenda. Inspired in part by the rapid development of COVID-19 vaccines and a rising tide of global challenges, it advocates a “reset” between science and society. As we continue to pursue foundational academic advances, we must also accelerate research that addresses societal challenges.

Societal Resilience deploys open, adaptable technologies to enable community-oriented, collective problem solving. It drives new tools to help domain experts translate real-world data into evidence (see figure below). This means collaborating across traditional boundaries and engaging people who live where the challenges exist. A new video series looks more closely at the changing nature of innovation and discovering new ways to build resilience.

The Synthetic Data Showcase tool helps nontechnical domain experts use data to respond to human trafficking and exploitation. This tool, also demoed at Research Summit, uses Power BI to support the CTDC global dataset on victims of trafficking while safeguarding victims’ privacy. Microsoft is a founding member of Tech Against Trafficking (TAT), a coalition fighting human trafficking with technology.

chart: CTDC global dataset on victims of trafficking
In this example, we use Power BI to support privacy-preserving exploration of the anonymous datasets generated by our Synthetic Data Showcase tool. Having selected the records of victims in the age range 9–17, we can see the distributions of multiple additional attributes contained in these records: the year the victim was registered, gender, country of citizenship and exploitation, and type of labor or sexual exploitation. All of the counts in these distributions are dynamically generated by Power BI filtering and aggregating records of the synthetic dataset. These “estimated” counts are compared on the right with “actual” counts precomputed over the sensitive data, showing that the synthetic dataset accurately captures the structure of the sensitive data for the selected age range. For these victims aged 9–17, the association with “typeOfLabourOther” indicates a potential need to expand the data schema to support more targeted policy design tackling forced labor of children.

Environmental sustainability is a key focus area for Microsoft Research. We support our customers’ efforts to reduce carbon emissions, including our work with Project Zerix, which combines biotechnology, chemistry, and materials science with computer science and engineering to develop more environmentally sustainable materials for the IT industry. 

Project Eclipse is an example of local collaboration toward building sustainable and resilient cities. It’s the largest real-time, hyperlocal air-quality sensing network in a North American city. The Urban Innovation team worked with local partners in Chicago to deploy over 100 low-cost air pollution sensors across the city. The team provided several updates this year, including a new video showing how the system works, plus a demo and related presentation at Research Summit. 

Supporting a healthy global society

Technology is driving amazing progress in human health, exemplified by the unprecedented development of testing, vaccines, and treatments for COVID-19. It’s important to make those treatments and therapies available as broadly as possible. Microsoft Research supports inclusive and equitable technologies and studies that improve scientific discovery, along with better, more equitable delivery of health care to people everywhere.

In 2021, as the pandemic raged, it hit some populations harder than others, including people with limited access to or experience with technology. To lower barriers to health information, Microsoft Research developed the COVID-19 Vaccine Eligibility Bot to help people understand their eligibility to receive COVID-19 vaccinations. The bot is accessible across a range of communication channels and in local languages to serve non-English speakers. Separately, a new partnership including Broad Institute of MIT and Harvard, Verily, and Microsoft will provide cloud, data and AI technology, and access to its global network of more than 168,000 health and life science partners, to help researchers interpret an unprecedented amount of biomedical data to advance the treatment of human diseases through the open-source platform Terra.

Microsoft Research continued to research and develop new technologies to improve healthcare and access for all. Two examples from 2021 include:

  • As with any health-related technology, it’s important that new medical AI applications adhere to privacy regulations that safeguard sensitive data. Microsoft Research India developed a framework for secure, privacy-preserving, and AI-enabled medical imaging inference using CrypTFlow2, a state-of-the-art end-to-end compiler allowing cryptographically secure two-party computation (2PC) protocols. CrypTFlow2 may allow developers without cryptography experience to build efficient and scalable multi-party computation (MPC) protocols for inference tasks, dramatically improving health providers’ ability to process and analyze sensitive data while respecting privacy.
  • Deep learning and open-source strategies can improve cancer radiotherapy workflows and care. But the learning models are not easily accessible to researchers and care providers. This webinar provides an update on Project InnerEye, which aims to democratize AI for medical image analysis and empower health professionals to build medical imaging AI models.

Project InnerEye: person reviewing scans on a monitor

Ensuring that technology is trustworthy and beneficial to everyone

Fully realizing the value of technology requires the trust of those it’s intended to help. And trust is earned, which is why developing AI responsibly is a key tenet of the Microsoft mission. Researchers at the company are guided by the principles of fairness, reliability and safety, inclusiveness, and transparency, among others, in their pursuit of advancement, and they build and share resources and tools to incorporate those principles into research and development. Those tools include the Responsible AI (RAI) Toolbox and the Human AI eXperience (HAX) Toolkit. Combining error analysis, model interpretability, fairness, counterfactual example analysis, and causal analysis tools, the RAI Toolbox provides practitioners with the means to understand model behavior, identify and address issues, and support real-world decision-making, while HAX provides actionable resources for prioritizing the safety and needs of people throughout the development of human-AI experiences.

Understanding the benefits and harms of language models has been of particular interest to the broader research community. In discussing the topic at Research Summit, researchers explored limitations of current task framing in building tech that meets real needs and called for interdisciplinary methods to more effectively study real-world impact. Microsoft researchers also covered a variety of other topics related to cultivating trust in AI systems, including executing responsible AI and identifying, assessing, and mitigating harms in AI systems, as part of the Microsoft Research Webinar series. Microsoft Research also launched a monthly lecture series examining the relationship between technology and the perception of race and its ramifications (for more information, see the section below).

RESTler—the first stateful REST API fuzzer—can help efficiently find security and reliability bugs in cloud services. RESTler analyzes a Swagger/OpenAPI specification and produces a fuzzing grammar that contains information about requests and their dependencies. RESTler only fuzzes a request if all its dependent resources have been successfully created—this enables RESTler to achieve deeper coverage out of the box. RESTler also offers a pluggable model for checking security properties. RESTler is open source and available at its GitHub repository.
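The dependency rule described above can be illustrated with a small sketch. This is not RESTler’s code or API; the `Request` type, the `schedule` function, and the example endpoints are hypothetical, and the model is deliberately simplified to sets of resource names that each request consumes and produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    consumes: frozenset = frozenset()  # resources that must already exist
    produces: frozenset = frozenset()  # resources created on success


def schedule(requests, failing=frozenset()):
    """Return the names of requests that become eligible, in order.

    A request is eligible only once every resource it consumes has been
    produced by an earlier request that succeeded. Requests named in
    `failing` are attempted but create nothing, so their dependants
    stay blocked.
    """
    available = set()
    order = []
    pending = list(requests)
    progress = True
    while pending and progress:
        progress = False
        for req in list(pending):
            if req.consumes <= available:
                pending.remove(req)
                order.append(req.name)
                progress = True
                if req.name not in failing:
                    available |= req.produces
    return order


# Hypothetical API: a post needs a user, and deleting needs a post.
api = [
    Request("DELETE /posts/{id}", consumes=frozenset({"post_id"})),
    Request("POST /posts", frozenset({"user_id"}), frozenset({"post_id"})),
    Request("POST /users", produces=frozenset({"user_id"})),
]
print(schedule(api))
print(schedule(api, failing=frozenset({"POST /posts"})))
```

When the post-creation request fails in the second call, the delete request never becomes eligible, which mirrors the idea of not spending effort on requests whose prerequisites do not exist.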

New complexity in AI systems and an increased reliance on using data to develop and train those systems bring with them increased requirements for keeping those systems secure. This year, researchers at Microsoft started the Privacy Preserving Machine Learning (PPML) initiative to address the need for preserving individual data privacy throughout the ML pipeline. Differential privacy (DP) is one technique that plays an important role in this initiative. Microsoft Research is pushing the boundaries in DP research with the overarching goal of providing Microsoft customers with the best possible productivity experiences through improved ML models for NLP while providing highly robust privacy protections.
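For readers new to differential privacy, the following is a minimal sketch of the textbook Laplace mechanism applied to a simple counting query. It is offered only to show the basic idea of trading accuracy for privacy through calibrated noise; it is not the PPML initiative’s method, and training ML models with DP involves different machinery. The function names and the example data are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of records matching `predicate`.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1 / epsilon gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of 1,000 people, query "how many are under 18?"
ages = [age % 90 for age in range(1000)]
print(private_count(ages, lambda age: age < 18, epsilon=0.5))
```

A smaller `epsilon` means more noise and stronger privacy; a larger one gives answers closer to the true count.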

Graphic shows framework of Privacy Preserving Machine Learning

Cryptography and privacy researchers are committed to protecting the confidentiality of people’s data, and in January, they collaborated with the Microsoft Edge product team to introduce Password Monitor for Microsoft Edge, a security feature that notifies users if any of their saved passwords has been found in a third-party breach. The underlying technology uses homomorphic encryption, a technique pioneered at Microsoft Research, which ensures the privacy and security of users’ passwords.

Watch For, a technology incubated in Microsoft Research, has been helping to create safe and inclusive digital spaces for Xbox and other platforms. The real-time media content analytics platform traces its beginnings back to the 2017 internal hackathon hosted by Microsoft and has garnered attention as the engine behind HypeZone, gaming’s version of NFL RedZone. Watch For is now officially a part of Xbox Family, Trust, and Safety and will continue to help support content moderation and online safety throughout Microsoft.


Engaging with the broader research community and looking to the future

Microsoft Research values its ties to the academic community, and it continued to support research and learning through its fellowship and grant opportunities in 2021.

  • The Microsoft Research PhD Fellowship was awarded to 40+ recipients around the world in 2021. The fellowship seeks to empower the next generation of exceptional computing talent in order to build a stronger and more inclusive computing-related research community.
  • The Microsoft Research Dissertation Grant provides research funding for doctoral students who are underrepresented in the field of computing, with the goal of increasing the pipeline of diverse talent receiving advanced degrees in computing. In their work, this year’s grant recipients explore technology applications for accessibility, healthcare, entrepreneurship, digital literacy, and other areas.
  • The Microsoft Research Faculty Fellowship recognizes innovative, promising new faculty whose work and talent identify them as emerging leaders in their fields. The 2021 Faculty Fellows’ work ranges from investigating new methods in cryptography to applications of signal processing and ML in biomedicine.

This year, Microsoft Research made changes that enabled our growth and created new opportunities. In January, Ashley Llorens joined Microsoft Research as VP, Distinguished Scientist, and Managing Director of Microsoft Research Outreach. He’s helping Microsoft Research achieve its mission of amplifying the impact of research at Microsoft and advancing the cause of science and technology research worldwide. Before joining Microsoft, Llorens served as founding chief of the Intelligent Systems Center at the Johns Hopkins Applied Physics Laboratory.

In July, Microsoft Research announced the addition of a new satellite research lab in Amsterdam. Building on work being pursued at Microsoft Research Cambridge and Microsoft Research Asia in a larger research effort at Microsoft, the Amsterdam lab will focus on molecular simulation using ML. Distinguished scientist and renowned ML researcher Max Welling will lead the lab, bringing a deep background in physics and quantum computing to the role. By using compute power to run physical simulations, he and the lab’s growing team hope to help Microsoft further explore the application of ML to molecular science and uncover its tremendous potential in tackling some of the most important challenges facing society, including climate change, drug discovery, and understanding biology to help treat disease.

To confront challenges through research and technology, we’re sometimes required to engage in new ways. This year, Microsoft Research started the Race and Technology lecture series, a virtual speaker series designed to foster understanding of, and inspire continued research into, how the perception of race influences technology and vice versa through the work of distinguished academics and domain experts. This series continues through June 2022.

Race and Technology: A Research Lecture Series features 14 distinguished scholars and domain experts from a diverse range of research areas and disciplines. From top left: Dr. Sareeta Amrute, Dr. Kim TallBear, Dr. Charlton McIlwain, Dr. Ruha Benjamin, Dr. Lisa Nakamura, Dr. Simone Browne, and Dr. André Brock. From bottom left: Dr. Sohini Ramachandran, Dr. C. Brandon Ogbunu, Dr. Kishonna L. Gray, Dr. Desmond Upton Patton, Merisa Heu-Weller, J.D., Dr. Denae Ford Robinson, and Dr. A. Nicki Washington.

In 2021, Microsoft Research broadened its engagement with the larger global research community, and the events we hosted and attended this year provided new and enriching opportunities to engage with our community of researchers. Highlights include the ACM Special Interest Group on Data Communication (SIGCOMM) 2021 in August, where Microsoft was a gold sponsor, and the 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021) in December. Visit our event page for the full list of events and conferences in which Microsoft participated.

2021 Awards
Over the years, the scientific community has recognized the outstanding and pioneering work done by Microsoft researchers. Here are some highlights of the awards received in 2021:

Susan Dumais elected into ACM SIGIR Academy

Abi Sellen elected a Fellow of the Royal Society

Ranveer Chandra included in Newsweek’s inaugural list of America’s Greatest Disruptors

Sébastien Bubeck awarded Outstanding Paper Award at NeurIPS 2021

Neeraj Kayal awarded Infosys 2021 Prize for Mathematical Sciences

Explore the index of awards recognizing Microsoft researchers’ contributions in 2021 on our News and Awards page.

For 30 years, Microsoft Research has invested in rigorous scientific research and ambitious long-term thinking. We have made a lot of progress on both foundational and real-world challenges, but our work is not done. In the coming year, we’ll continue to build on the foundation we’ve developed and focus on creating solutions that drive long-term real-world impact, and ultimately help to create a more resilient, sustainable, and healthy global society. We look forward to the breakthroughs that can make that happen.

Hear from generations of Microsoft researchers as they reflect on the past and look ahead to the future at Microsoft Research. Explore the 30th Anniversary Generations of Inspirational and Impactful Research panel series to learn more.

To stay up to date on all things research at Microsoft, follow our blog and subscribe to our newsletter and the Microsoft Research Podcast. You can also follow us on Facebook, Twitter, YouTube, and Instagram.
