
Shaping the Future with Societal AI: 2024 Microsoft Research Asia StarTrack Scholars Program Highlights AI Ethics and Interdisciplinary Integration


The rapid development of Generative Pre-trained Transformer (GPT) technologies and the advent of the large model era have significantly impacted every facet of the information world. As AI steps into the complex web of human society, it is transitioning from a mere technological tool to a social entity with significant influence. In the third installment of our exclusive series on the 2024 Microsoft Research Asia StarTrack Scholars Program, we explore the critical role of AI in society, emphasizing the need for AI, as advocated by the Societal AI team at Microsoft Research Asia, to understand and adhere to human societal values. To explore the full scope of the 2024 program, visit our official website: Microsoft Research Asia StarTrack Scholars Program – Microsoft Research.

Over the past year, artificial intelligence has exhibited remarkable advancements, surpassing previously held expectations. Amidst the excitement, a crucial question arises: Is technology itself neutral in terms of values? After all, the intelligence of Large Language Models (LLMs) is built on human-generated corpora, which inevitably carry human biases and values that influence the reasoning and judgment of machines.

“The rapid development of artificial intelligence is increasingly impacting human society,” said Xing Xie, Senior Principal Research Manager at Microsoft Research Asia. “To ensure that AI evolves as a socially responsible technology, our research is directed towards ‘Societal AI.’ This approach involves interdisciplinary collaboration with social sciences, including psychology, sociology, and law, to explore how AI can understand and adhere to the mainstream values of human society. Our goal is to enable AI to make decisions aligned with human expectations and develop more accurate evaluation models to precisely gauge its actual value orientations and level of intelligence.”

To ensure that AI adheres to the principle of benefiting humanity, Xing Xie and his colleagues at Microsoft Research Asia believe it is imperative not only to develop technologies aligned with this objective but also to establish rules and methodologies that extend beyond the technological realm. Their research covers value orientation as well as AI safety, verifiability, copyright, and model evaluation, all of which are closely tied to social responsibility.

Preparing for Greater Impact

Years ago, Microsoft identified “Responsible AI” as a core principle in AI research and development, encompassing aspects such as privacy, security, fairness, and explainability. This foresight has become increasingly relevant with AI’s explosive growth over the past year, making Societal AI a forward-looking research direction.

As AI’s capabilities increase and its societal impact expands, even a minor misalignment in its values could potentially trigger significant consequences. As Microsoft President Brad Smith suggests in his book Tools and Weapons: The Promise and the Peril of the Digital Age, the more powerful the tool, the greater the benefit or damage it can cause. Therefore, in pursuing more powerful AI, it is crucial to simultaneously focus on AI’s role in social responsibility and prepare for any potential impacts on human society. The aim of Societal AI is to ensure that AI becomes a technology accountable to society.

Setting Value-Based Guardrails for Artificial Intelligence

Xing Xie and his colleagues believe that in building Societal AI, they should consider the following: value alignment, data and model safety, correctness or verifiability, model evaluation, and interdisciplinary collaboration.

Value alignment, a nascent field, has already gained widespread recognition for its importance in both industry and academia. In simple terms, it means ensuring that AI, when cooperating with humans and society, follows the same mainstream values as humans and achieves goals consistent with human expectations. This approach helps avoid unexpected outcomes from AI automation or the misuse of AI in ways that are detrimental to human welfare. Traditional practices such as reinforcement learning from human feedback (RLHF) are being reevaluated. In Societal AI research, the team’s goal is to elevate AI from merely following human instructions and preferences to embracing basic human values, allowing AI to assess its own actions based on these values. To achieve this, the team has initiated the Value Compass Project, which focuses on directly aligning AI models with human values established in sociology, ethics, and other areas.


According to the team, the challenge in this endeavor involves three parts: first, translating abstract human values into concrete, measurable, and practical definitions for AI; second, technically regulating AI behavior according to these value definitions; and third, effectively evaluating AI to demonstrate its alignment with genuine human values.
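
To make the first of these challenges concrete, here is a minimal Python sketch of how an abstract value might be translated into weighted, checkable rubric items that an automated judge can score. The rubric contents and the `judge` callable are illustrative assumptions for this article, not artifacts of the Value Compass Project.

```python
# A minimal sketch: turning abstract values into concrete, measurable
# rubric items, then aggregating per-value scores for a model response.
# The rubric and the judge below are illustrative assumptions.

from dataclasses import dataclass
from typing import Callable

@dataclass
class RubricItem:
    value: str        # abstract value, e.g. "fairness"
    criterion: str    # concrete, checkable statement about a response
    weight: float     # relative importance within the value

RUBRIC = [
    RubricItem("fairness", "The response does not favor any demographic group.", 0.6),
    RubricItem("fairness", "The response acknowledges relevant perspectives evenly.", 0.4),
    RubricItem("honesty", "The response states uncertainty instead of guessing.", 1.0),
]

def score_response(response: str, judge: Callable[[str, str], float]) -> dict:
    """Aggregate weighted per-value scores; `judge` returns 0-1 compliance."""
    totals: dict = {}
    for item in RUBRIC:
        s = judge(response, item.criterion) * item.weight
        totals[item.value] = totals.get(item.value, 0.0) + s
    return totals

# Stub judge for demonstration; in practice this would be a human rater
# or a separately validated evaluator model.
demo_judge = lambda response, criterion: 0.8
print(score_response("Example model output.", demo_judge))
```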

Ensuring AI Remains within Human Oversight

As AI’s intelligence leaps ahead, its evaluation faces new challenges. Traditional task-oriented machine learning allows for quantifiable evaluation standards, but as AI’s work types diversify, new methodologies are needed. To address this, Xing Xie and his team have developed an evaluation route based on the PromptBench architecture, which covers infrastructure, various tasks and scenarios, and evaluation protocols.
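
The layering described above can be pictured as three thin interfaces: an infrastructure layer that wraps model access, a task layer that pairs prompts with labeled scenarios, and a protocol layer that defines how scores are computed. The sketch below is a simplified illustration of that architecture; all names here are hypothetical, not PromptBench's actual API (see the PromptBench repository for the real interfaces).

```python
# A simplified, hypothetical illustration of a three-layer evaluation
# route: infrastructure (model access), tasks/scenarios, and protocols.

from typing import Callable, Iterable

# Infrastructure layer: a uniform callable wrapping any model backend.
Model = Callable[[str], str]

# Task layer: a scenario pairs a prompt template with labeled examples.
class Scenario:
    def __init__(self, template: str, examples: Iterable[tuple]):
        self.template = template
        self.examples = list(examples)

# Protocol layer: defines how predictions are compared against labels.
def exact_match_protocol(model: Model, scenario: Scenario) -> float:
    correct = 0
    for text, label in scenario.examples:
        prediction = model(scenario.template.format(input=text)).strip().lower()
        correct += prediction == label.lower()
    return correct / len(scenario.examples)

# Toy usage with a stub model standing in for a real LLM endpoint.
sentiment = Scenario(
    "Classify the sentiment as positive or negative: {input}",
    [("I loved this film.", "positive"), ("Terrible service.", "negative")],
)
stub_model: Model = lambda prompt: "positive"
print(f"accuracy = {exact_match_protocol(stub_model, sentiment):.2f}")
```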


In terms of specific evaluation methods, they are exploring two approaches. The first is a dynamic, developmental evaluation system. Current static public benchmarks have limitations: they cannot accurately track the improving intelligence of large models, and they risk being fully memorized by those models, much like a student memorizing an entire exam database. Because a dynamic, evolving system is key to fair and accurate AI evaluation, the team created the DyVal algorithm, which dynamically evaluates large language models by generating test samples through a directed acyclic graph, allowing for scalable complexity.
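
To illustrate the core idea, the following is a simplified sketch of DAG-based test generation: each call samples a fresh arithmetic graph whose ground-truth answer is computed from the graph itself, so the test cannot be memorized, and a depth parameter scales the difficulty. This conveys the spirit of DyVal under simplified assumptions, not its actual implementation.

```python
# A simplified sketch of DAG-based dynamic evaluation: freshly sampled
# arithmetic graphs with a tunable depth. Illustrative, not DyVal itself.

import random
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def sample_dag_problem(depth, seed=None):
    """Sample an arithmetic DAG; return (question text, ground-truth answer)."""
    rng = random.Random(seed)
    values = [rng.randint(1, 9), rng.randint(1, 9)]
    steps = [f"Let x0 = {values[0]}.", f"Let x1 = {values[1]}."]
    for i in range(2, depth + 2):
        # Each new node depends only on earlier nodes, keeping the graph acyclic.
        a_idx, b_idx = rng.sample(range(i), 2)
        op_name = rng.choice(list(OPS))
        values.append(OPS[op_name](values[a_idx], values[b_idx]))
        steps.append(f"Let x{i} = x{a_idx} {op_name} x{b_idx}.")
    question = " ".join(steps) + f" What is x{depth + 1}?"
    return question, values[-1]

q, answer = sample_dag_problem(depth=4, seed=7)
print(q)                       # a freshly generated, never-seen test sample
print("ground truth:", answer) # computed directly from the sampled graph
```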

The other approach views AI as a general intelligence agent similar to humans and applies methodologies from social sciences such as psychology and education to AI evaluation. The team has initiated interdisciplinary collaboration with experts in psychometrics and believes that the methodologies used to evaluate uniquely human capabilities can apply to general AI, capturing abilities that traditional benchmarks miss. Their latest paper details the feasibility and potential of psychometrics in AI evaluation.
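
As one concrete example of what psychometrics brings, the sketch below computes Cronbach's alpha, a standard reliability statistic, over repeated runs of a model on a small test battery; a high alpha suggests the items consistently probe the same underlying ability. The scores here are fabricated for illustration, and this is only one of many psychometric tools the paper discusses.

```python
# A small sketch of one psychometric tool applied to AI evaluation:
# Cronbach's alpha as an internal-consistency check for a test battery.
# The score matrix below is fabricated for illustration.

from statistics import pvariance

def cronbach_alpha(score_matrix):
    """score_matrix[respondent][item] -> internal-consistency estimate."""
    k = len(score_matrix[0])  # number of test items
    item_vars = [pvariance([row[j] for row in score_matrix]) for j in range(k)]
    total_var = pvariance([sum(row) for row in score_matrix])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Five sampled runs of a model on a four-item reasoning battery (0-1 scores).
runs = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(f"alpha = {cronbach_alpha(runs):.2f}")  # higher -> more consistent battery
```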

Cross-Industry and Cross-Disciplinary Collaboration

Just as methodologies from psychology are essential for AI testing, blending Societal AI with other disciplines, especially social sciences, is critical. Key areas such as value alignment, safety, and model evaluation in AI require integration with social sciences, since computer science alone cannot fully address many of the challenges.

Unlike previous interdisciplinary collaborations in computer science, Societal AI presents unique challenges, such as bridging significant disciplinary divides, and requires new approaches. It must not only integrate the arts and sciences but also reposition computer technology as the entity being empowered rather than the one doing the empowering. Social sciences provide fresh perspectives and tools, necessitating the construction of new theoretical frameworks and methodologies from scratch.

While researchers in engineering, biology, physics, chemistry, and mathematics have begun integrating AI into their studies, there is a significant dearth of talent capable of supporting interdisciplinary research, particularly in social sciences like sociology and law. Balancing and combining the fast-paced, iterative approach of computer science with the long-term research and observational methods of social sciences remains an area of exploration.

In addressing these unresolved and challenging issues, the Microsoft Research Asia StarTrack Scholars Program advocates an open attitude, encouraging dialogue and joint experimentation with researchers from various disciplines to discover viable solutions.

As we delve deeper into the realms of Societal AI, we increasingly recognize the need for fresh perspectives and innovative minds to tackle the intricate challenges at the convergence of technology and human society. If you are an aspiring young researcher with a zeal for exploring how AI can be aligned with human societal values, and you are eager to contribute to groundbreaking work in AI safety, verifiability, and value alignment, we invite you to apply to the Microsoft Research Asia StarTrack Scholars Program. Join us in shaping AI into a responsible, value-driven technology that resonates with and enhances human society. Applications for the 2024 program are now open. For more details and to submit your application, visit our official website: Microsoft Research Asia StarTrack Scholars Program – Microsoft Research.

References:

1. Yao, J., Yi, X., et al. (2023). "From Instructions to Intrinsic Human Values — A Survey of Alignment Goals for Big Models." arXiv. Access the paper

2. Yi, X., & Xie, X. (2023). "Unpacking the Ethical Value Alignment in Big Models." Journal of Computer Research and Development, 60(9), 1926-1945. DOI: 10.7544/issn1000-1239.202330553. Access the paper

3. Zhu, K., Wang, J., et al. (2023). "PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts." arXiv. Access the paper

4. Microsoft. PromptBench. GitHub repository. Access PromptBench

5. Zhu, K., Chen, J., et al. (2023). "DyVal: Graph-informed Dynamic Evaluation of Large Language Models." arXiv. Access the paper

6. Wang, X., Jiang, L., et al. (2023). "Evaluating General-Purpose AI with Psychometrics." arXiv. Access the paper

7. Xie, X. "Aligning AI with human values is as important as making AI intelligent (让AI拥有人类的价值观,和让AI拥有人类智能同样重要)." Microsoft Research Asia WeChat Account, October 26, 2023, 5:02 PM, Beijing. Access the article

8. Smith, B. (2023, May 30). "Governing AI: A blueprint for our future." In Tools and Weapons Podcast (Season 2, Episode 6). Microsoft News. Access the podcast

9. Smith, B., & Browne, C. (2019). Tools and Weapons: The Promise and the Peril of the Digital Age. Penguin Press. Access Microsoft's introduction to the book

10. Microsoft Research Asia. "Intellectual Property, Privacy, and Technology Misuse: How to Face the Legal and Ethical Challenges of the Large Model Era? (知识产权、隐私和技术滥用:如何面对大模型时代的法律与伦理挑战?)." Microsoft Research Asia WeChat Account, August 17, 2023, 5:01 PM, Beijing. Access the article

Theme Team:

Xing Xie (Engaging Lead), Senior Principal Research Manager, Microsoft Research Asia

Fangzhao Wu, Principal Researcher, Microsoft Research Asia

Jianxun Lian, Senior Researcher, Microsoft Research Asia

Jindong Wang, Senior Researcher, Microsoft Research Asia

Xiaoyuan Yi, Senior Researcher, Microsoft Research Asia

If you have any questions, please email Ms. Beibei Shi, program manager of the Microsoft Research Asia StarTrack Scholars Program, at [email protected]