April 11, 2023

The Workshop on Understanding and Evaluating Big Models for Human Intelligence and Learning

Location: Virtual

Date: April 11, 2023

Timetable:

China Standard Time (UTC+8) Central European Summer Time (UTC+2) Eastern Daylight Time (UTC-4) Item Speaker
20:00 – 20:05 14:00 – 14:05 08:00 – 08:05 Opening Host:
Xiting Wang, Microsoft Research Asia

Speaker:
Fang Luo, Beijing Normal University
Session 1: Assessing Human and AI Capabilities: Convergence and Integration
20:05 – 20:25 14:05 – 14:25 08:05 – 08:25 Keynote: Integrating the Evaluation of Artificial and Natural Intelligence: Are We Ready Yet?
[slides, video]

Abstract: Despite a long-standing history of psychological theories and methods, the prevailing evaluation approach in AI is still based on observable task performance metrics. Yet this leaderboard culture has led to a crisis of predictability in AI. In this presentation we will briefly cover over two decades of efforts to incorporate ideas from psychology and other disciplines into AI evaluation, such as identifying difficulty levels, leveraging item response theory, conceptualising generality and employing a cognitive approach to infer capabilities.
These methods have been carefully deployed under the umbrella of a universal comparative psychology, championed as the remedy to anthropocentrism in AI evaluation. Today, with the advent of large language models and other foundation models exhibiting a fascinating blend of general-purpose and human-like behaviour, it becomes increasingly evident that task-oriented evaluation based on aggregate benchmark performance is not fit for purpose. Are we ready for a paradigm shift?
Can we assess the latest and rarest kinds of intelligence without the populational compass of human intelligence? This presentation will explore the most promising pathways for integrating AI and human evaluation, with a particular focus on a future where humans are augmented by AI.
Host:
Xiting Wang, Microsoft Research Asia

Speaker:
José Hernández-Orallo, Universitat Politècnica de València, Leverhulme Centre for the Future of Intelligence, Centre for the Study of Existential Risk
20:25 – 21:55 14:25 – 15:55 08:25 – 09:55 Panel Discussion: Assessing Human and AI Capabilities: Convergence and Integration

Abstract: We will examine how psychological theories and methods can inform the evaluation and improvement of big models’ performance and robustness, and how the evaluation of big models may provide valuable insights for psychology.
Host:
Xiting Wang, Microsoft Research Asia

Panelists:
  • Jinyan Fan, Auburn University
  • José Hernández-Orallo, Universitat Politècnica de València, Leverhulme Centre for the Future of Intelligence, Centre for the Study of Existential Risk
  • Marija Slavkovik, University of Bergen
  • Clemens Stachl, University of St. Gallen
  • David Stillwell, University of Cambridge
Session 2: The Future of Education with Big Models
22:00 – 22:20 16:00 – 16:20 10:00 – 10:20 Keynote: “Are we there yet?” —The promise of the Holodeck for the future of education
[slides, video]

Abstract: Since its introduction in 1974 in the Star Trek television franchise, visionaries of education have dreamed of the Holodeck, a virtual and malleable space that can teach, train, and assess by immersion. The recent launches of large multimodal computational models have made us all dream of the modern Socratic tutor accessible to all, of the immersive experience of learning and training, and of virtual in-situ multimodal assessments. So, are we there yet?
In this presentation I’ll focus on three areas of interest: 1. Construct definition: what and how should we teach in the times of AI? 2. Assessment design & development: what and how should we assess the relevant skills in the New World? 3. Social context: what type of guardrails do we need to protect and support students, teachers, and the integrity of the educational experience?
I will conclude with my personal thoughts on the value of humanity in a fast world of technological prowess and on the dream of the Holodeck.
Host:
Luning Sun, University of Cambridge

Speaker:
Alina A von Davier, Duolingo, EdAstra Tech, University of Oxford, Carnegie Mellon University
22:20 – 23:55 16:20 – 17:55 10:20 – 11:55 Panel Discussion: The Future of Education with Big Models

Abstract: We will address the educational implications of big model development and deployment, including preparing the next generation for a world where AI is ubiquitous and influential.
Host:
Luning Sun, University of Cambridge

Panelists:
  • Xiangen Hu, The University of Memphis, Central China Normal University
  • Yu Lu, Beijing Normal University
  • Bryan Maddox, University of Cambridge, University of East Anglia, University of Oslo, Assessment MicroAnalytics Ltd
  • Alina A von Davier, Duolingo, EdAstra Tech, University of Oxford, Carnegie Mellon University
  • Mengxiao Zhu, University of Science and Technology of China
23:55 – 24:00 17:55 – 18:00 11:55 – 12:00 Closing Xing Xie, Microsoft Research Asia