Research talk: Breaking the deadly triad with a target network
The deadly triad refers to the instability that arises when an off-policy reinforcement learning (RL) algorithm employs function approximation and bootstrapping simultaneously, and it is a major challenge in off-policy RL. Join PhD student Shangtong Zhang, from the WhiRL group at the University of Oxford, to learn how the target network can be used as a tool for theoretically breaking the deadly triad. Together, you’ll explore a theoretical account of the conventional wisdom that a target network stabilizes training; a novel target network update rule that augments the commonly used Polyak-averaging-style update with two projections; and how a target network can be used in linear off-policy RL algorithms, in both prediction and control settings and in both discounted and average-reward Markov decision processes.
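To give a flavor of the kind of update rule described above, here is a minimal sketch, not the talk's exact algorithm: a Polyak-averaging target network update augmented with two projections. The choice of projection set (an L2 ball), the radius, and all function and parameter names are illustrative assumptions, not details taken from the talk.

```python
# Minimal sketch (assumed details, not the talk's exact rule): Polyak-averaging
# target update with two projections onto a hypothetical L2 ball.
import numpy as np

def project_l2_ball(w: np.ndarray, radius: float) -> np.ndarray:
    """Project a weight vector onto the L2 ball of the given radius."""
    norm = np.linalg.norm(w)
    if norm <= radius:
        return w
    return w * (radius / norm)

def polyak_target_update(theta: np.ndarray,
                         theta_target: np.ndarray,
                         tau: float = 0.01,
                         radius: float = 100.0) -> np.ndarray:
    """Return updated target weights.

    Plain Polyak averaging is
        theta_target <- (1 - tau) * theta_target + tau * theta.
    Here the online weights are projected before mixing, and the mixed
    result is projected again, so the target iterates stay in a bounded set.
    """
    mixed = (1.0 - tau) * theta_target + tau * project_l2_ball(theta, radius)
    return project_l2_ball(mixed, radius)
```

Intuitively, keeping both the mixed-in online weights and the resulting target weights inside a bounded set rules out the divergent iterates that the deadly triad can otherwise produce; the talk develops the precise rule and its analysis.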
Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit
- Track: Reinforcement Learning
- Date:
- Speaker: Shangtong Zhang
- Affiliation: Oxford University
Shangtong Zhang
PhD Student
Oxford University
Research talk: Reinforcement learning with preference feedback
Speakers: Aadirupa Saha

Panel: Generalization in reinforcement learning
Speakers: Mingfei Sun, Roberta Raileanu, Harm van Seijen

Research talk: Successor feature sets: Generalizing successor representations across policies
Speakers: Kiante Brantley

Research talk: Towards efficient generalization in continual RL using episodic memory
Speakers: Mandana Samiei

Research talk: Breaking the deadly triad with a target network
Speakers: Shangtong Zhang

Panel: The future of reinforcement learning
Speakers: Geoff Gordon, Emma Brunskill, Craig Boutilier