Microsoft Research Summit 2021 • Videos

Research talk: Safe reinforcement learning using advantage-based intervention

Many sequential decision problems involve finding a policy that maximizes total reward while obeying safety constraints. Although much recent research has focused on the development of safe reinforcement learning (RL) algorithms that produce a safe policy after training, ensuring safety during training as well remains an open problem. A fundamental challenge is performing exploration while still satisfying constraints in an unknown Markov decision process (MDP). In this work, we address this problem for the chance-constrained setting. We propose a new algorithm, SAILR, that uses an intervention mechanism, based on advantage functions, to keep the agent safe throughout training and optimizes the agent’s policy using off-the-shelf RL algorithms designed for unconstrained MDPs. Our method comes with strong guarantees on safety during both training and deployment (that is, after training and without the intervention mechanism) and policy performance compared to the optimal safety-constrained policy. In our experiments, we show that SAILR violates constraints far less during training than standard safe RL and constrained MDP approaches and converges to a well-performing policy that can be deployed safely without intervention.
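The abstract does not spell out the intervention rule, so the following is only a minimal sketch of what an advantage-based intervention could look like, not the authors' implementation. It assumes a known backup policy and estimates of its safety cost: `q_cost(s, a)` (risk of taking action `a` and then following the backup policy) and `v_cost(s)` (risk of following the backup policy from `s`). The class name, the toy chain environment, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Tuple

State = Hashable
Action = Hashable


@dataclass
class AdvantageIntervention:
    """Hypothetical advantage-based intervention rule (illustrative only)."""
    q_cost: Callable[[State, Action], float]   # assumed risk estimate for (state, action)
    v_cost: Callable[[State], float]           # assumed risk estimate for state under backup policy
    backup_policy: Callable[[State], Action]   # assumed safe fallback policy
    threshold: float = 0.0                     # assumed tolerance on extra risk

    def advantage(self, state: State, action: Action) -> float:
        # Extra risk incurred by the proposed action relative to the backup policy.
        return self.q_cost(state, action) - self.v_cost(state)

    def should_intervene(self, state: State, action: Action) -> bool:
        return self.advantage(state, action) > self.threshold

    def filter_action(self, state: State, proposed: Action) -> Tuple[Action, bool]:
        # Return the action to execute and whether an intervention occurred.
        if self.should_intervene(state, proposed):
            return self.backup_policy(state), True
        return proposed, False


# Toy chain: states 0..4, where stepping "right" from state 3 enters an unsafe state.
def toy_q_cost(state: int, action: str) -> float:
    return 1.0 if (state == 3 and action == "right") else 0.0


def toy_v_cost(state: int) -> float:
    return 0.0  # the toy backup policy always moves left, so it never becomes unsafe


def toy_backup(state: int) -> str:
    return "left"


shield = AdvantageIntervention(toy_q_cost, toy_v_cost, toy_backup, threshold=0.5)
```

In this sketch, an unconstrained RL algorithm would propose actions and `filter_action` would override only those whose estimated extra risk exceeds the threshold. How the learner is penalized for interventions, and how the estimates are obtained, are left out here.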

Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit

Event: Microsoft Research Summit 2021
Track: Reinforcement Learning
Date: October 20, 2021
Speaker: Nolan Wagener
Affiliation: Georgia Tech

- Nolan Wagener, Graduate Student, Georgia Tech
Research areas
- Artificial intelligence
Events
- Microsoft Research Summit 2021

Reinforcement Learning

Opening remarks: Reinforcement Learning
October 20, 2021
Speakers:

Katja Hofmann
Keynote: Key research challenges for real world reinforcement learning
October 20, 2021
Speakers:

John Langford
Research talk: Reinforcement learning with preference feedback
October 20, 2021
Speakers:

Aadirupa Saha
Research talk: Safe reinforcement learning using advantage-based intervention
October 20, 2021
Speakers:

Nolan Wagener
Research talk: Evaluating human-like navigation in 3D video games
October 20, 2021
Speakers:

Raluca Stevenson,

Ida Momennejad
Research talk: Maia Chess: A human-like neural network chess engine
October 20, 2021
Speakers:

Reid McIlroy-Young
Fireside chat: Opportunities and challenges in human-oriented AI
October 20, 2021
Speakers:

Ashley Llorens,

Katja Hofmann,

Siddhartha Sen
Research talk: Making deep reinforcement learning industrially applicable
October 20, 2021
Speakers:

Jiang Bian,

Tie-Yan Liu
Panel: Generalization in reinforcement learning
October 20, 2021
Speakers:

Mingfei Sun,

Roberta Raileanu,

Harm van Seijen

and others
Research talk: Project Dexter: Machine learning and automatic decision-making for robotic manipulation
October 20, 2021
Speakers:

Andrey Kolobov,

Ching-An Cheng
Research talk: Successor feature sets: Generalizing successor representations across policies
October 20, 2021
Speakers:

Kiante Brantley
Research talk: Towards efficient generalization in continual RL using episodic memory
October 20, 2021
Speakers:

Mandana Samiei
Research talk: Breaking the deadly triad with a target network
October 20, 2021
Speakers:

Shangtong Zhang
Panel: The future of reinforcement learning
October 20, 2021
Speakers:

Geoff Gordon,

Emma Brunskill,

Craig Boutilier

and others
Closing remarks: Reinforcement Learning
October 20, 2021
Speakers:

John Langford