Learning Safety Models for Reinforcement Learning

Safety in deep reinforcement learning is challenging because the learned strategy cannot easily be inspected. In this project, we want to learn an additional safety model based on logic constraints.

Deep reinforcement learning (RL) has been successful in many applications, e.g. robotics, game playing, and autonomous control. Using only the collected rewards as feedback, deep RL methods train a neural network to pick one of multiple actions for a given state. Safety, however, is an issue, because there is no guarantee that the agent actually learns to recognize and avoid potentially dangerous failure states.

There has been some work on encapsulating a deep RL agent with additional safety measures, e.g. based on rule systems and constraints, which has been shown to be safe and effective for both training and deployment [1]. Still, it requires an expert to manually design and implement these safety rules, which is time-consuming, difficult, and sometimes not even possible due to the complexity of the environment.
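As a rough illustration of what such an encapsulation can look like, the following minimal sketch filters the agent's candidate actions through hand-written safety rules before the best remaining action is chosen. It is not the method from [1]; the state layout, the rule `no_step_into_cliff`, the action names, and the fallback action are all hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List, Sequence

State = Dict[str, int]
# A rule returns True if taking `action` in `state` is allowed.
Rule = Callable[[State, str], bool]


def safe_actions(state: State, actions: Sequence[str], rules: Sequence[Rule]) -> List[str]:
    """Keep only the actions that satisfy every safety rule."""
    return [a for a in actions if all(rule(state, a) for rule in rules)]


def select_action(state: State, q_values: Dict[str, float],
                  rules: Sequence[Rule], fallback: str) -> str:
    """Pick the highest-valued action among those the rules allow."""
    allowed = safe_actions(state, list(q_values), rules)
    if not allowed:
        # No action passes the rules: use a predefined fallback (assumption).
        return fallback
    return max(allowed, key=lambda a: q_values[a])


# Hypothetical expert-written rule: never move right when the cliff is adjacent.
def no_step_into_cliff(state: State, action: str) -> bool:
    return not (action == "right" and state["distance_to_cliff"] <= 1)


state = {"distance_to_cliff": 1}
# Stand-in for the action values a trained network would output.
q_values = {"left": 0.1, "right": 0.9, "stay": 0.3}
chosen = select_action(state, q_values, [no_step_into_cliff], fallback="stay")
print(chosen)
```

In this toy setting the agent's preferred action is blocked by the rule, so the next-best allowed action is taken instead. The rule itself still has to be written by hand, which is the effort this project aims to reduce.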

In this project, we want to learn logic-based safety constraints during the RL agent’s training procedure by using Constraint Acquisition [2] methods. These methods learn a logic-based constraint model from positive and negative examples; the model can then be queried by the RL agent and examined by a human to understand the safety-related behaviour of the RL agent.
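To make the idea of learning from positive and negative examples more concrete, here is a deliberately simplified, elimination-style sketch. It is not one of the algorithms from [2]: it assumes a small, fixed set of candidate constraints (the "bias"), keeps every candidate that all positive (safe) examples satisfy, and reports negative (unsafe) examples that the learned model fails to reject. The variables `speed` and `distance` and all candidate constraints are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Example = Dict[str, int]
Constraint = Callable[[Example], bool]


def acquire(bias: Dict[str, Constraint],
            positives: List[Example],
            negatives: List[Example]) -> Tuple[Dict[str, Constraint], List[Example]]:
    """Return the candidates consistent with all positives, plus uncovered negatives."""
    # A candidate violated by any safe example cannot be a safety constraint.
    learned = {name: c for name, c in bias.items()
               if all(c(e) for e in positives)}
    # A negative example is "uncovered" if no learned constraint rejects it,
    # which indicates that the bias is too weak to explain it.
    uncovered = [e for e in negatives
                 if all(c(e) for c in learned.values())]
    return learned, uncovered


def is_safe(learned: Dict[str, Constraint], example: Example) -> bool:
    """Query the learned model: safe if every learned constraint holds."""
    return all(c(example) for c in learned.values())


# Hypothetical bias over two state variables.
bias = {
    "speed <= 3": lambda e: e["speed"] <= 3,
    "speed <= distance": lambda e: e["speed"] <= e["distance"],
    "distance >= 2": lambda e: e["distance"] >= 2,
}
# States observed without failure (positive) and states that led to failure (negative).
positives = [{"speed": 1, "distance": 1},
             {"speed": 2, "distance": 5},
             {"speed": 4, "distance": 6}]
negatives = [{"speed": 3, "distance": 1},
             {"speed": 5, "distance": 2}]

learned, uncovered = acquire(bias, positives, negatives)
print(sorted(learned), uncovered)
```

Because the learned model is a set of named constraints, a human can read it directly, and the agent can call `is_safe` on a candidate successor state before committing to an action.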

There are many angles to this project and to how it is realized, and it is up to the student(s) to decide which route to follow.

It is possible to do the thesis project in a remote setting, i.e. the student does not have to be present at Simula’s offices, although they are of course welcome to be. It is the responsibility of the student to ensure that their university agrees to the cooperation and can eventually issue the degree.


Goal

  • Develop a method to learn safety constraints for reinforcement learning agents
  • Implement and evaluate the method

Learning outcome

  • Learn about reinforcement learning
  • Learn about constraint programming
  • Learn about constraint acquisition


Qualifications

Programming skills (preferably Python or Java), first experience with machine learning, interest in learning


Supervisors

  • Helge Spieker
  • Mohamed Bachir Belaid


References

[1] Spieker, H. (2021). Constraint-Guided Reinforcement Learning: Augmenting the Agent-Environment-Interaction. International Joint Conference on Neural Networks (IJCNN). arxiv.org/abs/2104.11918

[2] Bessiere, C., Koriche, F., Lazaar, N., & O’Sullivan, B. (2017). Constraint acquisition. Artificial Intelligence, 244, 315–342.

Contact person