Testing Reinforcement Learning

This topic explores testing reinforcement learning agents, their training, and their environments through automatic generation of test scenarios.
Master

Metamorphic testing is a software testing technique that generates new test scenarios from existing tests.
In reinforcement learning, this could mean confronting the RL agent with manipulated versions of previously encountered states to elicit an adjusted reaction. The key to this technique is the metamorphic relation, which defines a transformation of the test input together with an expectation of how the test outcome should change, even though the exact outcome cannot be predicted. This makes it easy to generate new scenarios, even without a deep understanding of the environment's dynamics.
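As an illustration, a metamorphic relation for an RL agent could be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed method: it assumes a mirror-symmetric, two-action control task similar to pole balancing, and `policy`, `mirror_state`, `mirror_action`, and `check_mirror_relation` are hypothetical names introduced only for this example. The assumed relation is that mirroring the state should mirror the chosen action, so no oracle for the "correct" action is needed.

```python
import random

def policy(state):
    """Hypothetical stand-in for a trained agent (assumption for this sketch).

    Returns 1 (push right) if the pole leans or moves right, else 0 (push left).
    """
    angle, angular_velocity = state
    return 1 if angle + 0.5 * angular_velocity > 0 else 0

def mirror_state(state):
    """Input transformation: reflect the state around the vertical axis."""
    return tuple(-x for x in state)

def mirror_action(action):
    """Expected output change: left (0) and right (1) swap."""
    return 1 - action

def check_mirror_relation(policy, source_states):
    """Return the source states for which the metamorphic relation is violated."""
    violations = []
    for state in source_states:
        expected = mirror_action(policy(state))   # derived from the source test
        actual = policy(mirror_state(state))      # follow-up test
        if actual != expected:
            violations.append(state)
    return violations

# Source tests: randomly sampled states (in practice, states the agent has encountered).
rng = random.Random(0)
source_states = [(rng.uniform(-0.2, 0.2), rng.uniform(-1.0, 1.0)) for _ in range(100)]
violations = check_mirror_relation(policy, source_states)
print(f"{len(violations)} violations in {len(source_states)} follow-up tests")
```

The check only compares the agent's behavior on a source state with its behavior on the transformed state. For this toy policy, the boundary state `(0.0, 0.0)` violates the relation, because ties are always broken toward action 0; this is the kind of asymmetry such a test is meant to expose.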

The thesis topic requires the student to first become familiar with the current literature on metamorphic testing and recent advances in reinforcement learning.
Afterwards, the student decides on a more precise research question for the application of metamorphic testing in reinforcement learning.
Possible directions are:

  • Robustness testing of trained RL agents on manipulated states and environment reactions
  • Testing the training of RL agents on manipulated environments
  • Applying metamorphic relations to augment states during training to improve sample efficiency
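
The last direction could, for example, take the form of augmenting collected transitions with their transformed counterparts. The following is only a sketch under strong assumptions: the environment is assumed to be mirror-symmetric with two actions, the reward is assumed to be unchanged by mirroring, and `mirror_transition` and `augment` are hypothetical helper names, not part of any library.

```python
def mirror_transition(transition):
    """Apply the assumed mirror relation to a (state, action, reward, next_state, done) tuple."""
    state, action, reward, next_state, done = transition
    return (
        tuple(-x for x in state),       # mirrored state
        1 - action,                     # mirrored action (0 <-> 1)
        reward,                         # assumption: reward is invariant under mirroring
        tuple(-x for x in next_state),  # mirrored successor state
        done,
    )

def augment(transitions):
    """Return the original transitions, each followed by its mirrored counterpart."""
    augmented = []
    for transition in transitions:
        augmented.append(transition)
        augmented.append(mirror_transition(transition))
    return augmented

# One observed transition yields two training samples.
batch = [((0.1, -0.2), 1, 1.0, (0.12, -0.1), False)]
augmented_batch = augment(batch)
print(len(augmented_batch))
```

Whether such augmented samples actually improve sample efficiency depends on how well the assumed relation holds in the environment, which is one of the questions an experimental evaluation would need to address.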


This is a remote thesis project, i.e., the student does not have to be present at Simula's facilities or enrolled at a Norwegian university.
It is the responsibility of the student to ensure that their university agrees to the cooperation and can eventually issue the degree.

 

Goal

Design and perform an experimental evaluation of metamorphic testing in reinforcement learning.

Learning outcome

  • Metamorphic Testing
  • Reinforcement Learning
  • Designing experiments and experimental evaluation

Qualifications

  • Python programming
  • Initial experience in software testing or reinforcement learning
  • The ability to work in a remote collaboration

Supervisors

Helge Spieker

References

[1] Segura, S., Towey, D., Zhou, Z. Q., & Chen, T. Y. (2018). Metamorphic Testing: Testing the Untestable. IEEE Software, 1–1. doi.org/10.1109/MS.2018.2875968
[2] Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). Deep Reinforcement Learning: A Brief Survey. IEEE Signal Processing Magazine, 34(6), 26–38. doi.org/10.1109/MSP.2017.2743240
[3] Spieker, H., & Gotlieb, A. (2020). Adaptive metamorphic testing with contextual bandits. Journal of Systems and Software, 165, 110574. doi.org/10.1016/j.jss.2020.110574

Contact person