Explainable Reinforcement Learning
In recent years, artificial intelligence has made significant strides across many fields, reshaping the landscape of technology and innovation. One of the key factors driving this progress is reinforcement learning (RL), which enables autonomous agents to make decisions and adapt to their surroundings. RL has achieved notable success in games and a growing range of other applications. In general, an RL agent aims to learn a near-optimal policy for a fixed objective by taking actions and receiving feedback through rewards and observations from the environment. The policy is commonly represented by a neural network (NN) that, given an observation of the environment state as input, outputs values indicating which action to take.
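To make this concrete, below is a minimal sketch of such a policy in PyTorch; the class name, dimensions, and architecture are illustrative assumptions rather than part of any specific task.

```python
# Minimal sketch of an NN policy: observation in, one score per action out.
# The class name, dimensions, and architecture are illustrative assumptions.
import torch
import torch.nn as nn

class PolicyNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),  # one value per discrete action
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # unnormalized action scores

# Usage: choose the action with the highest score for a given observation.
policy = PolicyNetwork(obs_dim=4, n_actions=2)
obs = torch.rand(1, 4)                      # placeholder observation
action = policy(obs).argmax(dim=-1).item()  # greedy action selection
```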
However, the complexity of the NN layers in a policy, with their vast number of connections between neurons, often obscures key insights, making it unclear which parts of the network influence a decision. This underlines the need for more research on interpretable RL methods. Addressing this need will facilitate safer decision-making and contribute to the responsible advancement of RL.
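To illustrate what such an explanation can look like, the sketch below applies gradient-based saliency, one widely used attribution technique and not necessarily the approach this project will settle on; the tiny untrained policy, dimensions, and observation are placeholder assumptions.

```python
# Sketch of one common explanation technique: gradient-based saliency scores,
# which estimate how strongly each observation feature influenced the chosen action.
# The tiny untrained policy and dimensions below are illustrative assumptions.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

obs = torch.rand(1, 4, requires_grad=True)    # placeholder observation
scores = policy(obs)                          # one score per action
chosen = scores.argmax(dim=-1).item()         # action the policy would pick

scores[0, chosen].backward()                  # gradient of the chosen action's score
saliency = obs.grad.abs().squeeze(0)          # per-feature influence estimate
print(saliency)  # larger values point to features with more influence on the decision
```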
Goal
- Review methods to explain and interpret RL policies.
- Compare different explainability methods with each other.
- (Potentially developing an approach of your own.)
Learning outcome
- Better understanding of RL
- Learn or improve your knowledge of technologies such as PyTorch
Qualifications
- Python
- PyTorch or TensorFlow
- RL basics
Supervisors
- Dennis Groß
- Helge Spieker