Using Machine Learning to generate Virtual Avatars

Use machine learning to build a virtual avatar that responds to audio and video input from a person and delivers an answer with corresponding facial expressions.

The objectives of this task are to protect vulnerable children from abuse, facilitate the prosecution of offenders, and ensure that innocent adults are not accused of criminal acts. There is, therefore, a need for improved interviewer training to equip police with the skills to conduct high-quality interviews. To support this important task, we propose to research a training program that combines several system components from the field of artificial intelligence, such as chatbots, generation of visual content, text-to-speech, and speech-to-text. The goal of combining these technologies is to create an immersive and interactive child avatar that responds in a realistic way, to help support the training of police interviewers.
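The components mentioned above form a pipeline: the interviewer's speech is transcribed, a dialogue model produces the avatar's answer, the answer is synthesized as audio, and a visual avatar renders it with a facial expression. A minimal sketch of that flow is shown below; all function bodies are stubs and all names are assumptions for illustration — in the real system each stage would wrap a trained model (an ASR model, a dialogue model fine-tuned on interview transcripts, a TTS model, and a talking-head generator).

```python
def speech_to_text(audio: bytes) -> str:
    # Stub: a real implementation would run an ASR model on the audio.
    return "Can you tell me what happened?"

def generate_response(question: str) -> str:
    # Stub: a real implementation would query a dialogue model trained
    # on transcripts of real police interviews.
    return "I was at school when it happened."

def text_to_speech(text: str) -> bytes:
    # Stub: a real implementation would synthesize child-like speech.
    return text.encode("utf-8")

def render_avatar(answer_audio: bytes, mood: str) -> dict:
    # Stub: a real implementation would drive a talking-head model with
    # lip-sync, using the mood label to select a facial expression.
    return {"audio": answer_audio, "expression": mood}

def interview_turn(interviewer_audio: bytes) -> dict:
    """One interviewer question in, one rendered avatar answer out."""
    question = speech_to_text(interviewer_audio)
    answer = generate_response(question)
    answer_audio = text_to_speech(answer)
    return render_avatar(answer_audio, mood="neutral")
```

The point of the sketch is the interface between the stages, not the stubs themselves: each component can be tested, improved, and swapped independently as long as it respects these inputs and outputs.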

The system will be trained on hundreds of real police interviews for the responses, and on real videos of people for the visual avatar and the audio.


A running system that combines multiple machine learning components into a virtual avatar that gives "correct"/reasonable responses, detects mood, and shows realistic facial expressions.
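Mood detection is one of the components such a system needs: the avatar's facial expression should follow the emotional content of what is being said. As a placeholder for a real model trained on annotated transcripts, the hypothetical sketch below uses a simple keyword lookup; the labels and keyword table are assumptions, not part of the proposed system.

```python
# Hypothetical keyword-to-mood table; a real component would be a
# classifier trained on annotated interview data.
MOOD_KEYWORDS = {
    "scared": "fearful",
    "afraid": "fearful",
    "happy": "happy",
    "sad": "sad",
    "angry": "angry",
}

def detect_mood(utterance: str) -> str:
    """Return a coarse mood label for an utterance, 'neutral' if none match."""
    for word in utterance.lower().split():
        if word in MOOD_KEYWORDS:
            return MOOD_KEYWORDS[word]
    return "neutral"
```

Whatever implementation replaces this stub, its output label is what drives the avatar's rendered facial expression.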

Learning outcome

  • Deep understanding of machine learning systems
  • Working on a real-world application
  • Collaboration with researchers
  • Possibility to implement and research a novel application
  • Real-world testing

Qualifications

  • Python programming
  • Knowledge about deep learning and video processing is an advantage

Supervisors

  • Pål Halvorsen
  • Michael Riegler
  • Rune Borgli
  • Steven Hicks

Collaboration partners

  • Oslo Metropolitan University
  • Police / Child protective services


See some examples of technologies that should be tested, improved and integrated:

Contact person