Benchmarking Modern AI Hardware for Natural Language Processing

A new generation of GPUs and AI accelerators such as IPUs promises massive speedups for NLP. Do they deliver in practice?

Since 2018, natural language processing (NLP) has become the fastest-growing branch of AI, as well as the most resource-hungry. It is therefore important to use the best available hardware for all NLP-related computations, since even small differences in efficiency add up to massive savings in compute time and electricity in real-world applications. While NVIDIA GPUs have been the workhorse for NLP (and most other AI computations) over the last decade, they are facing increased competition from new AI accelerators such as the Graphcore Intelligence Processing Unit (IPU) and Intel Habana, as well as from AMD GPUs. In this thesis, we will test relevant applications (BERT and smaller LLMs, both inference and training) on the Simula eX3 experimental hardware platform to gain a better understanding of the attainable performance and to look for hardware-specific optimization opportunities.


The goal of this thesis is to gain a better understanding of which hardware works best for the different workloads, through code optimization and extensive benchmarking.
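To give a rough idea of what such benchmarking involves, the following is a minimal, hardware-agnostic timing sketch and not the project's actual methodology. The names `benchmark`, `toy_workload`, and the `sync` hook are hypothetical and chosen for illustration only. The sketch assumes that a workload can be wrapped in a Python callable, and that on accelerators with asynchronous execution a synchronization function (for example `torch.cuda.synchronize` on NVIDIA GPUs) is passed as `sync` so that the measured time covers the completed device work.

```python
import statistics
import time
from typing import Callable, Optional


def benchmark(step: Callable[[], object],
              items_per_step: int,
              warmup: int = 3,
              repeats: int = 10,
              sync: Optional[Callable[[], None]] = None) -> dict:
    """Time `step` and report latency statistics and throughput.

    `sync` is an optional, assumed hook that blocks until outstanding
    device work has finished (e.g. torch.cuda.synchronize on NVIDIA GPUs).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    def run_once() -> float:
        if sync is not None:
            sync()  # make sure no earlier work is still pending
        start = time.perf_counter()
        step()
        if sync is not None:
            sync()  # wait for the work launched by step()
        return time.perf_counter() - start

    # Warm-up runs absorb one-time costs (compilation, caches, allocation).
    for _ in range(warmup):
        run_once()

    timings = [run_once() for _ in range(repeats)]
    median = statistics.median(timings)
    return {
        "median_s": median,
        "min_s": min(timings),
        "max_s": max(timings),
        # Throughput in items (e.g. sequences) per second, based on the median.
        "items_per_s": items_per_step / median if median > 0 else float("inf"),
    }


def toy_workload() -> int:
    # Hypothetical CPU-only stand-in for one model forward pass.
    return sum(i * i for i in range(50_000))


if __name__ == "__main__":
    print(benchmark(toy_workload, items_per_step=32))
```

Reporting the median over several repeats, after discarding warm-up runs, makes results less sensitive to outliers; comparing hardware fairly would additionally require fixing batch size, sequence length, and numerical precision across devices.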


  • Experience with Python
  • Experience with deep learning software such as PyTorch
  • Experience with GPU programming is helpful
  • Experience with NLP applications is very helpful

Associated contacts

Xing Cai

Professor, Head of department, Chief Research Scientist

Johannes Langguth

Senior Research Scientist