Automatic classification of sport news into different categories using state of the art Natural Language Processing (NLP). The project is building upon an existing dataset of Norwegian soccer news.
Natural language processing recently got a great performance boost by the so called BERT model. BERT (Bidirectional Encoder Representations from Transformers) is the current state of the art in NLP and performs very well in almost all tasks within the research community. The Norwegian language has been not very well studied in terms of NLP yet and research in this direction is needed.
The goal of this master project is to built up on an existing soccer news dataset and the BERT model to generate a model than can process Norweigan news. The output of the model should be flexible ranging from multi-class classification to regression. A classification model could be used to classify new into different categories, whereas a regression model could, for example, determine the importance of and article.
- Describe what the master student will learn.
- Work with complex deep neural network models and state of the art NLP
- Python programming
- Knowledge about machine learning is an advantage
- Pål Halvorsen
- Michael Riegler
- Steven Hicks