Cartography of the Reddit landscape using graph-based methods

The data from the online social network Reddit can be accessed in its entirety. We have developed a framework that calculates weighted relationships between subreddits. This thesis will examine the resulting graph of relationships between all subreddits, divided into time slices.

Reddit is organized into topical sub-communities called subreddits. These communities are primarily user-generated and user-moderated, and they come in various sizes. Subreddits are independent, meaning Reddit offers no direct mechanism to link or group them. A user subscribing to a subreddit indicates that the user has an interest in that topic. However, Reddit does not disclose which users subscribe to which subreddits. What is visible is which users commented or posted (contributed) in which subreddit. We believe that these expressions of interest, combined with the interactions in specific subreddits, make it possible to observe how topics evolve over time.
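To make the idea concrete, the following is a minimal sketch of how contribution records could be turned into one weighted subreddit graph per time slice. It is not the framework mentioned above. The weighting used here (the number of users who contributed to both subreddits within the same slice) is an assumption chosen for illustration, and the function name `build_slice_graphs`, the record layout, and the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_slice_graphs(contributions, slice_seconds):
    """Return {slice_index: {(sub_a, sub_b): shared_user_count}}.

    contributions: iterable of (user, subreddit, unix_timestamp) records.
    Assumed weighting: number of users active in both subreddits in a slice.
    """
    # users_by_slice[slice][subreddit] -> set of users who contributed there
    users_by_slice = defaultdict(lambda: defaultdict(set))
    for user, subreddit, timestamp in contributions:
        users_by_slice[timestamp // slice_seconds][subreddit].add(user)

    graphs = {}
    for slice_index, users_by_sub in users_by_slice.items():
        edges = {}
        # Compare every unordered pair of subreddits seen in this slice.
        for sub_a, sub_b in combinations(sorted(users_by_sub), 2):
            shared = len(users_by_sub[sub_a] & users_by_sub[sub_b])
            if shared:  # keep only pairs that share at least one user
                edges[(sub_a, sub_b)] = shared
        graphs[slice_index] = edges
    return graphs


# Hypothetical toy data: (user, subreddit, unix timestamp)
contributions = [
    ("alice", "python", 10),
    ("alice", "datascience", 20),
    ("bob", "python", 30),
    ("bob", "datascience", 40),
    ("carol", "python", 50),
    ("alice", "python", 110),
    ("alice", "networks", 120),
]
graphs = build_slice_graphs(contributions, slice_seconds=100)
```

Comparing the edge weights of consecutive slices then gives a simple view of how the relationship between two subreddits changes over time. The pairwise comparison is quadratic in the number of subreddits per slice, so a sketch like this only suits small samples.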


Graph-based analysis of temporal graphs based on Reddit interaction data

Learning outcome

AI & machine learning

Big Data Analysis

Complex Network Analysis


You should be open-minded and able to work in an international team of researchers from different institutions. You should have machine learning skills or at least be very interested in machine learning. Moreover, you should have the ability to plan your work schedule independently. However, the most important requirement is motivation.


  • Johannes Langguth
  • Daniel Thilo Schroeder
  • Prof. Pedro Lind

Contact person