Authors: K. Pogorelov, D. T. Schroeder, P. Filkukova and J. Langguth
Title: A System for High Performance Mining on GDELT Data
Affiliation: Scientific Computing
Project(s): UMOD: Understanding and Monitoring Digital Wildfires
Status: Published
Publication Type: Proceedings, refereed
Year of Publication: 2020
Conference Name: 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Date Published: 05/2020
Publisher: IEEE
Keywords: Data mining, GDELT, High Performance Computing, Misinformation, Publishing
Abstract

We design a system for efficient in-memory analysis of data from the GDELT database of news events. The specialization of the system allows us to avoid the inefficiencies of existing alternatives and make full use of modern parallel high-performance computing hardware. We then present a series of experiments showcasing the system’s ability to analyze correlations in the entire GDELT 2.0 database containing more than a billion news items. The results reveal large-scale trends in the world of today’s online news.

Citation Key: 27396
