Privacy risk assessment in large-scale measurement datasets

Evaluate end-user and business privacy in public Internet measurement datasets.
Master

The project focuses on assessing both end-user and business privacy in Internet measurement data collected through active measurements. Recent studies have shown that the interplay between the Internet and local networks can be analyzed to reveal private data and facilitate tracking [1, 2, 3]. One such study investigated how DNS interacts with DHCP and found that some of the data exchanged can be exposed by reverse DNS (rDNS) queries. Dynamic Host Configuration Protocol (DHCP) is a client/server protocol that automatically assigns IP addresses to devices within a network; the client retains its address for a limited period of time. A reverse DNS lookup is a query for the domain names (text that maps to an IP address) associated with a given IP address. Thus, running such queries across the IPv4 address space can reveal the hostnames mapped to the queried IPs. Linking this information with DHCP client data can then make it possible to track end-users.
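To make the reverse DNS step concrete, the following minimal Python sketch shows how a PTR query name is derived from an IPv4 address and how a single lookup could be issued with the standard library. The helper names (ptr_name, reverse_lookup, looks_like_client_device) and the keyword list are illustrative assumptions, not part of the cited studies; a real measurement would need a vetted classification method and a dedicated scanning setup.

```python
import ipaddress
import re
import socket
from typing import Optional

# Hypothetical keyword list for illustration only; it is not taken from
# the cited studies and would need careful validation in practice.
DEVICE_HINTS = re.compile(r"iphone|ipad|macbook|android|laptop|desktop", re.IGNORECASE)


def ptr_name(ip: str) -> str:
    """Return the name queried in a reverse DNS (PTR) lookup for `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str) -> Optional[str]:
    """Resolve `ip` to a hostname via the system resolver, or None on failure."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # Covers socket.herror and socket.gaierror (no PTR record, timeout, ...).
        return None


def looks_like_client_device(hostname: str) -> bool:
    """Rough heuristic: does the hostname contain a device-type keyword?"""
    return DEVICE_HINTS.search(hostname) is not None


if __name__ == "__main__":
    # 192.0.2.0/24 is reserved for documentation, so no real host is probed here.
    print(ptr_name("192.0.2.10"))
    print(looks_like_client_device("brians-iphone.example.net"))
```

Only ptr_name and looks_like_client_device run offline; reverse_lookup requires network access and should only be used against address ranges you are permitted to measure.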

Goal

Study how privacy is preserved within and across large-scale measurement datasets, and propose methods for improving privacy.

Learning outcome

  • Better understanding of IP routing and Internet architecture
  • Better understanding of data privacy
  • Opportunity to run a large-scale real-world experiment

Qualifications

  • General understanding of IP networks
  • Interest in security
  • Analytical and programming skills

Supervisors

  • Ioana Alexandrina Livadariu
  • Alfred Arouna

References

[1] van der Toorn et al., “Saving Brian’s Privacy: the Perils of Privacy Exposure through Reverse DNS”, ACM Internet Measurement Conference 2022.
[2] Imana et al., “Institutional Privacy Risks in Sharing DNS Data”, Applied Networking Research Workshop 2021.
[3] Mohaisen et al., “Leakage of .onion at the DNS root: Measurements, causes, and countermeasures”, IEEE/ACM Transactions on Networking 25, 5 (Oct 2017), 3059–3072.