Stephen McQuistin, Peter Snyder, Colin Perkins, Hamed Haddadi, and Gareth Tyson
Proceedings of the ACM Internet Measurement Conference,
October 2023.
DOI:10.1145/3618257.3624836
The public suffix list is a community-maintained list of rules that can
be applied to domain names to determine how they should be grouped into
logical organizations or companies. We present the first large-scale
measurement study of how the public suffix list is used by open-source
software on the Web and the privacy harm resulting from projects using
outdated versions of the list. We measure how often developers include
out-of-date versions of the public suffix list in their projects and how
old the included lists are, and we estimate the real-world privacy harm
with a model based on a large-scale crawl of the Web. We find that incorrect
use of the public suffix list is common in open-source software, and
that at least 43 open-source projects use hard-coded, outdated versions
of the public suffix list. These include popular, security-focused
projects, such as password managers and digital forensics tools.
Extrapolating from data gathered by the HTTP Archive project, we also
estimate that, because of these out-of-date lists, these projects make
incorrect privacy decisions for 1,313 effective top-level domains
(eTLDs), affecting 50,750 domains.
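
As an illustration of why an outdated list matters, the following is a
minimal sketch of public-suffix matching. It is not the paper's tooling:
the function name registrable_domain and both rule sets are hypothetical,
and the rules are a tiny hand-picked sample rather than the real list. A
hostname's registrable domain (eTLD+1) is taken to be the longest matching
suffix plus one more label.

```python
def registrable_domain(hostname, rules):
    """Return the eTLD+1 of hostname under the given rules, or None.

    Simplified sketch: exact rules, "*." wildcard rules, and "!" exception
    rules are supported; internationalized names are not handled.
    """
    labels = hostname.lower().rstrip(".").split(".")
    suffix_len = 1  # implicit default rule "*": the last label is a suffix
    for i in range(len(labels)):
        candidate = labels[i:]
        exact = ".".join(candidate)
        wildcard = ".".join(["*"] + candidate[1:])
        if "!" + exact in rules:
            # An exception rule wins; the suffix is the rule minus its
            # leftmost label.
            suffix_len = len(candidate) - 1
            break
        if exact in rules or wildcard in rules:
            suffix_len = max(suffix_len, len(candidate))
    if len(labels) <= suffix_len:
        return None  # the hostname is itself a public suffix
    return ".".join(labels[-(suffix_len + 1):])


# Hypothetical sample rules; the real list is far larger.
current_rules = {"com", "io", "uk", "co.uk", "github.io", "*.ck", "!www.ck"}
# The same sample with one rule missing, standing in for an outdated copy.
outdated_rules = current_rules - {"github.io"}

for host in ("alice.github.io", "bob.github.io"):
    print(host, registrable_domain(host, current_rules),
          registrable_domain(host, outdated_rules))
```

Under the sample "current" rules the two hosts are treated as separate
sites, while under the "outdated" rules both collapse to the same
registrable domain, so software relying on that copy would group
unrelated parties together.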
Download: mcquistin2023psl.pdf