Who writes Internet standards documents?
4 November 2021
The IETF, as a standards development organisation for the Internet,
grew out of the original US government-funded project that developed
some of the early internetworking protocols. The organisation has long
recognised that this history has biased its participants towards
"well-funded, American, white, male technicians, demonstrating a
distinctive and challenging group dynamic, both in management and in
personal interactions", and it is well understood in the IETF that this
lack of diversity is problematic for an organisation that tries to
develop the Internet for all its end users. The community has made some
efforts to encourage attendance by a broader range of people. Has it
succeeded in these attempts to diversify?
The figures below show how the fraction of IETF document authors from
different countries (left) and continents (right) has changed over
time. We see that North America, while still over-represented, is
becoming less dominant. 75% of authors were from North America in 2001,
but this has declined to 44% in 2020. At the same time, representation
of both Europe and Asia has grown, from 17% to 40% and from 6% to 14%,
respectively. Africa and South America remain heavily under-represented,
with only ~0.5% of authors coming from either continent in 2020.
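As a rough illustration of how per-year shares like these could be computed, the sketch below counts each author once per year and divides by the yearly total. The record layout, the function name `share_by_region`, and the once-per-year counting rule are assumptions made for this example, not necessarily the method used to produce the figures.

```python
from collections import Counter, defaultdict


def share_by_region(records):
    """Return {year: {region: fraction of that year's authors}}.

    `records` is an iterable of (year, author_id, region) tuples
    (a hypothetical layout). An author who appears several times in
    the same year is counted once for that year.
    """
    seen = set()                   # (year, author_id) pairs already counted
    counts = defaultdict(Counter)  # year -> Counter of regions
    for year, author_id, region in records:
        if (year, author_id) in seen:
            continue
        seen.add((year, author_id))
        counts[year][region] += 1

    shares = {}
    for year, regions in counts.items():
        total = sum(regions.values())
        shares[year] = {region: n / total for region, n in regions.items()}
    return shares
```

Counting each author once per year means that a prolific author does not inflate their region's share; counting per document instead would answer a different question.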
This data suggests that, if the IETF is to become more geographically
representative, further efforts are needed. That said, the declining
fraction of authors from the United States, and the increasing
proportion of authors in the "other" category of the country plot, are
perhaps somewhat encouraging. Authors from a wider range of countries
are gradually being represented, although change is slow, and
there is a long way to go.
Geographic diversity is an important consideration in ensuring that a
range of views is represented, but it is by no means the only concern.
Another issue is the influence that particular companies might have on
the network.
The figure below shows the top 10 affiliations by proportion of RFC
authors each year, processed to normalise affiliation names, removing
common variations in spelling, and amalgamating known subsidiaries and
merged companies. For example, Huawei and Futurewei are combined as
Huawei, and Sun Microsystems is merged with Oracle.
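The normalisation step described above can be sketched roughly as follows. This is a minimal illustration and not the pipeline behind the figure: the alias table, the list of corporate suffixes, and the function name `normalise_affiliation` are all assumptions, and a real table would be far larger.

```python
import re

# Hypothetical alias table: canonical name -> spelling variants,
# subsidiaries, and merged companies (lower-case, suffixes removed).
ALIASES = {
    "Huawei": ["huawei", "huawei technologies", "futurewei",
               "futurewei technologies"],
    "Oracle": ["oracle", "sun microsystems"],
    "Cisco": ["cisco", "cisco systems"],
}

# Reverse lookup: cleaned variant -> canonical name.
LOOKUP = {
    variant: canonical
    for canonical, variants in ALIASES.items()
    for variant in variants
}

# Illustrative corporate suffixes to strip before the lookup.
SUFFIXES = re.compile(
    r"\b(?:inc|ltd|llc|corp|corporation|co|gmbh)\b\.?", re.IGNORECASE
)


def normalise_affiliation(raw):
    """Map a raw affiliation string to a canonical name where one is known."""
    cleaned = SUFFIXES.sub("", raw)              # drop "Inc.", "Ltd", ...
    cleaned = re.sub(r"[^\w\s]", " ", cleaned)   # punctuation -> space
    cleaned = re.sub(r"\s+", " ", cleaned).strip().lower()
    # Unknown affiliations are returned unchanged apart from trimming.
    return LOOKUP.get(cleaned, raw.strip())
```

Keeping the aliases in an explicit table, rather than relying on fuzzy string matching, makes each merge decision (such as treating Futurewei as Huawei) visible and easy to review.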
We observe several interesting trends. First, Cisco remains a consistent
employer of IETF contributors: around 12% of authors were affiliated
with the company in 2020, and it has been the single largest affiliation
across all years in the dataset. We can also see the rise of Huawei
beginning in 2005: 7.1% of authors were affiliated with the company in
2020, down from a peak of 9.7% in 2018. Google has a similar
trajectory, first appearing in the dataset in 2006, with 3.8% of authors
affiliated with it in 2020.
We also observe the decline of a number of affiliations. Microsoft and
Nokia, having peaked with 3.3% and 3.6% of authors, had 0.7% and 1.7%
of authors in 2020, respectively, with the absolute number of authors
from both companies also declining.
That we are able to observe changing trends in author affiliations, and
in particular, that participants from new companies contribute, indicates
that the IETF is somewhat open to input from new entrants to the market,
although it is clearly heavily dominated by large technology companies.
While we do not demonstrate any causal link, it is encouraging that
commercially successful companies opt to enable their employees to
actively participate in the IETF. Care is needed to ensure that this
relevance is maintained, however: we observe that the author pool has
grown less diverse in terms of companies that are represented. 35.4% of
authors came from the top 10 affiliations in the dataset in 2020,
compared with 25.6% in 2001.
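A concentration figure of this kind can be computed along the following lines. The sketch assumes one (already normalised) affiliation string per author for a given year and takes the top 10 within that year; whether the reported numbers use a per-year or a dataset-wide top 10 is not stated here, so treat the function `top_k_share` as an illustration only.

```python
from collections import Counter


def top_k_share(affiliations, k=10):
    """Fraction of authors whose affiliation is among the k most common.

    `affiliations` holds one affiliation string per author for a single
    year (a hypothetical input layout).
    """
    counts = Counter(affiliations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = counts.most_common(k)  # [(affiliation, author_count), ...]
    return sum(n for _, n in top) / total
```

A rising value over time indicates that a larger proportion of authors is concentrated in the largest affiliations.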
In addition to corporate participation, the IETF has always had a
significant fraction of attendees from academia. Indeed, the figure
above shows that the fraction of academic authors has actually
increased over time, growing from 8% of authors in 2001 to ~14% in
2020, having peaked at ~17% in 2009. The number of consultants, not
shown as a separate category in the figure above, has remained stable,
accounting for 2% of authors in 2020.
The figure above breaks down the affiliations of the academics, showing
changes in the top 10 academic affiliations over time. In general,
academic affiliations are each typically held by a small number of
authors. We can see a number of trends in academic authorship, with
fewer authors from Columbia University, MIT, and ISI in recent years,
and the rise of Tsinghua University and University Carlos III of
Madrid.
Finally, we show below the percentage of authors in each year that have
not previously authored an RFC. Given that the database we use begins
in 2001, 100% of authors are new in that year. The more stable trend in
recent years likely highlights the churn in RFC authorship, with around
30% of authors each year having never previously authored an RFC. This
shows at least some openness to new participants.
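The first-time author measure can be sketched as below; it also shows why the first year of any dataset necessarily reports 100% new authors. The input layout (a mapping from year to author identifiers) and the function name `first_time_author_share` are assumptions for illustration.

```python
def first_time_author_share(authors_by_year):
    """Return {year: fraction of that year's authors not seen in earlier years}.

    `authors_by_year` maps a year to an iterable of author identifiers
    (a hypothetical layout). Years are processed in ascending order.
    """
    seen = set()  # every author observed in an earlier year
    shares = {}
    for year in sorted(authors_by_year):
        authors = set(authors_by_year[year])
        if authors:
            new_authors = authors - seen
            shares[year] = len(new_authors) / len(authors)
        seen |= authors
    return shares
```

Because `seen` starts empty, every author in the earliest year counts as new, and the values for the next few years are likely to be inflated for the same reason until the set of known authors fills in.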
Other aspects of diversity are important, but harder to measure. The
IETF has recently started to collect demographic data about attendees,
and has made a snapshot of demographic data available as part of the
results of a recent community survey, but there is no long-term record
of gender diversity, age distribution of participants, English language
ability, and so on. We have attempted to estimate historical gender
diversity, but tools to infer gender based on participant names are not
accurate, so it is difficult to draw meaningful comparisons, although we
are continuing to work on this problem. As the IETF collects more data,
trends will hopefully become visible over time.
To learn more about this topic, please see our paper, "Characterising
the IETF Through the Lens of RFC Deployment", presented at the ACM
Internet Measurement Conference on 2nd November 2021.