csperkins.org

Who writes Internet standards documents?

The IETF, as a standards development organisation for the Internet, grew out of the original US government-funded project that developed some of the early internetworking protocols. The organisation has long recognised that this history has biased its participants towards "well-funded, American, white, male technicians, demonstrating a distinctive and challenging group dynamic, both in management and in personal interactions", and it is well understood in the IETF that this lack of diversity is problematic for an organisation that tries to develop the Internet for all its end users. The community has made some efforts to encourage attendance by a broader range of people. Has it succeeded in these attempts to diversify?

The figures below show how the fraction of IETF document authors from different countries (left) and continents (right) has changed over time. We see that North America, while still over-represented, is becoming less dominant: 75% of authors were from North America in 2001, but this had declined to 44% by 2020. At the same time, representation of both Europe and Asia has grown, from 17% to 40% and from 6% to 14%, respectively. Africa and South America remain heavily under-represented, with only ~0.5% of authors coming from either continent in 2020.

This data suggests that, if the IETF is to become more geographically representative, further efforts are needed. That said, the declining fraction of authors from the United States, and the increasing proportion of authors in the "other" category of the country plot, are perhaps somewhat encouraging. Authors from a wider range of countries are gradually being represented, although change is slow and there is a long way to go.
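To make the calculation behind these percentages concrete, here is a minimal Python sketch of how per-year continent shares could be computed. It is not the pipeline used to produce the figures: the record layout `(year, author, continent)`, the function name `continent_shares`, and the choice to count each distinct author once per year are all assumptions made for illustration, and the sample records are toy data.

```python
from collections import defaultdict


def continent_shares(records):
    """Return {year: {continent: fraction of that year's distinct authors}}.

    `records` is an iterable of (year, author, continent) tuples; this layout
    is a hypothetical one, not the format of the original dataset.
    """
    # Assumption: an author with several documents in a year is counted once.
    authors = defaultdict(dict)  # year -> {author: continent}
    for year, author, continent in records:
        authors[year][author] = continent

    shares = {}
    for year, by_author in authors.items():
        counts = defaultdict(int)
        for continent in by_author.values():
            counts[continent] += 1
        total = len(by_author)
        shares[year] = {c: n / total for c, n in counts.items()}
    return shares


# Toy data: author "a" appears twice in 2020 but is counted once.
sample = [
    (2020, "a", "North America"),
    (2020, "a", "North America"),
    (2020, "b", "North America"),
    (2020, "c", "Europe"),
    (2020, "d", "Asia"),
]
shares_2020 = continent_shares(sample)[2020]
```

With the toy data above, half of the four distinct authors are attributed to North America and a quarter each to Europe and Asia. Counting documents rather than distinct authors would give different fractions, so the counting rule needs to be stated alongside any such plot.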

Geographic diversity is an important consideration in ensuring that a range of views are represented, but it is by no means the only concern. Another issue is the influence that particular companies might have on the network. The figure below shows the top 10 affiliations by proportion of RFC authors each year, processed to normalise affiliation names, removing common variations in spelling, and amalgamating known subsidiaries and merged companies. For example, Huawei and Futurewei are combined as Huawei, and Sun Microsystems is merged with Oracle.
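As an illustration of the kind of normalisation described above, the following Python sketch strips common corporate suffixes and punctuation, then maps known subsidiaries onto a parent name. It is a sketch under stated assumptions rather than the procedure actually used: the function name `normalise_affiliation`, the suffix list, and the fallback of title-casing unknown names are hypothetical, and the alias table contains only the two mergers named in the text.

```python
import re

# Alias table limited to the mergers mentioned in the text; a real table
# would be much larger and hand-curated.
PARENT = {
    "huawei": "Huawei",
    "futurewei": "Huawei",
    "oracle": "Oracle",
    "sun microsystems": "Oracle",
}

# Hypothetical list of corporate suffixes to drop before the lookup.
SUFFIXES = re.compile(
    r"\b(inc|ltd|llc|corp|corporation|technologies|co)\b\.?",
    re.IGNORECASE,
)


def normalise_affiliation(raw: str) -> str:
    """Map a free-text affiliation string onto a canonical company name."""
    key = SUFFIXES.sub("", raw)                 # drop "Inc.", "Ltd.", ...
    key = re.sub(r"[^\w\s]", " ", key)          # punctuation -> spaces
    key = re.sub(r"\s+", " ", key).strip().lower()
    # Crude fallback for names not in the table: title-case the cleaned key.
    return PARENT.get(key, key.title())


names = [
    normalise_affiliation("Futurewei Technologies, Inc."),
    normalise_affiliation("Huawei Technologies Co., Ltd."),
    normalise_affiliation("Sun Microsystems, Inc."),
    normalise_affiliation("Cisco Systems"),
]
```

Here the first two strings both resolve to Huawei and the third to Oracle, while the fourth is not in the table and is passed through unchanged apart from cleaning. A rule-based approach like this is only as good as its alias table, which is why the text describes amalgamating known subsidiaries.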

We observe several interesting trends. First, Cisco remains a consistent employer of IETF contributors: it has been the single largest affiliation across all years in the dataset, with around 12% of authors affiliated with the company in 2020. We can also see the rise of Huawei beginning in 2005: the company peaked at 9.7% of authors in 2018 and accounted for 7.1% in 2020. Google has a similar trajectory, first appearing in the dataset in 2006 and accounting for 3.8% of authors in 2020.

We also observe the decline of a number of affiliations. Microsoft and Nokia, having peaked with 3.3% and 3.6% of authors, had 0.7% and 1.7% of authors in 2020, respectively, with the absolute number of authors from both companies also declining.

That we are able to observe changing trends in author affiliations, and in particular, that participants from new companies contribute, indicates that the IETF is somewhat open to input from new entrants to the market, although it is clearly heavily dominated by large technology companies.

While we do not demonstrate any causal link, it is encouraging that commercially successful companies opt to enable their employees to actively participate in the IETF. Care is needed to ensure that this relevance is maintained, however: we observe that the author pool has grown less diverse in terms of companies that are represented. 35.4% of authors came from the top 10 affiliations in the dataset in 2020, compared with 25.6% in 2001.
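The concentration figure quoted above (the share of authors belonging to the ten largest affiliations) is a simple top-k ratio. The Python sketch below shows one way such a ratio could be computed for a single year. The function name `top_k_share` and the input format, one normalised affiliation string per author, are assumptions for illustration, and the sample list is toy data rather than anything drawn from the dataset.

```python
from collections import Counter


def top_k_share(affiliations, k=10):
    """Fraction of authors whose affiliation is among the k most common.

    `affiliations` holds one (already normalised) affiliation per author
    for a single year; this input format is a hypothetical one.
    """
    counts = Counter(affiliations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Toy year with ten authors across three affiliations.
toy_year = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
share = top_k_share(toy_year, k=2)  # (5 + 3) / 10
```

Comparing this ratio across years is what allows a statement such as the rise from 25.6% to 35.4%. Note that when several affiliations tie at the cut-off, which ones count as the top k is arbitrary, although the resulting share is the same.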

In addition to corporate participation, the IETF has always had a significant fraction of attendees from academia. Indeed, the figure above shows that the fraction of academic authors has actually increased over time, growing from 8% of authors in 2001 to ~14% in 2020, having peaked at ~17% in 2009. The number of consultants, not shown as a separate category in the figure above, has remained stable, accounting for 2% of authors in 2020.

The figure above breaks down the affiliations of the academics, showing changes in the top 10 academic affiliations over time. In general, academic affiliations are each typically held by a small number of authors. We can see a number of trends in academic authorship, with fewer authors from Columbia University, MIT, and ISI in recent years, and the rise of Tsinghua University and University Carlos III of Madrid.

Finally, we show below the percentage of authors in each year that have not previously authored an RFC. Given that the database we use begins in 2001, 100% of authors are new in that year. The more stable trend in recent years likely highlights the churn in RFC authorship, with around 30% of authors each year having never previously authored an RFC. This shows at least some openness to new participants.
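The new-author measure can be sketched in a few lines of Python, which also makes clear why the first year of any dataset necessarily shows 100% new authors: nobody has been seen before. The function name `new_author_fraction` and the `{year: [author, ...]}` input format are assumptions for illustration, not the structure of the database used here, and the sample is toy data.

```python
def new_author_fraction(authors_by_year):
    """Return {year: fraction of that year's authors not seen in earlier years}.

    `authors_by_year` maps a year to an iterable of author identifiers;
    this layout is hypothetical.
    """
    seen = set()
    result = {}
    for year in sorted(authors_by_year):
        current = set(authors_by_year[year])
        if current:
            result[year] = len(current - seen) / len(current)
        # Everyone active this year counts as "previously seen" from now on.
        seen |= current
    return result


toy = {
    2001: ["a", "b"],
    2002: ["a", "c"],
    2003: ["a", "b", "c", "d"],
}
fractions = new_author_fraction(toy)
```

On the toy data the fraction falls from 1.0 in the first year to 0.5 and then 0.25, purely because the pool of known authors grows. The early years of such a plot therefore overstate novelty, and only the later, stable portion is informative.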

Other aspects of diversity are important, but harder to measure. The IETF has recently started to collect demographic data about attendees, and has made a snapshot of demographic data available as part of the results of a recent community survey, but there is no long-term record of gender diversity, age distribution of participants, English language ability, and so on. We have attempted to estimate historical gender diversity, but tools to infer gender based on participant names are not accurate, so it is difficult to draw meaningful comparisons, although we are continuing to work on this problem. As the IETF collects more data, trends will hopefully become visible over time.

To learn more about this topic, please see our paper Characterising the IETF Through the Lens of RFC Deployment, presented at the ACM Internet Measurement Conference on 2nd November 2021.