draft-iab-aid-workshop-00.txt   draft-iab-aid-workshop-01.txt 
Network Working Group N. ten Oever Network Working Group N. ten Oever
Internet-Draft University of Amsterdam Internet-Draft University of Amsterdam
Intended status: Informational C. Cath Intended status: Informational C. Cath
Expires: 25 August 2022 Expires: 1 December 2022
M. Kühlewind M. Kühlewind
Ericsson Ericsson
C. S. Perkins C. S. Perkins
University of Glasgow University of Glasgow
21 February 2022 30 May 2022
Report from the IAB Workshop on Analyzing IETF Data (AID), 2021 Report from the IAB Workshop on Analyzing IETF Data (AID), 2021
draft-iab-aid-workshop-00 draft-iab-aid-workshop-01
Abstract Abstract
The 'Show me the numbers: Workshop on Analyzing IETF Data (AID)' was The 'Show me the numbers: Workshop on Analyzing IETF Data (AID)' was
convened by the Internet Architecture Board (IAB) from November 29 to convened by the Internet Architecture Board (IAB) from November 29 to
December 2 and hosted by the IN-SIGHT.it project at the University of December 2 and hosted by the IN-SIGHT.it project at the University of
Amsterdam, however, converted to an online only event. The workshop Amsterdam, however, converted to an online only event. The workshop
was conducted based on two discussion parts and a hackathon activity was conducted based on two discussion parts and a hackathon activity
in between. This report summarizes the workshop's discussion and in between. This report summarizes the workshop's discussion and
identifies topics that warrant future work and consideration. identifies topics that warrant future work and consideration.
skipping to change at page 2, line 10 skipping to change at page 2, line 10
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 25 August 2022. This Internet-Draft will expire on 1 December 2022.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 2, line 47 skipping to change at page 2, line 47
4.1. Tools, data, and methods . . . . . . . . . . . . . . . . 8 4.1. Tools, data, and methods . . . . . . . . . . . . . . . . 8
4.2. Observations on affiliation and industry control . . . . 8 4.2. Observations on affiliation and industry control . . . . 8
4.3. Community and diversity . . . . . . . . . . . . . . . . . 9 4.3. Community and diversity . . . . . . . . . . . . . . . . . 9
4.4. Publications, process, and decision-making . . . . . . . 9 4.4. Publications, process, and decision-making . . . . . . . 9
4.5. Environmental Sustainability . . . . . . . . . . . . . . 10 4.5. Environmental Sustainability . . . . . . . . . . . . . . 10
5. Workshop participants . . . . . . . . . . . . . . . . . . . . 10 5. Workshop participants . . . . . . . . . . . . . . . . . . . . 10
6. Program Committee . . . . . . . . . . . . . . . . . . . . . . 10 6. Program Committee . . . . . . . . . . . . . . . . . . . . . . 10
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10
8. Annexes . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 8. Annexes . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
8.1. Annex 1 - Data Taxonomy . . . . . . . . . . . . . . . . . 11 8.1. Annex 1 - Data Taxonomy . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13
1. Introduction 1. Introduction
The IETF, as an international Standards Developing Organization The IETF, as an international Standards Developing Organization
(SDO), hosts a diverse set of data including on the organization's (SDO), hosts a diverse set of data including on the organization's
history, development, and current standardization activities, history, development, and current standardization activities,
including of Internet protocols and its institutions. A large including of Internet protocols and its institutions. A large
portion of this data is publicly available, yet it is underutilized portion of this data is publicly available, yet it is underutilized
as a tool to inform the work in the IETF proper or the broader as a tool to inform the work in the IETF proper or the broader
research community focused on topics like Internet governance and research community focused on topics like Internet governance and
skipping to change at page 3, line 37 skipping to change at page 3, line 37
The workshop was organized with two all-group discussion slots at the The workshop was organized with two all-group discussion slots at the
beginning and the end of the workshop. In between the workshop beginning and the end of the workshop. In between the workshop
participants organized hackathon activities, based on topics participants organized hackathon activities, based on topics
identified during the initial discussion and submitted position identified during the initial discussion and submitted position
papers. The following topic areas have been identified and discussed. papers. The following topic areas have been identified and discussed.
2.1. Tools, data, and methods 2.1. Tools, data, and methods
The IETF holds a wide range of data sources. The main ones used are The IETF holds a wide range of data sources. The main ones used are
the mailinglist archives, RFCs, and the datatracker. The latter the mailinglist archives (https://mailarchive.ietf.org/arch/), RFCs
provides information on participants, authors, meeting proceedings, (https://www.ietf.org/standards/rfcs/), and the datatracker
minutes and more. Furthermore there are statistics for the IETF (https://datatracker.ietf.org/). The latter provides information on
websites, working group Github repositories, IETF survey data and participants, authors, meeting proceedings, minutes and more
there was discussion about the utility of download statistics for the (https://notes.ietf.org/iab-aid-datatracker-database-overview#).
RFCs itself from different repos. Furthermore there are statistics for the IETF websites
(https://www.ietf.org/policies/web-analytics/), working group Github
repositories, IETF survey data (https://www.ietf.org/blog/ietf-
community-survey-2021/) and there was discussion about the utility of
download statistics for the RFCs itself from different repos.
There are a wide range of tools to analyze this data, produced by There are a wide range of tools to analyze this data, produced by
IETF participants or researchers interested in the work of the IETF participants or researchers interested in the work of the
IETF. Two projects that presented their work at the workshop were IETF. Two projects that presented their work at the workshop were
BigBang (https://bigbang-py.readthedocs.io/en/latest/) and BigBang (https://bigbang-py.readthedocs.io/en/latest/) and
Sodestream's IETFdata (https://github.com/glasgow-ipl/ietfdata) Sodestream's IETFdata (https://github.com/glasgow-ipl/ietfdata)
library; the RFC Prolog Database was described in a submitted paper library; the RFC Prolog Database was described in a submitted paper
(see Section 4 below). These projects could be used to add (see Section 4 below). These projects could be used to add
additional insights to the existing IETF statistics additional insights to the existing IETF statistics
(https://www.arkko.com/tools/docstats.html) page and the datatracker (https://www.arkko.com/tools/docstats.html) page and the datatracker
statistics (https://datatracker.ietf.org/stats/), e.g., related to statistics (https://datatracker.ietf.org/stats/), e.g., related to
gender questions, however, privacy issues and implications of making gender questions, however, privacy issues and implications of making
such data publicly available were discussed as well. such data publicly available were discussed as well.
The datatracker itself is a community tool that welcomes The datatracker itself is a community tool that welcomes
contributions, e.g. for additions to the existing interfaces or the contributions, e.g. for additions to the existing interfaces or the
statistics page directly (see https://notes.ietf.org/iab-aid- statistics page directly (see https://notes.ietf.org/iab-aid-
datatracker-database-overview (https://notes.ietf.org/iab-aid- datatracker-database-overview (https://notes.ietf.org/iab-aid-
datatracker-database-overview). Instructions how to set up aa local datatracker-database-overview)). Instructions how to set up a local
development environment can be found, at the time of the workshop, at development environment can be found, at the time of the workshop, at
https://notes.ietf.org/iab-aid-data-resources https://notes.ietf.org/iab-aid-data-resources
(https://notes.ietf.org/iab-aid-data-resources). Questions or any (https://notes.ietf.org/iab-aid-data-resources). Questions or any
discussion can be issued to tools-discuss@ietf.org. discussion can be issued to tools-discuss@ietf.org.
2.2. Observations on affiliation and industry control 2.2. Observations on affiliation and industry control
A large portion of the submitted position papers indicated interest A large portion of the submitted position papers indicated interest
in researching questions about industry control in the in researching questions about industry control in the
standardization process (vs. individual contributions in personal standardization process (vs. individual contributions in personal
skipping to change at page 4, line 40 skipping to change at page 4, line 44
Discussions about the analysis of IETF data shows that affiliation Discussions about the analysis of IETF data shows that affiliation
dynamics are hard to capture, due to the specifics of how the data is dynamics are hard to capture, due to the specifics of how the data is
entered but also because of larger social dynamics. On the side of entered but also because of larger social dynamics. On the side of
IETF data capture, affiliation is an open text field, which causes IETF data capture, affiliation is an open text field, which causes
people to write their affiliation down in different ways people to write their affiliation down in different ways
(capitalization, spacing, word separation, etc.). A common data format (capitalization, spacing, word separation, etc.). A common data format
could contribute to analyses that compare SDO performance and could contribute to analyses that compare SDO performance and
behavior of actors inside and across standards bodies. To help this behavior of actors inside and across standards bodies. To help this
a draft data model has been developed during hackathon portion of the a draft data model has been developed during hackathon portion of the
workshop which can found under [Annex A]. workshop which can found as Annex 1 - Data Taxonomy.
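As an illustration of the free-text affiliation problem described above, the following is a minimal Python sketch and is not part of either draft revision. The function names, the sample strings, and the list of legal-form suffixes are all assumptions made for this example; a real analysis would need a curated mapping.

```python
import re
from collections import Counter

# Hypothetical list of legal-form suffixes to drop (an assumption for this sketch).
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "gmbh", "corp", "corporation", "co"}


def normalize_affiliation(raw: str) -> str:
    """Collapse capitalization, spacing, and punctuation variants of one name."""
    text = raw.casefold()                 # ignore capitalization differences
    text = re.sub(r"[^\w\s]", " ", text)  # treat punctuation as a word separator
    tokens = [t for t in text.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)               # single spaces between remaining words


def count_affiliations(raw_values):
    """Count entries per normalized affiliation string."""
    return Counter(normalize_affiliation(v) for v in raw_values)


if __name__ == "__main__":
    # Made-up sample values, not taken from IETF data.
    samples = ["Example  Networks, Inc.", "EXAMPLE networks", "example-networks"]
    print(count_affiliations(samples))
```

A sketch like this only handles surface variation. Names written without any separator (for example "ExampleNetworks") and changes caused by mergers or subsidiaries would still need hand-curated data.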
Furthermore, there is the issue of mergers and acquisitions and Furthermore, there is the issue of mergers and acquisitions and
subsidiary companies. There is no authoritative exogenous source of subsidiary companies. There is no authoritative exogenous source of
variation for affiliation changes, so hand-collected and curated data variation for affiliation changes, so hand-collected and curated data
is used to analyze changes in affiliation over time. While this is used to analyze changes in affiliation over time. While this
approach is imperfect, conclusions can be drawn from the data. For approach is imperfect, conclusions can be drawn from the data. For
example, in the case of mergers or acquisition where a small example, in the case of mergers or acquisition where a small
organization joins a large organization, this results in a organization joins a large organization, this results in a
statistically significant increase in the likelihood of an individual statistically significant increase in the likelihood of an individual
being put in a working group chair position BaronKanevskaia being put in a working group chair position (see BaronKanevskaia
(https://www.iab.org/wp-content/IAB-uploads/2021/11/Baron.pdf) (https://www.iab.org/wp-content/IAB-uploads/2021/11/Baron.pdf)).
2.3. Community and diversity 2.3. Community and diversity
High interest from the workshop participants was also on using High interest from the workshop participants was also on using
existing data to better understand who the current IETF community is, existing data to better understand who the current IETF community is,
especially in terms of diversity, and how to potentially increase especially in terms of diversity, and how to potentially increase
diversity and thereby inclusivity, e.g. understanding if there are diversity and thereby inclusivity, e.g. understanding if there are
certain groups or lists that "drive people away" and why. certain groups or lists that "drive people away" and why.
Inclusivity and transparency about the standardization process are Inclusivity and transparency about the standardization process are
generally important to keep the Internet and its development process generally important to keep the Internet and its development process
skipping to change at page 6, line 9 skipping to change at page 6, line 9
The human element of the community and diversity was stressed, in The human element of the community and diversity was stressed, in
order to understand the IETF community's diversity it is important to order to understand the IETF community's diversity it is important to
talk to people (beyond text analysis) and in order to ensure talk to people (beyond text analysis) and in order to ensure
inclusivity individual participants must make an effort to, as one inclusivity individual participants must make an effort to, as one
participant recounted, tell them their participation is valuable. participant recounted, tell them their participation is valuable.
2.4. Publications, process, and decision-making 2.4. Publications, process, and decision-making
A number of submissions focussed on the RFC publication process, on A number of submissions focussed on the RFC publication process, on
the development of standards and other RFCs in the IETF, and on how the development of standards and other RFCs in the IETF, and on how
the IETF make decisions. This included work on both technical the IETF makes decisions. This included work on both technical
decisions about the content of the standards, but also procedural and decisions about the content of the standards, but also procedural and
process decisions, and questions around how we can understand, model, process decisions, and questions around how we can understand, model,
and perhaps improve the standards process. Some of the work and perhaps improve the standards process. Some of the work
considered what makes a successful RFC, how are RFCs used and considered what makes a successful RFC, how are RFCs used and
referenced, and about what can we learn by studying the RFCs, drafts, referenced, and what we can learn about importance of a topic by
and email discussion. studying the RFCs, drafts, and email discussion.
There were three sets of questions to consider in this area. The There were three sets of questions to consider in this area. The
first related to success and failure of standards, and considered first related to success and failure of standards, and considered
what makes a successful/good RFC? What makes the process of RFC what makes a successful/good RFC? What makes the process of RFC
making successful? And how are RFCs used and referenced once making successful? And how are RFCs used and referenced once
published? Discussion considered how to better understand the path published? Discussion considered how to better understand the path
from an internet draft to an RFC, to see if there are specific from an internet draft to an RFC, to see if there are specific
factors lead to successful development of a draft into an RFC. factors that lead to successful development of a draft into an RFC.
Participants explored the extent to which this depends on the Participants explored the extent to which this depends on the
seniority and experience of the authors, on the topic and IETF area, seniority and experience of the authors, on the topic and IETF area,
extent and scope of mailing list discussion, and other factors, to extent and scope of mailing list discussion, and other factors, to
understand whether success of a draft can be predicted, and whether understand whether success of a draft can be predicted, and whether
interventions can be developed to increase the likelihood of success interventions can be developed to increase the likelihood of success
for work. for work.
The second question was around decision making. How does the IETF The second question was around decision making. How does the IETF
make design decisions? What are the bottlenecks in effective make design decisions? What are the bottlenecks in effective
decision making? When is a decision made? And what is the decision? decision making? When is a decision made? And what is the decision?
Difficulties here lie in capturing decisions and the results of Difficulties here lie in capturing decisions and the results of
consensus calls early in the process, and understanding the factors consensus calls early in the process, and understanding the factors
that lead to effective decision making. that lead to effective decision making.
Finally, there were questions around what can be learned about Finally, there were questions around what can be learned about
protocols by studying IETF publications, processes, and decision protocols by studying IETF publications, processes, and decision
making? For example, are there insights to be gained around how making? For example, are there insights to be gained around how
security concerns are discussed and considered in the development of security concerns are discussed and considered in the development of
standards? Is it possible to verify correctness of protocols and/or standards? Is it possible to verify correctness of protocols and/or
detect ambiguities? Extract implementations? detect ambiguities? What can be learnt by extracting insights from
implementations and activities on implementation efforts?
Answers to these questions come from analysis of IETF emails, RFCs Answers to these questions come from analysis of IETF emails, RFCs
and Internet-Drafts, meeting minutes, recordings, Github data, and and Internet-Drafts, meeting minutes, recordings, Github data, and
external data such as surveys, etc. external data such as surveys, etc.
2.5. Environmental Sustainability 2.5. Environmental Sustainability
The final discussion session considered environmental sustainability. The final discussion session considered environmental sustainability.
It discussed what is the IETF's role with respect to climate change It discussed what is the IETF's role with respect to climate change
both in terms of what is the environmental impact of the way the IETF both in terms of what is the environmental impact of the way the IETF
develops standards, and in terms of what is the environmental impact develops standards, and in terms of what is the environmental impact
of the standards the IETF develops? of the standards the IETF develops?
Discussion started by considering how sustainable are IETF meetings, Discussion started by considering how sustainable are IETF meetings,
focussing on how much CO2 emissions are IETF meetings responsible for focussing on how much CO2 emissions are IETF meetings responsible for
and how can we make the IETF more sustainable. Analysis looked at and how can we make the IETF more sustainable. Analysis looked at
the home locations of participants, meeting locations, and carbon the home locations of participants, meeting locations, and carbon
footprint of air travel and remote attendance, to estimate the carbon footprint of air travel and remote attendance, to estimate the carbon
costs of an IETF meeting. Initial results suggest that the costs of costs of an IETF meeting. Initial results suggest that the costs of
holding multiple in-person IETF meetings per year are likely holding multiple in-person IETF meetings per year are likely
unsustainable, although the analysis is ongoing. unsustainable in terms of carbon emission, although the analysis is
ongoing.
Discussion also considered to what extent are climate impacts Discussion also considered to what extent are climate impacts
considered in the development and standardization of Internet considered in the development and standardization of Internet
protocols? It reviewed the text of RFCs and active working group protocols? It reviewed the text of RFCs and active working group
drafts, looking for relevant keywords to highlight the extent to drafts, looking for relevant keywords to highlight the extent to
which climate change, energy efficiency, and related topics are which climate change, energy efficiency, and related topics are
considered in the design of Internet protocols, to show the limited considered in the design of Internet protocols, to show the limited
extent to which these topics have been considered. Ongoing work is extent to which these topics have been considered. Ongoing work is
considering meeting minutes and mail archives, to get a fuller considering meeting minutes and mail archives, to get a fuller
picture, but initial results show only limited consideration of these picture, but initial results show only limited consideration of these
skipping to change at page 10, line 37 skipping to change at page 10, line 37
University of Amsterdam), Colin Perkins (chair, IRTF, University of University of Amsterdam), Colin Perkins (chair, IRTF, University of
Glasgow), Corinne Cath (chair, Oxford Internet Institute), Mirja Glasgow), Corinne Cath (chair, Oxford Internet Institute), Mirja
Kuehlewind (IAB, Ericsson), Zhenbin Li (IAB, Huawei), and Wes Kuehlewind (IAB, Ericsson), Zhenbin Li (IAB, Huawei), and Wes
Hardaker (IAB, USC/ISI). Hardaker (IAB, USC/ISI).
7. Acknowledgments 7. Acknowledgments
The Program Committee wishes to extend its thanks to Cindy Morgan for The Program Committee wishes to extend its thanks to Cindy Morgan for
logistics support and to Kate Pundyk for notetaking. logistics support and to Kate Pundyk for notetaking.
This workshop was made possible through funding from the Dutch Efforts put in this workshop by Niels ten Oever was made possible
Research Council (NWO) through grant MVI.19.032 as part of the through funding from the Dutch Research Council (NWO) through grant
programme 'Maatschappelijk Verantwoord Innoveren (MVI)'. MVI.19.032 as part of the programme 'Maatschappelijk Verantwoord
Innoveren (MVI)'.
We would like to thank the Ford Foundation for their support that We would like to thank the Ford Foundation for their support that
made participation of Corinne Cath, Kate Pundyk, and Mallory Knodel made participation of Corinne Cath, Kate Pundyk, and Mallory Knodel
possible (grant number, 136179, 2020). possible (grant number, 136179, 2020).
This work is supported in part by the UK Engineering and Physical Efforts in the organization of this workshop by Niels ten Oever were
Sciences Research Council under grant EP/S036075/1. supported in part by the UK Engineering and Physical Sciences
Research Council under grant EP/S036075/1.
8. Annexes 8. Annexes
8.1. Annex 1 - Data Taxonomy 8.1. Annex 1 - Data Taxonomy
A Draft Data Taxonomy for SDO Data: A Draft Data Taxonomy for SDO Data:
Organization: Organization Subsidiary Time Email domain Website Organization:
domain Size Revenue, annual Number of employees Org - Affiliation Organization Subsidiary
Category (Labels) ; 1 : N Association Advertising Company Chipmaker Time
Content Distribution Network Content Providers Consulting Cloud Email domain
Provider Cybersecurity Financial Institution Hardware vendor Internet Website domain
Registry Infrastructure Company Networking Equipment Vendor Network Size
Service Provider Regional Standards Body Regulatory Body Research and Revenue, annual
Development Institution Software Provider Testing and Certification Number of employees
Telecommunications Provider Satellite Operator Org - Affiliation Category (Labels) ; 1 : N
Association
Advertising Company
Chipmaker
Content Distribution Network
Content Providers
Consulting
Cloud Provider
Cybersecurity
Financial Institution
Hardware vendor
Internet Registry
Infrastructure Company
Networking Equipment Vendor
Network Service Provider
Regional Standards Body
Regulatory Body
Research and Development Institution
Software Provider
Testing and Certification
Telecommunications Provider
Satellite Operator
Org - Stakeholder Group : 1 - 1 Academia Civil Society Private Sector Org - Stakeholder Group : 1 - 1
-- including industry consortia and associations; state-owned and Academia
government-funded businesses Government Technical Community (IETF, Civil Society
ICANN, ETSI, 3GPP, oneM2M, etc) Intergovernmental organization Private Sector -- including industry consortia and associations; state-owned and government-funded businesses
Government
Technical Community (IETF, ICANN, ETSI, 3GPP, oneM2M, etc)
Intergovernmental organization
SDO: Membership Types (SDO) Members (Organizations for some, SDO:
individuals for others...) Membership organization Regional SDO ARIB Membership Types (SDO)
ATIS CCSA ETSI TSDSI TTA TTC Consortia Members (Organizations for some, individuals for others…)
Membership organization
Regional SDO
ARIB
ATIS
CCSA
ETSI
TSDSI
TTA
TTC
Consortia
Country of Origin: Country Code Country of Origin:
Country Code
Number of Participants Number of Participants
Patents Organization Authors - 1 : N - Persons/Participants Time Patents
Region Patent Pool Standard Essential Patent If so, for which Organization
standard Authors - 1 : N - Persons/Participants
Time
Region
Patent Pool
Standard Essential Patent
If so, for which standard
Participant (An individual person) Name 1: N - Emails Time start / Participant (An individual person)
time end Name
1: N - Emails
Time start / time end
1 : N : Affiliation Organization Position Time start / end 1 : N : Affiliation
Organization
Position
Time start / end
1 : N : Affiliation - SDO Position SDO Time 1 : N : Affiliation - SDO
Position
SDO
Time
Email Domain (personal domain) Email Domain (personal domain)
(Contribution data is in other tables) (Contribution data is in other tables)
Document Status of Document Internet Draft Work Item Standard Author Document
- Name Affiliation - Organization Person/Participant (Affiliation Status of Document
from Authors only?) Internet Draft
Data Source - Provenance for any data imported from an external data Work Item
set
Meeting Time Place Agenda Registrations Name Email Affiliation Standard
Author -
Name
Affiliation - Organization
Person/Participant
(Affiliation from Authors only?)
Data Source - Provenance for any data imported from an external data set
Meeting
Time
Place
Agenda
Registrations
Name
Email
Affiliation
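The relationships in the draft taxonomy above (for example, one participant with many time-bounded affiliations) can be sketched as Python dataclasses. This is only an illustrative reading of part of the taxonomy: the class names, field names, and the `affiliations_on` helper are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Organization:
    name: str
    email_domain: Optional[str] = None
    website_domain: Optional[str] = None
    stakeholder_group: Optional[str] = None               # 1 - 1 in the taxonomy
    categories: List[str] = field(default_factory=list)   # 1 : N category labels


@dataclass
class Affiliation:
    organization: Organization
    position: Optional[str] = None
    start: Optional[date] = None   # None means the start is unknown
    end: Optional[date] = None     # None means the affiliation is still current

    def active_on(self, day: date) -> bool:
        """True if the affiliation covers the given day (bounds inclusive)."""
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end


@dataclass
class Participant:
    name: str
    emails: List[str] = field(default_factory=list)                # 1 : N
    affiliations: List[Affiliation] = field(default_factory=list)  # 1 : N

    def affiliations_on(self, day: date) -> List[str]:
        """Names of the organizations the participant belonged to on a day."""
        return [a.organization.name for a in self.affiliations if a.active_on(day)]
```

Keeping start and end dates on each affiliation, rather than a single affiliation string per person, is one way to represent the affiliation changes over time discussed in Section 2.2.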
Authors' Addresses Authors' Addresses
Niels ten Oever Niels ten Oever
University of Amsterdam University of Amsterdam
Email: mail@nielstenoever.net Email: mail@nielstenoever.net
Corinne Cath Corinne Cath
Email: corinnecath@gmail.com Email: corinnecath@gmail.com
 End of changes. 30 change blocks. 
60 lines changed or deleted 131 lines changed or added
