The ethical ambiguity of AI data enrichment: Measuring gaps in research ethics norms and practices

Hawkins, Will; Mittelstadt, Brent

doi:10.1145/3593013.3593995

arXiv:2306.01800 (cs)

[Submitted on 1 Jun 2023]

Abstract: The technical progression of artificial intelligence (AI) research has been built on breakthroughs in fields such as computer science, statistics, and mathematics. However, in the past decade AI researchers have increasingly looked to the social sciences, turning to human interactions to solve the challenges of model development. Paying crowdsourcing workers to generate or curate data, known as data enrichment, has become indispensable for many areas of AI research, from natural language processing to reinforcement learning from human feedback (RLHF). Other fields that routinely interact with crowdsourcing workers, such as Psychology, have developed common governance requirements and norms to ensure research is undertaken ethically. This study explores how, and to what extent, comparable research ethics requirements and norms have developed for AI research and data enrichment. We focus on the approach taken by two leading conferences, ICLR and NeurIPS, and the journal publisher Springer. In a longitudinal study of accepted papers, and via a comparison with Psychology and CHI papers, this work finds that leading AI venues have begun to establish protocols for human data collection, but that these are inconsistently followed by authors. Whilst Psychology papers engaging with crowdsourcing workers frequently disclose ethics reviews, payment data, demographic data and other information, similar disclosures are far less common in leading AI venues despite similar guidance. The work concludes with hypotheses to explain these gaps in research ethics practices and considerations for their implications.

Comments:	10 pages
Subjects:	Computers and Society (cs.CY); Artificial Intelligence (cs.AI)
MSC classes:	N/A
ACM classes:	K.4.1
Cite as:	arXiv:2306.01800 [cs.CY]
	(or arXiv:2306.01800v1 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2306.01800
Journal reference:	2023 ACM Conference on Fairness, Accountability, and Transparency
Related DOI:	https://doi.org/10.1145/3593013.3593995

Submission history

From: Will Hawkins Mr
[v1] Thu, 1 Jun 2023 16:12:55 UTC (89 KB)
