Diversity, Equity, Inclusion, and Accessibility in Data

curation
standards
dei
Author

Dan Woulfin

Broadly speaking, data is the product of surveillance or observation collected for a purpose or intervention. Data represents things in the “real world,” whether a person, place, object, organism, or idea, and can implicitly or explicitly contain various biases from the “real world.”

Bias in Data

Data is imbued with biases at every stage of a data product’s lifecycle (whether raw data, processed data, a subset of the data, data analysis, or published data). This bias may come from the original purpose of the data collection, the data collectors, or the data subjects. It may come from the population from which the data subjects were chosen, in which case it is more macro in nature. It may also come from enhancing the dataset with other data for non-benign purposes. For this reason, data curators, librarians, and others are concerned with Diversity, Equity, Inclusion, and Accessibility (DEIA) and with reducing the potential vulnerability of data subjects.

Data DEIA resources

For a short, non-comprehensive list of resources on how DEIA has impacted, and continues to impact, data curation, collection, and management, see:

Data Feminism1

This is a foundational critical data studies book that looks at data from a gender perspective, exposing various social biases that are, or can be, implicit in datasets and analytics. Critical data studies is the application of critical theory to data. Critical theory can broadly be described as an academic approach that focuses on reflective assessment and critique of society and culture to reveal and challenge power structures.

The CARE Principles for Indigenous Data Governance2

This set of principles complements the FAIR Principles, a pairing encapsulated by the hashtag #BeFAIRandCARE. It is dedicated to correcting historic power differentials and contexts by ensuring that Indigenous Peoples maintain sovereignty over their data. CARE consists of four principles: Collective Benefit; Authority to Control; Responsibility; and Ethics.3

The Data Equity Framework4

This framework by WE ALL COUNT5, a project for equity in data science, seeks to make data products, analysis, and research more equitable. WE ALL COUNT sees equity as a continual goal or process and defines data equity “around the principles of fairness, transparency, inclusion and justice regardless of who may be experiencing them.”

The framework breaks data work down into seven stages to find decision points where projects can be made more equitable. These stages are:

  1. Funding
  2. Motivation
  3. Project Design
  4. Data Collection & Sourcing
  5. Analysis
  6. Interpretation
  7. Communication and Distribution

See also