
Participatory Development of Quality Guidelines for Social Media Research

A Structured, Hands-on Design Workshop at the International AAAI Conference on Web and Social Media (ICWSM), taking place virtually on 7th June, 2021 at 2 PM BST.

What’s this workshop about?

In this full-day workshop, we take a collaborative approach to further developing quality guidelines for social media research. Drawing on principles of participatory design, we hope to involve a range of stakeholders through collaborative brainstorming and the design of research documentation. The end goal of the workshop is shared documentation that establishes best practices for research with web and social media data.

How to participate?

This workshop is open to all interested in research with web and social media data. Conference registration is open until May 28 (see details here).

Thanks to the generosity of the ICWSM organizers, we can offer registration waivers for two participants, preferably from communities under-represented at the conference. If you’re interested in applying for a fee waiver, send us a short (maximum 200 words) summary of why you’d like to attend, with your name and email, to until May 25. We are still exploring further fee waiver options, so watch for updates on our website! ICWSM is also offering discounted rates for participants from South America and Africa.


Research with Web and Social Media Data, and Participatory Design

Web and social media data is of increasing interest to scientists because it can be used to study the attitudes, behaviours and characteristics of people and society. The large scale of available data, combined with increasingly sophisticated and powerful computational tools for analysing it, has opened up several research avenues. Despite the potential of this burgeoning research paradigm, there are also several pitfalls due to data sampling, platform affordances, conceptual confusion in the definition of the constructs to be studied, and more. While not all of these limitations can be mitigated, they can be documented to convey the limits of a particular study, make it more transparent, and spread awareness in the research community. To that end, several guidelines and error reporting frameworks have been developed.


Current guidelines usually include a set of quality criteria, often linked to steps in the research pipeline, that researchers should note and document, since failing to meet these criteria can lead to systematic and/or random errors. Some target specific parts of the research pipeline: Data sheets (Gebru et al. 2018) and Data statements (Bender et al. 2018) for documenting datasets, and Model cards (Mitchell et al. 2019) for (Machine Learning) model development and deployment. Other guidelines are inspired by error or quality frameworks from survey methodology, such as the Total Twitter Error Framework (TTE, Hsieh and Murphy, 2017), the Total Error Framework for Big Data (TEF, Amaya et al., 2020), and the Total Error Framework for Digital Traces of Online Behavior (TED-On, Sen et al., 2021). While these guidelines are crucial for increasing the transparency of studies working with web and social media data, current approaches to guideline development are top-down and prescriptive. Furthermore, several of them were designed for Machine Learning or Natural Language Processing practitioners, who are an important subset of the web and social media research community, and so they still miss important input and insights from social scientists.

Participatory Design

Participatory Design is an approach to design that attempts to actively involve all stakeholders in the design process, to help ensure the result meets their needs and is usable. Participatory design of guidelines and checklists is widely used in domains such as the medical sciences and HCI, and previous research has shown that checklist use increased when stakeholders were included in checklist design and implementation. In this workshop, we first hope to provide a detailed overview of existing guidelines and best practices for research with web and social media data by demonstrating how the guidelines can be applied to CSS case studies. Second, we invite participants to apply the guidelines to specific vignettes or their own research studies in collaborative group activities, develop specification sheets for their study using existing guidelines, and develop an understanding of which perspectives are missing from existing guidelines.

We envision this workshop as the first in a series. With it, we want to start a long-term conversation that includes different voices in shaping future guidelines and best practice recommendations for Computational Social Science research.

Planned Timeline

Session | Type | Time (Atlanta Time: EDT, UTC-4, CEST-6)
Opening and Introductions | Plenary | 09:00-09:15
Introduction to Existing Guidelines for Research with Web and Social Media Data | Plenary | 09:15-09:45
Brainstorming Research Designs | Breakout Rooms + summary in plenary room | 09:45-11:15
Short Break | - | 11:15-11:30
Interactive Discussion of Case Studies that use Web and Social Media Data through the Lens of Various Guidelines | Plenary | 11:30-13:00
Break | - | 13:00-14:00
Applying Guidelines for Documenting Limitations of Research Designs | Breakout Rooms + summary in plenary room | 14:00-16:20
Short Break | - | 16:20-16:30
Final Group Discussion, Lessons Learned and Closing | Plenary | 16:00-16:30


Indira Sen is a doctoral candidate in Computational Social Science at GESIS, Leibniz Institute for the Social Sciences in Cologne, Germany. She is interested in understanding biases in inferential studies from digital traces, with a focus on natural language processing.

Dr. Fabian Flöck is a post-doctoral researcher at the Computational Social Science department at GESIS and team leader of the ‘Data Science’ team. He is interested in open and transparent data science, natural language processing, human computation, and collaborative production processes.

Dr. Katrin Weller is an information scientist working at the Computational Social Science department at GESIS and team leader of ‘Social Analytics and Services’. Her research focuses on social media, new types of research data and data preservation, scholarly communication and altmetrics, web users, and communication structures.

If you have any questions about the workshop, feel free to reach out to us at or {firstname.lastname}