Show me the data: research reproducibility in qualitative research

Date
Category
NCRM news
Author(s)
Louise Corti, UK Data Archive

In quantitative methods, reproducibility is held as the gold standard for demonstrating research integrity. But threats to scientific integrity, such as fabrication of data and results, have led to some journals requiring data, syntax and prior registration of hypotheses to be made available as part of peer-review. While qualitative research reproducibility has been questioned in the past, it has been protected from the recent transparency agenda. What if journals mandated the sharing of data and analysis for qualitative research?

These issues were addressed at a session I ran at this year’s NCRM Research Methods Festival, where a panel of speakers debated whether there was indeed a ‘crisis’ and what ‘reproducibility’ approaches and standards might look like for qualitative research. The speakers took various positions, showing: how qualitative researchers might respond creatively to a reproducibility crisis, how various ‘crises’ surrounding transparency in qualitative research have emerged, and how data sharing might help mitigate them (Sarah Nettleton); practical strategies for teaching replication in the quantitative tradition in political science (Nicole Janz); and practical examples of what reproducibility might look like, based on existing archived data collections (Maureen Haaker).

Is there a crisis? We can observe the increasing drive for openness and sharing, value and transparency in our daily lives, be it fraudulent election activities, GDPR or open access. Government, funders, professional societies and journals are driving open research mandates, declaring data as a public good, and research integrity as a vital practice. Indeed, the concept of replication has gained prominence in the research narrative where sharing of data that underpin publications can help counter mistrust in published findings. In 2012, the US political science community introduced a practical approach to encouraging replicability. The Data Access and Research Transparency (DA-RT) statement, aimed at journal editors, requires authors to submit analysis code that must be fully replicable along with their article; indeed some journals rerun code to check it.

But what is the equivalent of this exercise in qualitative research? Nettleton reminds us that typically data are co-produced, co-constructed, embedded in – and by – contexts, and the conditions of production are inextricably interlinked with the process of analysis and interpretation. Further interrogation of data and coding are highly iterative processes, where decisions are likely to be hard to document. Software like NVivo might prove helpful in making codes and coding choices available to others. The US Annotation for Transparent Inquiry (ATI) initiative encourages scholars to annotate specific passages in an article by adding links and notes about data sources underlying a claim. While there is value in encouraging source data to be citable and revisited, we should not veer towards mandating the evidencing of claims.

The DA-RT initiative helpfully identifies data, analytic and production transparency as different entities. Given that much fieldwork is impossible to fully replicate, the idea of production transparency (the methods used to collect data) is likely to be more appealing to the qualitative researcher. Were journals to seek to extend their reproducibility agenda to qualitative research, they could usefully start here. Putting aside the ethical issues that can arise in sharing data, we can think about what kinds of documentation and materials might help us.

It is also useful to consider the spectrum of immersivity in qualitative research – e.g., from passive observation to participatory research or ethnography – which will likely require different layers of description. Examples from the UK Data Service data catalogue of archived research datasets whose supporting materials shine light on the data and the research process are:

  • Qualitative Election Study of Great Britain, 2015 (SN 6861)

  • Anti-politics: Characterising and Accounting for Political Disaffection, 2011-2012 (SN 7855)

  • Conservation, markets and justice - Part 2: Ethnographic participatory video data (SN 852476)

Data papers, such as those published in Open Health Data [1], provide a further valuable outlet for describing the rationale and methods that created a published dataset.

Nettleton recounts her experiences of archiving data from a previous study [2]. She expresses her surprise that these data have been used for teaching medical students as well as for research. While she had agonised over the level of anonymity at the time of depositing the data, on reflection she believes that it was a helpful experience for her. Yet archiving data cannot and should not be done in response to the transparency crisis; this could undermine trust, reinforce naïve empiricism and weaken the intellectual foundations of qualitative research. Future journal policies should appreciate that presenting context needs to be rigorous, yet not prescriptive, and be sufficiently nuanced to allow for the flexibility and messiness of qualitative research.

With the spectre of essay mills and cheating looming, providing early guidance for students on the importance of academic rigour and integrity is vital. At RMF, we launched Dissertations and their Data: Promoting Research Integrity, a resource pack aimed at staff responsible for undergraduate dissertation support classes. The teaching builds on the programme of capacity building work done by the UK Data Service, seeking to apply core principles of excellence in project and data management, and data description to the classroom. Janz has been running ‘master classes’ in replication where findings might support or challenge the original paper. She reminds her students to be professional and diplomatic when communicating failed replications. For us, the RMF session was a safe space to debate issues of transparency. The UK Data Service will be running another session on Thinking Ahead: How to be Reproducible in Research in the New Year, so watch our training space for more details.

References

  1. https://openhealthdata.metajnl.com/
  2. https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=6124