Research challenges for using the UK Web Archive for social science research

Category: NCRM news
Author(s): Jessica Ogden, National Centre for Research Methods

In July 2018, I undertook an NCRM-funded placement fellowship with the UK Web Archive (UKWA), based at the British Library. The UKWA collects and preserves UK web content with the aim of providing access to these resources to researchers and the public, in perpetuity. Currently amassing millions of UK websites (and billions of individual ‘assets’) each year [1], the UKWA is poised to become a major resource for researchers interested in studying social, economic, political and cultural change in the UK over time.

The placement at the UKWA was framed through the lens of my own ongoing PhD research in Web Science, which more generally examines web archival practices in various institutional and community contexts. I am broadly interested in the critical ways that these efforts to ‘archive the web’ are changing the nature of the Web’s architecture, as well as how web archives are increasingly becoming central to the study of information circulation online. The placement fellowship, entitled ‘Research challenges for using the archived web for social research’, fundamentally sought to engage with the web archive as a source for social science research. Aiming to examine what social science researchers require from the UKWA to enable effective scholarly research, I used the UKWA to explore a set of research questions, reflexively studied the research process itself and examined the everyday practices of the UKWA that ultimately affect both. To help contextualise my research, I had informal discussions with British Library and UKWA staff in a variety of roles, including curators, engagement officers and technical support. I also undertook a brief review of relevant web archiving literature to situate the study within the wider history of web archiving, which allowed a focus on the opportunities and challenges presented by web archives for researchers.

In an effort to inform future research in this space, I developed a general conceptual framework through which to describe the challenges that a researcher wishing to use the UKWA must contend with during the early stages of this form of scholarship. Through three conceptual devices, I reflected on the various processes associated with orientating, auditing and constructing a corpus for research. These overlapping concepts encompass the practices associated with:

• accounting for various idiosyncrasies of web archive collections
• situating the archival / data sources within one’s own research paradigm and praxis
• confronting the opportunities and constraints of institutions as sociotechnical infrastructures

Orientating to the UKWA includes engaging with web archives as new ontological devices for historical research, unpicking the complex legal constraints of access, and embracing new ways of knowing data and infrastructure. Most researchers attempting to use the UKWA will likely never have encountered a web archive or used web archival data before. Because I have spent the last four years studying web archiving, my engagement with the UKWA is unlikely to be representative of the ways in which a ‘typical researcher’ might orientate to the web archive as a new source of scholarship. Nevertheless, despite my expertise, the challenges I experienced whilst attempting to use the UKWA point towards a fundamental need to situate and return to the rationales, processes and communication devices that facilitate web archives. Even getting to grips with what a web archive is can be a challenging task in this context, especially when faced with the myriad forms of digital data archives which exist on and off the Web.

Over the course of the three-month placement, the access constraints surrounding the UKWA’s mandate to archive the UK Web presented numerous challenges for researcher engagement. As a significant portion of the UKWA is only available on-site in the Reading Rooms, researchers must unpick the situated legal conditions of collection and access that both enable and constrain collection and research activities. This can be especially challenging in the face of different legislation pertaining to copyright, data protection and digital publication rights online.

As a result of the placement, I provided recommendations for potential future directions to further facilitate researcher engagement with the UKWA. They included recommendations to enable greater transparency in the presentation of collection activities, tools for collating and citing archived resources, and mechanisms for encouraging interdisciplinary research collaborations on UKWA collections. 

This project required a constant process of orientating to the technology of the British Library’s infrastructure and to the everyday human and technological interventions required to facilitate access to the UKWA. By being able to directly observe the web archiving practices of the British Library, I was able to research, consult and make recommendations about the direct connections between collection activities and researcher use. This work has directly informed the direction of my PhD research, and the opportunity to work with the British Library has been invaluable.

References
[1] www.webarchive.org.uk/en/ukwa/info/faq