Documenting archaeological work processes for enabling future reuse of data: the CAPTURE project

by Isto Huvila (Department of ALM, Uppsala University, Sweden)

Reusing archaeological data requires a comprehensive understanding of what the data is about and, as is increasingly acknowledged, also of how the data came about (e.g. Voss, 2012; Faniel and Yakel, 2017). The European Research Council funded research project CAPTURE investigates the latter, less studied issue: what information about the creation and use of research data is needed, and how to capture enough of that information to make the data reusable in the future. In the project, data about data creation and manipulation processes is conceptualized as paradata (Börjesson et al., 2020), a concept that is best known in the archaeological community in the context of heritage visualisation and of the London and Seville Charters (2013) on its documentation.

The aim of CAPTURE is to develop an in-depth understanding of how paradata is being created and used today, to elicit and test methods for capturing paradata, and to synthesize the findings to inform the capturing of paradata and to enable data-intensive research using heterogeneous research data stemming from diverse origins.

Documenting enough, not documenting too much

The major problem in capturing and preserving paradata is that different data users have different needs in different contexts and situations. However, without good-enough documentation of the human processes of creating, understanding and interpreting data, there is a risk of what has been described as a digital dark age (Bollacker, 2010) and of the proliferation of difficult to find and access “dark data” (Geser and Niccolucci, 2016). Even if data does not become entirely unusable, collections of digital and non-digital archaeological data might remain too difficult to use in practice and incapable of supporting future research and the creation of new archaeological knowledge. In some cases, even worse, they might lead researchers to conduct research on erroneous premises and to draw false conclusions from data that was created under incompatible premises. By developing the understanding of paradata and process documentation, CAPTURE supports the implementation of European and global open data policies (Beck and Neylon, 2012; DCC, 2017; Kansa and Kansa, 2011), and effective sharing and reuse of data in disciplinary and cross-disciplinary knowledge ecosystems (cf. Bruseker et al., 2017), by developing critical knowledge of the social context and use of infrastructures stressed in recent research agendas (e.g. ARIADNE, Aloia et al., 2017, and JPI, Lambourne et al., 2014) and empirical research (e.g. Mayernik et al., 2017) alike.

The impossibility of documenting everything, the diversity of needs and the difficulty of anticipating them make the documentation of data-related processes intricate. A major problem is how to determine how to document just enough. As with all data on data (Mayernik and Acker, 2018), this type of documentation is bound to be incomplete. As a consequence, it is important to focus on finding a reasonable balance between what can be captured automatically, what should be documented manually (e.g. Stamatogiannakis et al., 2015), what is already embedded in the data itself (Huggett, 2012; Gant and Reilly, 2017), and what can be left to future users to figure out by themselves using post-hoc, forensic methods (Kirschenbaum et al., 2010) of ‘excavating’ existing data (Figure 1). So far, a fair amount of research has investigated each of these approaches, but it has tended to happen in relative isolation.

Figure 1: Sources of paradata in data creation and (re)use processes (data flow in black, paradata flow in red colour).

Survey on the views on archaeologists’ data creation and reuse

The CAPTURE project uses several different methods to investigate the intellectual processes underpinning the creation and use of research data in archaeology and beyond, and to propose and develop means to capture them. The methodological palette consists of document and documentation studies, review and testing of earlier proposed and newly developed approaches for documenting paradata, and interviews with key stakeholders. At the moment, CAPTURE is also running a survey in collaboration with the recently completed COST Action ARKWORK that is open to everyone who has created or used archaeological research data in their work. The survey can be found in

Keeping updated on CAPTURE

Updates on CAPTURE project activities, including workshops, online CAPTURE talks with invited speakers, and outputs, can be found on the CAPTURE website and on Twitter at @CAPTURE_ERC.


The CApturing Paradata for documenTing data creation and Use for the REsearch of the future (CAPTURE) project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, grant agreement No 818210.


