Cite this as: Roushannafas, T., Baker, P., Campbell, G., Jenkins, E., Parker Wooding, J., Pelling, R., Vander Linden, M., Worley, F. and Cooper, A. 2024 Digitally Enlightened or Still in the Dark? Establishing a Sector-Wide Approach to Enhancing Data Synthesis and Research Potential in British Environmental Archaeology and Beyond, Internet Archaeology 67. https://doi.org/10.11141/ia.67.7
The 'Rewilding' Later Prehistory project is a UK Research and Innovation (UKRI)-funded research project based at Oxford Archaeology in collaboration with the universities of Exeter and Oxford, Historic England, the Archaeology Data Service (ADS), Knepp Castle Estate Rewilding, and the Centre for Anthropobiology and Genomics of Toulouse (France). As a research project, 'Rewilding' occupies an atypical niche, being housed by a commercial body, and, within this context, is seeking to develop new modes for cross-sector research beyond traditional academic settings. One of the key aims of the project is to collate archaeobotanical and zooarchaeological data dating to the period 2500 BC to AD 43, across diverse case study areas in the Upper Thames Valley, West Sussex, the East Anglian Fens, Wales, Northumberland and County Durham.
It was within the context of a developer-funded base and concern with the collation of palaeoenvironmental data that a project workshop entitled Biofuelled Research: Capturing the Interpretative Power of Plant and Animal Remains in British Archaeology was held on 11 November 2022 at the University of Oxford, in collaboration with the School of Archaeology and Historic England. Specialists in environmental archaeology from across the heritage sector, and particularly those operating within the project study areas, were invited, with the majority working in university or developer-funded settings. Prior to this workshop, we conducted a survey of environmental archaeologists with the following aims:
Sixty specialists responded to the survey, and a summary of the results was presented at the November 2022 workshop and at the Association for Environmental Archaeology (AEA) annual conference in Glasgow in December 2022. Subsequent discussion of the workshop and survey results in relation to plans to develop Continuing Professional Development (CPD) data skills courses at Bournemouth University and with the UK-based Chartered Institute for Archaeologists (CIfA) led to a follow-up survey targeting archaeologists more broadly – particularly those in developer-funded archaeology – which was advertised via British Archaeological Jobs and Resources (BAJR). Key aims of the second survey were:
This generated a further 46 responses, many of which provided considerable additional detail and reflection on the questions posed. The gathering of over 100 responses represents a crucial first stage in engaging with the sector directly to establish steps towards improving data management and research opportunities at all levels. The results are outlined in the following sections, and contextualised within a wider range of initiatives promoting data accessibility and openness. They are also considered alongside ongoing challenges and new opportunities for enhancing research potential within environmental archaeology and British archaeology more broadly.
The first survey of environmental archaeologists was distributed directly to those participating in the project workshop Biofuelled Research: Capturing the Interpretative Power of Plant and Animal Remains in British Archaeology, and more widely via JiscMail list servers for archaeobotany, zooarchaeology and environmental archaeology, as well as the Historic England-led Archaeobotanical Working Group (AWG) and Professional Zooarchaeology Group (PZG). For a full list of the survey questions and (anonymised) results see Appendix A [pdf] and Appendix B [csv].
Of the 60 individuals who responded to the survey, 50% worked in developer-funded contexts, 25% were based in universities and the remainder were divided between independent research organisations, public bodies and the self-employed (the latter group also largely engaged in commercially funded projects). Responses were split almost evenly between plant and animal remains specialists. Over 70% (n=43) of those surveyed stated that they had been working in their specialism for more than ten years. Poorer representation of early-career specialists may, in part, have resulted from the channels by which the survey was distributed, but could also reflect a smaller pool of trained individuals within this demographic. Of those in their specialism for fewer than ten years, 47% (n=8) were based at universities, suggesting that there is potential for greater integration of early-career specialists in developer-funded contexts into research networks and conversations.
Part one of the survey addressed the routine practices used by plant and animal remains specialists in recording and archiving their data. Survey results indicated that Microsoft Excel was the most commonly utilised recording software, employed by 80% of specialist respondents, either alone or in combination with other software/formats. Responses showed that while text documents and paper recording were still common, these were always used in combination with spreadsheet or database software. As such, the digitisation of data appeared to be universal, although the ease with which these formats could be integrated with wider data infrastructures clearly varied. Indeed, both anecdotally and as observed in the context of 'Rewilding' project data collection, 'primary' spreadsheet (or database) data was not always submitted to contracting units/project managers alongside the text-based report, and therefore could not be incorporated into project archives. Paper recording (e.g. scanned scoresheets) was sometimes included in digital and/or physical archives, but often comprised preliminary assessments and was challenging to collate and digitise.
The survey also contained questions regarding familiarity with OASIS V and archival procedures. OASIS V is the current version of the online reporting system supported by the ADS and is the primary means by which data pertaining to archaeological investigations is fed through to regional Historic Environment Records (HERs) and national heritage organisations in England and Scotland. In the context of palaeoenvironmental work, OASIS V represents an important signposting tool in allowing findings to be highlighted via the use of 'keywords' (e.g. 'plant remains', 'vertebrate remains') that are attributed to periods. However, these terms are limited and, as observed during project data collection, not necessarily consistently entered by those responsible for completing the form, nor are the OASIS records themselves searchable outside the organisations that created them.
Limited familiarity with OASIS, and archival procedure more generally, was highlighted in the responses, which included statements such as 'unclear of office procedure' and 'not sure of our practices', and comments highlighting that training in, and knowledge of, these systems were often limited to archivists – a small (and arguably often undervalued) set of specialists mainly based within larger fieldwork organisations. While submission of material for archiving is outside the usual remit of environmental specialists, it does raise the question of whether lack of familiarity with archival procedure could negatively impact the degree to which specialist data is appropriately formatted for archiving and the degree of agency specialists have in ensuring their data is catalogued – particularly for the self-employed. When asked about dissemination and storage of specialist datasets, beyond uploading reports through OASIS (either by themselves or by their organisations), many responses cited 'in-house' data repositories specific to their organisation or expressed uncertainty. Most of those citing in-house data storage were from commercial backgrounds, and it is likely that many responses were referring to storage of data on internal servers. However, it is possible that some responses were referring to open-access repositories such as those hosted by universities. The term 'organisation-specific repository' would therefore benefit from more precise definition in future surveys, as would the distinction between specialist reporting and raw data. A range of other digital platforms was also cited as being used for disseminating specialist research, including the ADS, Academia.edu, ResearchGate and project-specific websites (Figure 1), with 'other' responses including Zenodo, Open Science Framework and sharing by 'personal request'. Notably, one participant responded 'I have no control over this'.
Overall, the responses suggest that, while many specialists are looking to engage with digital platforms to disseminate their findings, in many cases raw data is likely to be accessible only via access to in-house servers/repositories or by contacting the specialist directly (who may then need to obtain permission to share said data). A commonly reported lack of familiarity and/or involvement with archival procedure suggests that, at least in some cases, specialists are not aware of what happens to their data/reporting once the analysis is completed, which is in itself a barrier to the reuse and research potential of those findings. While many specialist reports are made available as grey literature in the ADS library via OASIS V, they are usually included as part of larger site reports, meaning they are not themselves indexed or easily located. Even those specialist reports published in excavation monographs and county journals are not necessarily well signposted and may not be accessible without institutional/individual subscriptions or university library access. Regional journals can be particularly problematic in this sense, and may be subject to lengthy embargoes.
The second part of the survey addressed research aspirations and challenges amongst environmental archaeology specialists across the sector.
When asked whether their current role allowed them to integrate their specialist findings into synthetic research (i.e. addressing archaeological questions at broader regional/thematic/chronological scales), responses indicated that, particularly in developer-funded contexts, opportunities to do so were limited, and that such research was usually undertaken in the specialists' own time (Figure 2). This finding is crucial, because limited opportunity for this type of contextualisation negatively impacts the potential for specialists to assess the wider relevance of their work, which will doubtless in turn affect the dissemination and communication of those findings. This then limits the potential for developing future research questions and appropriate fieldwork methods. At a personal level, the inability to translate specialist findings into their broader archaeological significance is likely to impact an individual's motivation in pursuing research, as well as potentially discouraging early-career archaeologists from entering or remaining in the field.
That there is more time for synthetic research for those employed by universities, public bodies or other research organisations is perhaps not surprising. Planning legislation within Britain – currently covered by the English National Planning Policy Framework (NPPF), Planning Policy Wales (PPW) and Scotland's National Planning Framework 4 – requires the assessment and recording of heritage assets (including archaeology) impacted by any proposed development. Developers are often only required to fund reporting of archaeological remains specifically within their development site, rather than archaeological research more broadly. While 'value-added' work can be agreed in some instances, this usually relies on a sympathetic client and otherwise obliges the unit to either fund research initiatives itself or to seek external funding. However, this is not to say that current planning policy necessarily precludes specification of appropriate local area-scale synthesis within project designs; for an example, see The 21st-Century Challenges for Archaeology Programme (Work Package 4.2).
Despite these challenges, the survey results also demonstrated that most specialists do try to make time to engage with broader research questions and to disseminate their findings in a range of formats, including academic and public talks, websites and social media, as well as publications. Furthermore, all respondents indicated that they would be interested in engaging in more synthetic research if this could be integrated within their roles. Key barriers to broader interpretation of specialist findings were cited as:
As one specialist noted:
"As someone working in commercial archaeology, the main challenge is access to data and literature resources through e.g. a university library. To some extent this can be mitigated through existing resources that I have, or through online repositories, e.g. Academia.edu, ResearchGate etc. However, sometimes useful volumes/papers are only available by purchasing them or I have to complete the analysis without them."
Another respondent reported a significant barrier as:
"Time to find more comparative sites – the reviews and datasets available are great, but many were compiled several years ago and it is hard to find the time to trawl libraries to access more up to date reports."
Many of those surveyed indicated they made use of existing research resources, which signpost previous palaeoenvironmental analyses, to contextualise and interpret their findings. These included the Historic England-led regional reviews and associated datasets (e.g. Huntley and Stallibrass 1995; Albarella 2008, 2019; Hall and Huntley 2009; Hambleton 2009, 2010; Serjeantson 2011, 2012; Wilkinson 2011; Holmes 2017, 2018; Carruthers and Hunter Dowse 2019), Historic England's searchable database of research reports (previously known as the Ancient Monument laboratory reports and English Heritage Centre for Archaeology reports, and including the regional reviews already cited) and the Archaeobotanical Computer Database (ABCD; Tomlinson and Hall 1996). While these are clearly useful resources, the latter is now very out of date, and the regional reviews cover specific regions and periods, and do not, yet, have a clear mechanism by which they can be updated routinely.
In summary, the responses indicated that, while specialist palaeoenvironmental data is now almost exclusively 'born-digital' (Richards et al. 2021), the flow and management of this data is negatively impacted by limited familiarity with (and access to) archival procedure and suitable data repositories, as well as the need for better data management plans. These problems feed into difficulties in conducting synthetic research, attributable to both a non-conducive working framework (which encourages a rapid turn-over of site-specific reportage) and the limited accessibility of specialist data (whether this is due to paywalls or simply to a lack of adequate signposting) that would allow environmental specialists to locate the best comparative sites, for example.
The follow-up survey was designed to elucidate further the training needs in data management and analysis highlighted by the first survey, and was disseminated via BAJR on social media. This platform was used with the purpose of reaching a wide range of practitioners who engage directly with topics of working conditions and standards within the developer-funded sector.
The results of this survey (Appendix C [csv]), described in the following paragraphs, are tied into the findings of the initial survey where questions were sufficiently similar to do so, although it is acknowledged that there were slight differences in wording to some questions (see Appendix A [pdf] for a full list of survey questions). In the second survey, 30/46 respondents (65%) stated they worked for developer-funded units, with the remainder divided between universities, public bodies and independent research organisations. Respondents covered a remarkably diverse range of specialisms, including excavation, geomatics, survey, buildings recording, finds, geoarchaeology, human remains, community outreach and archaeological consultancy. Less-experienced practitioners were slightly better represented than in the previous survey, with approximately one-third (n=15) stating they had been in the sector for less than five years, and just under a third stating they had been in the sector between five and ten years.
In both surveys, respondents were asked when they had last received training in data management (worded as 'database skills' in the first survey) and data analytical skills. The results suggested that training opportunities were particularly limited for those working in a freelance capacity or developer-funded contexts, with fewer respondents having received training in the last five years compared to other sectors (Figure 3). However, across all sectors (including developer-funded, university-based and research/public bodies) the majority of respondents in both surveys (63%, n=67) indicated that they had not received training in data management or data analytical skills within the last five years. In relation to this point, a participant in the first 'Rewilding' project workshop observed that, within developer-funded archaeology, the 'training up' of junior staff is often prioritised over training for mid-career and senior staff. It is possible that junior members of staff are also more likely to actively seek out training. The survey responses seemed to reflect this divide, with the percentage of those reporting no training within the last five years increasing to 72% (44/61 respondents) among those with more than ten years' experience. However, this difference between experience groups is not clear cut, with survey responses also reflecting on the often more precarious job security amongst junior staff, which may also mean that companies are less willing to invest in their training. The impact of the COVID pandemic on training within the five-year timespan also needs to be considered as a potentially disruptive factor.
Across both surveys, 41% of respondents (n=43) indicated that where they had received this type of training it was entirely through 'on-the-job' learning, including via professional mentoring, while 54% (n=57) cited university courses, either solely or in combination with bespoke training or 'on-the-job' learning. However, opportunities to attend formal/certified training courses while in employment appeared to be limited, with only 28% (n=30) of the total 106 responses across both surveys reporting training in data management/analysis being provided by their employers as CPD. When those in the second survey were asked if provision for CPD training was made by their employer, 50% (n=23) answered yes, with the remainder being almost evenly split between 'no' (n=12) and 'don't know' (n=11).
Over both surveys, 92% (n=98) indicated an interest in receiving further training in data management and/or data analysis. When asked more specifically about areas of interest in the second survey, participants included data visualisation and exploration, statistical analysis and open and 'FAIR' (findable, accessible, interoperable and reusable; see Wilkinson et al. 2016) data practices among their priorities. When questioned regarding the types of digital tools they would be interested in learning to use, responses in both surveys indicated a strong interest in improving knowledge of statistical, geographical and database management systems and languages, including R, Excel, Python, geographic information systems (GIS), Access and SQL, suggesting that training needs span both emerging and established methods for managing and analysing data (Figure 4).
When specialists in the first survey were asked why they felt they would like further training in these areas, common themes included concerns that skills were becoming outdated, the need to review current working practices from an informed position, and the desire to interrogate, compare and communicate larger, more complex datasets effectively. Beyond the desire for specific training, responses among environmental archaeologists communicated a need for opportunities to confer with other specialists about procedures for recording and analysis of assemblages, in order to identify approaches that maximise the potential for data reuse and reproducibility. These needs are currently met, at least in part, by existing working groups, such as the Historic England-led AWG and PZG and active JiscMail servers for archaeobotany, zooarchaeology and environmental archaeology. However, considering these responses, we may need to reflect on the potential for flexible digitally orientated platforms in the future, as well as the continuing importance of regular meetings and cross-sector gatherings that focus on the wider interpretative outputs of palaeoenvironmental research alongside the immediacies of recording practices.
When asked about specific training options in the second survey, 67% (n=31) stated that they would prefer the option of online training, with no clear preference for course length (i.e. short courses, course series or longer academically accredited courses). When asked if they would be willing to self-fund training, 74% (n=34) answered yes, but, of these, the majority stated that it would need to be within a very limited budget, with some stating specifically that they would be willing to add their own funds to existing CPD budgets to do so.
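As a cross-check on the figures reported above, the following minimal sketch recomputes each percentage from its count. Denominators are taken from the text where they are stated (46 responses to the second survey, 106 responses across both surveys, and 61 respondents with more than ten years' experience); where the text gives only a count for 'both surveys', a denominator of 106 is assumed. The row labels and function names are illustrative only and do not come from the survey materials.

```python
# Counts and stated percentages as reported in the text.
# Assumption: rows marked "both surveys" use the combined total of 106
# responses, even where the text gives only the count (n).
REPORTED = [
    # (label, count, total, percentage stated in the text)
    ("second survey: working for developer-funded units", 30, 46, 65),
    ("both surveys: no training in the last five years", 67, 106, 63),
    ("over ten years' experience: no training in five years", 44, 61, 72),
    ("both surveys: training entirely 'on-the-job'", 43, 106, 41),
    ("both surveys: training via university courses", 57, 106, 54),
    ("both surveys: training provided by employer as CPD", 30, 106, 28),
    ("second survey: employer makes provision for CPD", 23, 46, 50),
    ("both surveys: interested in further training", 98, 106, 92),
    ("second survey: prefer the option of online training", 31, 46, 67),
    ("second survey: willing to self-fund training", 34, 46, 74),
]


def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


def find_mismatches(rows):
    """List rows whose recomputed percentage differs from the stated one."""
    return [
        (label, percent(count, total), stated)
        for label, count, total, stated in rows
        if percent(count, total) != stated
    ]


mismatches = find_mismatches(REPORTED)

for label, count, total, stated in REPORTED:
    print(f"{label}: {count}/{total} = {percent(count, total)}% (stated {stated}%)")
print("mismatches:", mismatches)
```

Under these assumptions the recomputed values agree with the rounded percentages given in the text. A different choice of denominator for the 'both surveys' rows would change the result, so the check should be read as a consistency test rather than a restatement of the survey method.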
Additional commentary indicated that respondents were navigating a range of conditions, including childcare, poor job security and working environments that did not encourage training and/or ensure that skills/knowledge were passed on and retained. One respondent noted that the biggest barrier they encountered in accessing training was:
"[compelling]…management to pay for me to learn rather than hiring someone else ... It would need to be a reasonably low amount of time and money. I don't know how I'd fit in much more workload, but would love the opportunity to learn."
Support from management, time and funding were commonly cited as barriers to accessing relevant training; however, limited availability, and knowledge, of appropriate courses were also highlighted. One respondent noted that construction-related certification (e.g. the Construction Skills Certification Scheme (CSCS), Quarry Passport, etc.) takes priority, and that we may need to consider the impact of increased regulation in this industry on the time available for archaeology-specific training. Calls were also made for a more thorough grounding in statistics and other data skills – including more training in digital fundamentals, such as the effective use of spreadsheets – at a university level before archaeologists enter the workforce.
Responses to both surveys demonstrated a significant appetite for improving data literacy, management and openness across specialisms. While time, inadequate signposting, poor data standardisation, paywalls, funding and facilitation of training by employers were, perhaps unsurprisingly, key barriers to research and training opportunities, responses also indicated that many are keen to update their skills and are positive in their attitude towards improving the management and availability of archaeological data so that it can be used to its full research potential. None of the responses appeared to reflect proprietary attitudes to archaeological data that have often persisted in the past, and this may reflect a broader shift from ideas of data ownership to data stewardship (Marwick et al. 2017). Indeed, one survey respondent noted:
"All archaeological data is significant, if we can ensure it is as open as possible, FAIR and with the requirements for Data Management Plans, we will save time, effort and our own histories."
Many survey respondents are doing their best to navigate existing research landscapes, balancing a complicated array of work and home responsibilities, finding time to conduct research where they can, and making use of online training, resources and repositories. It would seem the challenge lies not primarily, therefore, in changing the attitudes of data creators, but in providing better opportunities for practitioners to develop their research skills and to engage with wider research and data management initiatives. This requires a shift towards a working framework that gives practitioners more time and space to access relevant training and research resources. Such a shift would not only constitute a wise investment in terms of improving workflows, but should also be motivated by the fact that archaeology needs to successfully engage and communicate its findings with broader audiences to maintain relevance, and thereby financial and public support.
The survey responses reflect an increasing cross-sector awareness of the challenges and opportunities for archaeological data. Recent discussions of these issues include effective integration of environmental specialist data into broader archaeological research priorities (Campbell et al. 2018; Pearson 2019), access to, and integration between, archaeological data repositories (Wright and Richards 2018; Richards et al. 2021; Tsang 2021; Geser et al. 2022), archaeological data literacy (Kansa et al. 2020; Kansa and Kansa 2021) and FAIR data principles in archaeology (Marwick et al. 2017; Lodwick 2019a; 2019b; Karoune and Plomp 2022). While these existing academic studies are extremely important, they are also somewhat removed from the wider community of specialists who create the vast majority of UK archaeological data under discussion. The findings described here contribute to these debates by contextualising more abstract arguments about the challenges and opportunities for archaeological data within working experiences and the needs of the data creators, i.e. palaeoenvironmental specialists and archaeological practitioners more widely in Britain. We believe it is crucial that discussions concerning the future of archaeological data and interpretation take place across the sector and are grounded within the practical realities of the day-to-day creation and use of that data.
Positive steps towards improving data accessibility may be seen in the increasing number of archaeological research projects that are actively engaging with principles of good data management and reuse potential – see, for example, the Prehistoric Grave Goods Project (Cooper et al. 2021; 2023), The Rural Settlement of Roman Britain (Allen et al. 2018) and Feeding Anglo-Saxon England (FeedSax) (McKerracher et al. 2023), all of which have made queryable digital data (including results from developer-funded investigations) freely available online. We have also seen increasing online accessibility of data from high-profile developer-funded projects, such as works connected with the Channel Tunnel Rail Link/High Speed 1 (Foreman 2018), the Heathrow Terminal 5 excavations (Framework Archaeology 2011) and the A14 Cambridge to Huntingdon Improvement Scheme (Smith et al. 2021). Once completed, archaeological investigations relating to High Speed Two (HS2) are likely to produce the largest digital archive of developer-funded archaeology yet (High Speed Two Ltd. 2023). Current emphasis on FAIR principles specifically within environmental archaeology is apparent from a range of recent initiatives, including the AEA conferences on Open Science Practices in Environmental Archaeology (2021) and Data Science in Environmental Archaeology (2023), and Open Research Training Workshops organised by the International Committee on Open Phytolith Science (2023). The development of openly available R packages tailored to environmental archaeology is also promising, including the recently developed CropPro for analysing evidence of crop processing and WeedEco for investigating weed ecology (Stroud et al. 2023a; 2023b). Online training in R aimed at archaeologists is also accessible via the Data Carpentry initiative.
Many open science initiatives have taken advantage of the move towards the use of hybrid and online platforms for conferences, seminars and workshops in the wake of the COVID pandemic, thereby improving accessibility. This development was commented on specifically by a respondent to the first survey:
"One development…of particular use in recent years has been the increase in conferences where attendance is possible online. Unlike academic-based researchers, commercial archaeology rarely has the funds for in-person attendance, meaning commercial archaeologists can be cut off from new developments in research. I have attended more conferences virtually in the past couple of years by going online than in the past maybe 5–10 years before that. However, as the worst of the COVID pandemic is hopefully behind us…more conferences are now returning to in-person attendance only."
Efforts to integrate regional research tools are underway in the form of the ongoing joint initiative by Historic England, Historic Environment Scotland and The Scottish Archaeological Research Framework (ScARF) to develop an integrated Research Frameworks Network, which already allows practitioners to signpost relevant frameworks when reporting investigations via OASIS V. Similar efforts for integrating regional resources can be seen in initiatives such as the Society of Antiquaries of London's 2022 workshop on Encouraging Syntheses, which focused on improving the potential of existing HER data and means of standardising practice between regions. At an international level, current plans for developments to the ARIADNEplus data infrastructure are promising in terms of integrating a wide range of data-rich archaeological archives across Europe.
Despite these advances, an infrastructure for the routine archiving of digital data generated by developer-funded archaeology has remained under-developed, both in terms of knowledge and resources (Tsang 2021). When asked about training and CPD in the 2020 biennial CIfA survey, respondents highlighted that 'finding the time to complete CPD' was the most difficult challenge faced, followed by other issues including 'finding relevant CPD' and 'cost'. To address this feedback, and in response to recommendations from various reports and projects focused on research synthesis (e.g. Cattermole 2017; Mendoza 2017; Wills 2018), CIfA and Historic England have been involved in developing and hosting freely available resources and training materials in the form of online 'toolkits'. Designed in partnership with specialist consultants, groups and organisations, the toolkits include guidance and resources for archive selection, specialist finds reporting, recording archaeological materials and managing digital data, with more in development. The toolkits promote a consistent approach and industry good practice for practitioners at all stages of their career, the aim being to provide some of the tools required to facilitate better research synthesis opportunities in the future. The online toolkit for managing digital data, Dig Digital, was created for the Archaeological Archives Forum by DigVentures in partnership with CIfA. It includes a dedicated infosheet [pdf] aimed at those involved in the 'collection, management and curation' of specialist finds data, encouraging early communication and consideration of data types, format and requirements between all stakeholders from the project outset. The Dig Digital toolkit also includes data management plan templates, and guidance to help support this process and to ensure FAIR principles are being adhered to. 
Historic England also has an accessible in-house toolkit for archaeological digital archiving, ADAPt, which can be used by external organisations.
Overall, across the sector, positive steps are being made to improve data integration and to embrace open and FAIR principles more widely, and these trends are reflected in the attitudes of the archaeological practitioners surveyed. However, it is also evident from the survey responses that there is significant scope for supporting non-archival specialists further in the management of digital data, as well as improving knowledge of the pathways for archaeological data storage at a very general level. At the heart of these challenges is the burgeoning need for available training material to be matched with improved resource signposting, as well as a shift in working practice that gives specialists the time to undertake and embed this type of guidance into their routine practices. While it is clear that challenges remain for archaeological practitioners who want to maximise the research potential of their archaeological findings, these gradual shifts in the disciplinary landscape constitute fertile ground for making practical changes across the sector – and not just for the 'blue-sky thinking' of academic discourse.
It is clear that there is scope for improving the archiving and availability of raw specialist data. The issue of where best to deposit data is complicated and needs further discussion within the archaeological community. Long-term suitability of data archives can be evaluated by checking for CoreTrustSeal certification. There are two CoreTrustSeal-certified repositories that cater specifically for archaeology: the ADS and The Digital Archaeological Record (tDAR). These organisations are important in that they provide an advisory service for archaeologists as well as operating as trusted data repositories – they can help archaeologists navigate best practice in data deposition, and offer the prospect that the data will be preserved in perpetuity. Both the ADS and tDAR charge fees, which can limit their accessibility for individuals and smaller organisations; however, the former does offer an Open Access Archaeology Fund designed to support the publishing and archiving costs of researchers with no means of institutional support. A range of free online data repositories are also available, including the UK Data Service, Mendeley Data and Zenodo. UK research bodies support the deposition of research data with certain free data repositories (e.g. Zenodo). However, questions remain regarding how the maintenance of these repositories is funded and thus how secure their service is. The Arts and Humanities Research Council (AHRC)'s recent funding call for a project to develop a suite of digital research services for heritage science, including a repository for research data and software, is likely to provide another option for specialist data deposition in the future.
Taking these issues into consideration, it is readily apparent that specialist data archiving needs to be integrated better within existing (in-house) working frameworks. Within organisations, greater consideration needs to be given to the fate of specialist datasets, and workflows should be designed to integrate this data within the wider archive (in appropriate formats). Such actions would be in line with commitments to ensure the availability of archaeological outputs, as specified in the CIfA Code of Conduct. In addition, it would be advantageous for concise guidance on preparing and submitting digital data for archiving to be disseminated directly to specialists via the specialist networks themselves (see further in the following sections), as well as raising awareness of existing resources such as the CIfA toolkits already described. Such guidance would enhance the future reusability and availability of such data, as well as attending to calls for more guidance regarding data recording, accessibility and metadata standards (e.g. Campbell et al. 2011; Mays 2017; Baker and Worley 2019; Bayliss and Marshall 2022; Mays et al. 2023). With developer-funded units increasingly depositing project data with the ADS, particularly in relation to large-scale infrastructure projects, it is important to ensure that raw specialist data is routinely included so that it can be integrated effectively into future research.
For the purposes of synthetic research, data needs not only to be openly available but also to be integrated into, and queryable within, wider digital site records (Buckland et al. 2022). As regards palaeoenvironmental data within Britain specifically, the 'Rewilding' project is now working with Historic England and the ADS to create an OASIS+ module for logging summary information from archaeobotanical and zooarchaeological assemblages, as part of the process by which fieldwork results are reported at county and national levels. By centralising select assemblage-level data and signposting relevant datasets, the module aims to improve the potential for specialists and other researchers to discover, draw on and build on previous analyses from relevant sites in their interpretations. The module is being developed through consultation with a working group representing zooarchaeologists and archaeobotanists from developer-funded archaeology, universities and public bodies. An initial design for the module has recently been presented to a broader range of specialists, archivists and wider heritage professionals for discussion and testing. Further testing will be undertaken in order to provide ample opportunity for specialist feedback, with the aim of accommodating the needs of a wide variety of data producers and consumers. While plant and vertebrate animal remains are the initial primary focus, there is potential for extending the model to encompass other categories of ecofacts, including mollusc and insect remains. Community-defined OASIS+ modules already exist for logging select data and information about geophysical surveys and burial spaces, and are currently being developed for human remains. These modules have the potential to signpost not only specific forms of evidence, but also the location of the associated datasets.
It was noted in the first 'Rewilding' project workshop that efforts to standardise recording of archaeobotanical and zooarchaeological data have been ongoing for many years. It was also argued that fundamentally environmental archaeology is not a mechanical but an interpretative process, which a diverse research community will inevitably approach in differing ways, in the context of the varying specifics of assemblages and associated research questions (see also Albarella 2017), as well as funding. This is not to claim that we cannot bring in structures to improve data standardisation; however, there is a stronger sense that what specialists can most easily improve is transparency in methodology and the process of data creation. Structures such as the OASIS+ modules operate on the basis that it is much easier to standardise metadata than it is to standardise all data and working practices, and aim to provide people with the information they need to locate and access data for comparative, synthetic and strategic purposes.
Within the context of all these developments, the fundamental challenge lies in informing and supporting archaeological practitioners who operate under a variety of often time-poor circumstances. Communicating through existing networks with which archaeologists from across the sector regularly engage, whether these be sub-disciplinary (e.g. specialist working groups such as the PZG), chartered association (CIfA), industry-specific (BAJR) or public bodies (Historic England), is clearly essential. However, as seen in this review, it is apparent that practitioners regularly encounter barriers to accessing the resources they need. We therefore believe that the following resources would benefit from improved signposting.
More broadly in terms of communication, we would advocate a community-minded approach that features regular consultation with archaeologists across the sector. This includes, as in the surveys described here, asking practitioners specifically what challenges they face and the types of solutions they believe could be practically implemented within the context of their day-to-day work.
Availability of appropriate training in data management and analysis should be regularly reviewed, with advice made available to practitioners regarding access to CPD opportunities and funding. The CIfA Code of Conduct, which applies to both CIfA-accredited archaeologists and registered organisations within the UK, states that:
"The member shall recognise the aspirations of employees, colleagues and helpers with regard to all matters relating to employment, including career development, health and safety, terms and conditions of employment and equality of opportunity."
As such, all CIfA-registered organisations have a responsibility to provide essential training for their staff and to support their CPD. Having the opportunity, means and time at all career stages to undertake CPD is essential. This not only supports specialists in staying up-to-date with innovations in techniques and methodologies, but can also help provide the updated skill sets necessary to maximise the potential that developments like 'big data' present for the wider discipline. The increased generation of environmental data from developer-funded archaeology provides the chance to embrace data connectivity, explore greater research synthesis opportunities, and contribute more actively to current and future environmental debates and agendas. But at the foundation of this potential is the need for greater consistency in data collection, analysis, archiving and accessibility, which can only be achieved through a combination of communication and knowledge exchange facilitated by a continuous cycle of learning.
The use of fundamental concepts such as 'tidy data' (where each row contains information related to a single observation, and each column to a single variable), and of open, non-proprietary formats (i.e. not relying upon software under commercial licensing), such as .csv tables, remains comparatively limited in archaeology, and especially development-led archaeology. Effective improvement in data availability and integration, eventually leading to proper implementation of FAIR principles, therefore requires the acquisition of a variety of skills in data management and literacy, especially through training in open-source resources such as R (already extensively used in archaeology, e.g. Carlson 2017) or Python (very popular in data science, but less so in archaeology, although see Maier et al. 2023), and the variety of tools these provide.
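As a minimal sketch of what 'tidy data' in an open format can look like in practice, the following Python example reshapes a small 'wide' table (one column per taxon) into one row per observation and serialises it as .csv text. It uses only the Python standard library; the sample identifiers, contexts, taxa, counts, column names and function names are all invented for illustration and are not drawn from the surveys or from any real assemblage.

```python
import csv
import io

# Hypothetical "wide" specialist table: one row per sample, one column per taxon.
# All sample IDs, contexts, taxa and counts are invented for illustration.
wide_rows = [
    {"sample_id": "S1", "context": "pit 101", "Triticum spelta": 12, "Hordeum vulgare": 5},
    {"sample_id": "S2", "context": "ditch 202", "Triticum spelta": 0, "Hordeum vulgare": 9},
]

# Columns that identify the sample rather than record a taxon count (an assumption).
ID_COLUMNS = ("sample_id", "context")


def to_tidy(rows, id_columns=ID_COLUMNS):
    """Return one record per (sample, taxon) observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column in id_columns:
                continue
            record = {key: row[key] for key in id_columns}
            record["taxon"] = column   # one variable per column
            record["count"] = value
            tidy.append(record)
    return tidy


def to_csv_text(records):
    """Serialise tidy records as CSV text (an open, non-proprietary format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


tidy_records = to_tidy(wide_rows)
csv_text = to_csv_text(tidy_records)
print(csv_text)
```

In this layout each row is a single observation (the count of one taxon in one sample), so adding a further taxon adds rows rather than columns, which tends to make datasets from different specialists easier to combine.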
Training and exposure to R or Python will not only provide an entry point into consistent, increasingly standardised, data organisation, a necessary step for publication in a data repository, but also into the application of an array of existing quantitative analytical techniques. These include simple statistical tests and visualisation solutions, such as exploratory data analysis, as well as more complex multivariate or domain-specific techniques, for which numerous software solutions already exist, especially in the form of dedicated R packages (e.g. Bchron, oxcAAR and rcarbon for radiocarbon dates; vegan for community ecology techniques relevant to archaeobotanical and zooarchaeological datasets; and the archaeobotanical packages WeedEco and CropPro; see ctv-archaeology and open-archaeo for these and more resources).
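To give a sense of the simpler end of this range, the sketch below computes two basic exploratory summaries from tidy records: the total count per taxon and its ubiquity (the proportion of samples in which the taxon is present). It is a standard-library Python illustration only, not a substitute for the dedicated packages named above; the records and the function name `summarise` are invented for the example.

```python
from collections import defaultdict

# Invented tidy records (one row per sample-taxon observation), for illustration only.
records = [
    {"sample_id": "S1", "taxon": "Triticum spelta", "count": 12},
    {"sample_id": "S1", "taxon": "Hordeum vulgare", "count": 5},
    {"sample_id": "S2", "taxon": "Triticum spelta", "count": 0},
    {"sample_id": "S2", "taxon": "Hordeum vulgare", "count": 9},
    {"sample_id": "S3", "taxon": "Hordeum vulgare", "count": 4},
]


def summarise(rows):
    """Return per-taxon total count and ubiquity (share of samples where present)."""
    samples = {row["sample_id"] for row in rows}
    totals = defaultdict(int)
    present_in = defaultdict(set)
    for row in rows:
        totals[row["taxon"]] += row["count"]
        if row["count"] > 0:
            present_in[row["taxon"]].add(row["sample_id"])
    return {
        taxon: {
            "total": totals[taxon],
            "ubiquity": len(present_in[taxon]) / len(samples),
        }
        for taxon in totals
    }


summary = summarise(records)
for taxon, stats in sorted(summary.items()):
    print(f"{taxon}: total={stats['total']}, ubiquity={stats['ubiquity']:.2f}")
```

Because the input is already in tidy form, the same function would work unchanged on records pooled from several sites or specialists, provided the column names agree.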
Corresponding training ought to cover a gradual introduction to the fundamentals of data creation, import and manipulation, simple statistics and visualisation, and eventually more complex analytical tasks. While such training is often provided in the form of a single offering, for instance as part of a wider master's course, flexibility is paramount for the wider commercial sector. A bespoke – online and/or hybrid – CPD course could alternatively provide a series of independent yet incremental modules, built to fit around individual work programmes, and, when possible, making use of real datasets drawn from ongoing archaeological work for practical exercises.
A clear outcome of the surveys was that, despite having the requisite experience and interest, there is often limited time and opportunity for palaeoenvironmental specialists working in developer-funded archaeology to engage in synthetic research. While not yet commonplace, in recent years there have been examples of successfully secured funding streams supporting research in industry contexts. These include the 'Rewilding' Later Prehistory project itself (funded by the UKRI and based at Oxford Archaeology) and The Rural Settlement of Roman Britain (Allen et al. 2018), which was developed from pilot projects undertaken by Cotswold Archaeology and funded by both Historic England and the Leverhulme Trust, although, significantly, the latter funding was allocated specifically to the academic collaborators on the project – the University of Reading and the ADS at the University of York. Also notable were studentships funded by Highways England relating to the archaeology of the A14, awarded in 2020 and delivered in conjunction with the University of Reading, Headland Archaeology and Museum of London Archaeology (MOLA). An example of in-house funding can also be seen in Oxford Archaeology's 50th Anniversary Research and Public Engagement Fund, which has recently allocated £30,000 to applicants from unit staff to carry out a research and/or public engagement project, independent of their work for the organisation. Within environmental archaeology specifically, small research grants are offered by the AEA, including allocation for 'time buy-out for those working in the commercial sector and wishing to carry out research beyond that funded by developers'. Despite these positive examples, however, opportunities for commercial units to access external funding clearly remain limited.
Without Independent Research Organisation (IRO) status (which can be challenging both to secure and maintain), developer-funded units are excluded from the majority of research council funding (the UKRI Future Leaders Fellowships behind the 'Rewilding' Later Prehistory project and the MOLA Measuring, Maximising and Transforming Public Benefit from UK Government Infrastructure Investment in Archaeology project being welcome exceptions). This funding ineligibility exists despite the fact that these units house considerable expertise for undertaking such research, and the potential for such funding to address any identified 'skill gaps' attributable to the existing model.
In line with its aims and role as an industry-based research project, 'Rewilding' Later Prehistory worked with partners at Historic England, CIfA and Bournemouth University to ask archaeological practitioners directly about what was needed to improve digital literacy and the research potential of the decidedly 'big data' generated by developer-funded archaeology in Britain. These surveys were stimulated by significant challenges identified by the 'Rewilding' project in locating relevant palaeoenvironmental reports and data. Specialist reports and/or raw data are frequently stored in inaccessible locations, whether these be organisational servers/repositories, personal computers, or archived floppy disks, CDs and microfiche. While many specialist reports (if not raw data) are made available online as grey literature via OASIS reporting, these reports are not themselves indexed, nor are the specialist reports in published excavation monographs and county journals. It is not, therefore, possible to search up-to-date records for particular categories of environmental evidence by, for example, region, period or specialist.
Responses to the surveys highlighted a number of key issues, including limited signposting and standardisation of data, restriction of research materials behind paywalls, and insufficient (paid) time and support to undertake both training and research. These research challenges are especially pronounced in developer-funded settings, which have limited access to additional external funding. However, responses also indicated a considerable appetite across all sectors for improving data management and analysis skills, with a view to utilising data to its full potential, openly and on a cooperative basis. These attitudes would appear to be in line with a broadly positive movement towards open science practices and improved data integration in archaeology more widely. Reflecting on the survey responses in the context of these developments, we make specific recommendations regarding guidance, signposting, communication and training, and have sought to highlight some useful resources herein. While we acknowledge that many challenges remain, we emphasise the significant potential of a community-minded approach, which maintains open dialogue with a range of practitioners and is mindful of the constraints they operate under.
Appendix A: Survey Questions [pdf]
Appendix B: Survey 1 results (anonymised) [csv]
Appendix C: Survey 2 results (anonymised) [csv]
With thanks to Lisa Lodwick, who inspired this paper and underlined the importance of asking the specialist community what was needed to bring about better research opportunities and data practices.
The 'Rewilding' Later Prehistory project is funded by the UKRI (MR/W00755X/1). Work on this article, and on the 'Rewilding' project more broadly, has been a profoundly collaborative effort, involving Historic England, the Archaeology Data Service (ADS), the University of Oxford, Bournemouth University, British Archaeological Jobs and Resources (BAJR), the Chartered Institute for Archaeologists (CIfA), and numerous other specialist contributors. Whilst the terms of the survey specified that outputs would remain anonymous, we, of course, thank the survey and workshop participants for their time and valued contributions, without which this paper would not have been possible.
Internet Archaeology is an open access journal based in the Department of Archaeology, University of York. Except where otherwise noted, content from this work may be used under the terms of the Creative Commons Attribution 3.0 (CC BY) Unported licence, which permits unrestricted use, distribution, and reproduction in any medium, provided that attribution to the author(s), the title of the work, the Internet Archaeology journal and the relevant URL/DOI are given.
Internet Archaeology content is preserved for the long term with the Archaeology Data Service.