Cite this as: Tenzer, M., Pistilli, G., Brandsen, A. and Shenfield, A. 2024 Debating AI in Archaeology: applications, implications, and ethical considerations, Internet Archaeology 67. https://doi.org/10.11141/ia.67.8
Although it might seem recent, given the current hype around Large Language Models (LLMs) and generative Artificial Intelligence (AI) models for content generation (such as ChatGPT), AI is not a new development, and deployment of the technology in the fields of archaeology and heritage studies with both object and remote sensing applications has been widely documented (Bickler 2021). However, given recent advances of AI tools in text-based analysis specifically, text-based analysis will be the primary focus of this article.
The term Artificial Intelligence was coined in 1956 (Russell and Norvig 2016) and described a hypothetical computer technology developed by Alan Turing (Turing 1950). Following the first AI hype in the 1950s and 60s (over-promising the capabilities of AI technology but under-performing due to the lack of computational power), AI research was interrupted by the AI winter of the 1970s and early 1980s. However, after 60 years of exponential growth, AI tools have now entered the mainstream, for example in chess computers, recommendation systems, and spam filters. Other applications are now leveraging the recent developments in LLMs, for example the Google search function, instant translations, and closed captioning.
Increasing computational capabilities have enabled the development of Machine Learning (ML) and Neural Networks (NN). In particular, Deep Learning (DL), with its ability to learn features of interest in parallel (e.g. the attention mechanism in LLMs), has pushed AI capabilities forward. These systems are particularly good at detecting correlations and patterns, and can categorise, predict, or extract data in the context of natural language processing. LLMs, such as Google's Bard, OpenAI's ChatGPT, or Meta's LLaMA, now form the basis of a new generation of Open Source LLMs, such as Open Assistant (Köpf et al. 2023). These tools can learn and draw from extensive datasets that are based on the wide knowledge of the Internet, including data from, for example, Wikipedia, GitHub, and Google data search.
Following an early adoption of AI technologies in archaeology for objects and remote sensing applications (Bickler 2021; Argyrou and Agapiou 2022), Natural Language Processing (NLP), ML and DL are now being used for processing vast amounts of data accumulated over decades of research. This knowledge deposited in archives and grey literature can be efficiently analysed, structured, and disseminated using AI technologies, an approach that offers new insights and knowledge extraction from archaeological archives on a scale that has never been seen before.
However, while AI technologies based on LLMs are capable of processing big data in archaeology and other fields, their application also has ethical implications. The lack of transparency about the content and quality of the training data has been shown to reinforce social inequalities, misinformation, privacy issues, racial discrimination, risk to natural resources, and human workforce exploitation. Some of these concerns are shared across the discipline of archaeology and cultural heritage management (CHM), specifically regarding sensitivities around privacy, bias, and model creation in the context of policy and decision-making.
In this article, we focus on archaeology as part of that wider debate and present examples of successful AI applications in archaeology with text-based analysis as a primary focus. We then provide insight into the ethical implications associated with AI before discussing the implications of its use and its application in a safe, sustainable, and socially just way in the future. We want to initiate the discussion as to whether AI is a blessing or a curse for the discipline.
Archaeologists have a long tradition of adopting, adapting, and introducing technologies from other disciplines. For example, the pantograph preceded digital photography or survey methods (Novaković 2018), Lidar has proved useful for detecting sites particularly across difficult terrain (Cohen et al. 2020), and AI image recognition techniques have been introduced in archaeology for remote sensing (Verschoof-van der Vaart et al. 2020) and object recognition (Anichini et al. 2021).
However, adopting AI technology for text analysis is more challenging. Language is complex, with ambiguities and hidden meaning beyond the pure text structure. NLP has benefited immensely from the integration of LLMs. ML and DL have been applied to, for example, archaeological prediction and detection (Resler et al. 2021), and convolutional neural networks (CNN) have been used to translate old Sumerian and Akkadian cuneiform tablets (Gutherz et al. 2023). Generative AI is helping to recreate the landscapes of the past for more immersive research (Cobb 2023). Big data has been successfully linked in the project 'Unpath'd Waters' (Eagles 2022).
A current cultural heritage project applied NLP, in particular Topic Modelling (TM) and ML, to explore the values attributed by people to familiar cultural landscapes (Tenzer 2022; Tenzer and Schofield 2023). Social media data, online surveys, and interviews provided sufficiently large datasets to infer heritage values from a 'bottom-up' or people-centred perspective. TM allows the identification of patterns as themes latent in or emerging from the data, which guarantees an assumption-free approach to empirical data.
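The idea of themes emerging from the data can be illustrated with a minimal sketch of one common topic-modelling algorithm, Latent Dirichlet Allocation (LDA), fitted by collapsed Gibbs sampling. This is an illustration under stated assumptions rather than the method of the cited project: the function name `fit_lda`, the toy posts, and the values of `n_topics`, `alpha`, and `beta` are all hypothetical choices made for the example.

```python
import random
from collections import Counter

def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per topic, per document
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # tokens per topic overall
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional weight of each topic given all other tokens.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * n_words)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions (each row sums to 1).
    theta = [
        [(c + alpha) / (len(doc) + alpha * n_topics) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    top_words = [[w for w, c in tw.most_common(3) if c > 0] for tw in topic_word]
    return theta, top_words

# Hypothetical, already tokenised posts about a familiar landscape.
DOCS = [
    "moor walk heather view walk".split(),
    "heather moor view sunrise".split(),
    "mine chimney ruin industry".split(),
    "ruin mine industry chimney mine".split(),
]

THETA, TOP_WORDS = fit_lda(DOCS)
for k, words in enumerate(TOP_WORDS):
    print("topic", k, words)
```

Even in this small sketch the analyst chooses the number of topics and the smoothing parameters, so the resulting themes are best read alongside those choices.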
AI can also help to deal with the data deluge being experienced by archaeologists (Bevan 2015). The AGNES project facilitates large-scale synthesising research in the Netherlands by integrating ML into a search engine that aims to index all the texts about archaeology in the region, some 200,000 documents. Specifically, it uses Named Entity Recognition to automatically detect all time periods, artefacts, and place names, which can then be used in search queries. This allows for more exhaustive and precise searches. In a case study on early medieval cremations, researchers found 30% more cremations in the literature than were previously known (Brandsen and Lippok 2021).
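How detected entities can drive a search may be sketched as follows. This is not the AGNES implementation: a fixed word list (gazetteer) stands in for the trained Named Entity Recognition model, and the names `GAZETTEER`, `tag_entities`, `build_index`, and `search`, as well as the sample reports, are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical gazetteer: a real system would detect entities with a trained
# NER model rather than look them up in a fixed list.
GAZETTEER = {
    "PERIOD": ["early medieval", "bronze age", "roman"],
    "ARTEFACT": ["urn", "fibula", "axe"],
    "PLACE": ["utrecht", "nijmegen"],
}

def tag_entities(text):
    """Return the set of (label, term) pairs found in the text."""
    lowered = text.lower()
    found = set()
    for label, terms in GAZETTEER.items():
        for term in terms:
            # Word boundaries stop "urn" from matching inside "return".
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                found.add((label, term))
    return found

def build_index(documents):
    """Map each (label, term) entity to the ids of documents that mention it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for entity in tag_entities(text):
            index[entity].add(doc_id)
    return index

def search(index, *entities):
    """Return ids of documents containing every requested entity."""
    if not entities:
        return set()
    hits = [index.get(entity, set()) for entity in entities]
    return set.intersection(*hits)

DOCUMENTS = {
    "report-1": "An early medieval cremation urn was recovered near Utrecht.",
    "report-2": "Roman fibula fragments from Nijmegen; we return to them below.",
    "report-3": "Bronze Age axe hoard, with a later early medieval urn burial.",
}

INDEX = build_index(DOCUMENTS)
RESULT = search(INDEX, ("PERIOD", "early medieval"), ("ARTEFACT", "urn"))
print(sorted(RESULT))  # ['report-1', 'report-3']
```

Querying on typed entities rather than raw strings is what makes the search more precise: the word "return" in the second report does not count as an urn.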
As well as AI-assisted search and TM, recent advances in the application of LLMs in NLP have shown promise in the identification of personally identifiable information (PII) and potential copyright infringements in digital publishing of archival data from modern historical periods. Legislative requirements (including those imposed by the EU's General Data Protection Regulation and extensions of copyright terms) mean that publishers of historical and heritage archives currently need to spend significant amounts of time and manual effort ensuring compliance in these fields. Supporting publishing and editorial teams in this process has significant benefits in terms of both the amount of material that can be digitised and published and in catching cases of infringing content that might have otherwise been missed.
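The kind of screening step this supports can be sketched in a few lines. The example below deliberately swaps in simple regular expressions for the LLM-based detection described above, so it only illustrates the flag-and-redact workflow; the patterns, the names `PII_PATTERNS`, `find_pii`, and `redact`, and the sample sentence are hypothetical and far from exhaustive.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage
# (names, addresses, identifiers) and human review of what is flagged.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d ()-]{7,}\d"),
}

def find_pii(text):
    """Return (label, matched_text, start, end) tuples, ordered by position."""
    matches = []
    for label, pattern in PII_PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append((label, m.group(), m.start(), m.end()))
    return sorted(matches, key=lambda item: item[2])

def redact(text):
    """Replace each detected span with a bracketed label."""
    redacted = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for label, _, start, end in reversed(find_pii(text)):
        redacted = redacted[:start] + "[" + label + "]" + redacted[end:]
    return redacted

SAMPLE = "Contact J. Smith at j.smith@example.org or +44 1904 000000."
print(redact(SAMPLE))  # Contact J. Smith at [EMAIL] or [PHONE].
```

Note that the personal name in the sample passes through untouched, which is one reason pattern matching alone is insufficient and why language models are being explored for this task.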
However, as useful as the technology seems to be, it comes with a human and environmental cost. In the next section, we will present the challenges and risks of AI deployment from an ethical and environmental view as a counterbalance to the advantages and opportunities.
The latest AI advancements have given rise to several ethical considerations that warrant thorough examination. In particular, concerns have been raised regarding the transparency of the content and quality of the training data used in AI applications (Bender et al. 2021). These factors have been shown to perpetuate social inequalities (Casilli 2019), propagate misinformation (Wilner 2018), and compromise privacy (Véliz 2021). Furthermore, the use of AI technologies has been linked to instances of racial discrimination (Raji et al. 2020), the endangerment of natural resources, and the exploitation of human labour (Crawford 2021).
Within the discipline, concerns surrounding privacy, bias, and model creation are critical for formulating policies and decision-making. For instance, the use of AI algorithms in analysing archaeological data could inadvertently lead to biased interpretations of historical events or the reinforcement of existing power structures if the models used are not designed with ethical considerations in mind. Specific potential harms include fostering a linguistic monoculture, unintentionally strengthening existing power structures, and becoming a monocultural value carrier (Johnson et al. 2022; Pistilli 2022). Since archaeology is also about understanding human history through material remains, language becomes a key component of cultural heritage and identity. If archaeological narratives are dominated by a single language or cultural perspective, this can lead to a skewed understanding of the past, privileging certain histories over others.
There is also a need for explainability and transparency in the approach to data collection in qualitative research. As shown in the heritage case study (Tenzer 2022; Tenzer and Schofield 2023), AI can help analyse vast amounts of social media data or survey responses. However, generating models based on such data can introduce or reinforce biases, for example by excluding already marginalised groups. Basing policies on models trained on such data would introduce these societal inequalities into systems of governance. The public also needs to have the opportunity to opt out with regard to data privacy, particularly in cases where vast datasets are scraped or mined from the internet for AI training purposes.
While AI has the potential to analyse vast amounts of data and is particularly good at pattern detection (e.g. Casini et al. 2023), the technology has the potential to replace human volunteers in citizen science projects (Ponti and Seredko 2022). This can lead to a decrease in inclusive and engaging projects within archaeology. Excluding the public from the process of data collection and knowledge creation, and instead reducing their participation to the final product of archaeological investigations, can lead to their alienation from archaeology.
Finally, 'garbage in, garbage out' and 'black box' effects carry the risk of creating new content from already flawed data and in an opaque process (Huggett 2021). Kansteiner (2022) and Clavert and Gensburger (2023) warn about the risk of using ChatGPT to reshape historical narratives:
'If we think that the stories and images we consume influence our memories, identities, and future behaviour, we should be very wary about letting AI craft our future entertainment on the basis of our morally and politically deeply flawed cultural heritage' (Kansteiner 2022, 124)
In the same vein, generative AI technology will take realities of cultural heritage into a new dimension with challenges for authenticity and speculative interpretation in a new era of knowledge production and presentation (Spennemann 2023). A similar effect can be expected in the analysis of large archaeological datasets, shaping a narrative of the past based on weights (parameters in neural networks) in hidden layers (Cobb 2023).
Four key messages around ethical considerations result from these observations:
Recent developments and the rapid adoption of AI technology into archaeology and heritage practice, as presented here, show the need for a debate around ethical implications and sustainable applications of AI. To enable the discourse, we have presented the advantages and capabilities of the applications, which allow more time and resource efficient workflows (Tenzer 2022; Tenzer and Schofield 2023), and enable the analysis and reuse of 'big data' accumulated over decades of archaeological investigations currently dormant in archives and grey literature (Brandsen and Lippok 2021). We also provide different views on the implications of AI applications from archaeology, heritage studies, data science and philosophy, showing inherent challenges regarding limitation, bias and social impact (Bender et al. 2021; Casilli 2019; Crawford 2021; Véliz 2021).
Interdisciplinary/cross-disciplinary research and collaboration will be necessary in the near future to apply this technology to a wide variety of disciplines. Collaboration between data science, sociology, philosophy and archaeology is becoming increasingly important. Understanding how AI technology can influence epistemology and hermeneutics has to focus the discussion on the agency and cognitive artefacts of the technology in view of the output (Huggett 2021, 421). University courses bridging the complex knowledge of the various disciplines will be increasingly necessary. The projects presented here and the collaboration of the authors of this article exemplify how cooperation can work to foster mutually beneficial collaboration.
The discipline also needs to understand how AI deployment will affect future employment for archaeologists and the changing work environment. What are the prospects for future archaeologists in professional and academic careers? Do we need to become computer scientists ourselves and teach this to our students? Ultimately, will AI replace archaeologists? Harari (2017) argues that there is 'only a 0.7% chance'. AI can replace the monotonous tasks of daily work and carry out the large-scale analyses that precede archaeological work. However, the technology is evolving with increasing speed, and predicting its future impact on the profession, especially after the COVID-19 pandemic, is difficult.
AI deployment in the discipline needs to run alongside the development of strategies and best practice guidelines to safeguard the responsible, fair, and sustainable use of this new technology. The exploitation of human and natural resources, and the cost to the environment, need to be highlighted, and the risk of reinforcing social inequality must be considered.
Archaeology and CHM scholars are well equipped to study and deal with the societal effects of AI, as they already look at large scale influences on society, and have the theories, methods, and background for these analyses. But to do so here, they first need to understand AI methods and their implications.
In post-phenomenological ontology, humans are experiencing the world with and through technology (Gattiglia 2022; Ihde 2009). While we are at a point where machines not only assist humans (first machine revolution), but replace humans in the production or creative workflow (second machine revolution), we need to reorientate and redefine objectives. AI is here to stay, and the question will be how to use it responsibly and sustainably.
This means alignment: where does the technology work towards humanity's values and goals? Where are the dangers and risks of losing control? What are the benefits for society and humanity as a whole (not for the benefit of a few, but for the improvement of the environment, health, and society of the many)? Where does the development go from here? How can AI shape the future of the past - by increasing our understanding of the past, using the vast amount of archaeological and historical data to create material that promotes and conveys this knowledge? Where does the future of the discipline lie regarding cooperation and education?
We are at a point where archaeology and heritage practice cannot just benefit from these technological developments and advances, but must also contribute to the ethical and practical discussion of AI in human culture and society. Coming back to the initial question as to whether AI in archaeology and CHM is a blessing or a curse, we have provided examples of the advantages and beneficial applications of the technology, but have also highlighted the challenges that need to be resolved before AI can be used safely and democratically. The debate is wide open.
This article is part of an AHRC/UKRI WRoCAH-funded PhD project at the University of York. Grant reference number: AH/R012733/1.
Internet Archaeology is an open access journal based in the Department of Archaeology, University of York. Except where otherwise noted, content from this work may be used under the terms of the Creative Commons Attribution 3.0 (CC BY) Unported licence, which permits unrestricted use, distribution, and reproduction in any medium, provided that attribution to the author(s), the title of the work, the Internet Archaeology journal and the relevant URL/DOI are given.
Internet Archaeology content is preserved for the long term with the Archaeology Data Service.