Predictive modelling solely for management purposes is itself a theoretically questionable aim: many archaeologists from both areas of the discipline find highly problematic the idea that it is possible to manage archaeology without understanding it (Gaffney and van Leusen 1995). However, even if we set these concerns aside, the use of correlative predictive models may still be found to be a highly undesirable way to proceed. The main issues might be argued to be that
These claims, obviously, require some enlargement.
There are many methodological problems with the most popular statistical procedures for generating predictive models (see Woodman and Woodward 2002 for an excellent discussion), but the most serious issue is probably that most practitioners make no attempt to find out how well their models actually perform. To do so requires that the predictions of the model be compared with the archaeological resource (or at least an unbiased sample of it), and the only way to do this, of course, is to collect more archaeological data. This represents something of a 'Catch 22' for predictive modelling, because data collection is precisely the activity that most model-builders are trying to avoid. Consequently, instead of being tested against undiscovered archaeology, models are evaluated according to how well they predict their own data, and measures such as 'gain statistics' (Kvamme 1992) are offered. These are not measures of the performance of the model because, if it means anything, 'performance' must mean the extent to which the model predicts undiscovered archaeology. Instead, they are measures of the extent to which the model is internally consistent. Gain (and similar) statistics are nevertheless widely presented as measures of performance:
'Another way to assess the performance of a predictive model is to measure its gain in accuracy over a random or null classification'
(Warren and Asch 2000).
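To make the distinction concrete, the sketch below uses the gain statistic as it is commonly defined in the predictive modelling literature: one minus the ratio of the proportion of the study area flagged as high probability to the proportion of sites falling within that area. The function name and all of the figures are hypothetical and chosen only for illustration; they are not taken from any published model. The point is simply that the same model yields two different values depending on whether the sites being counted are the ones used to build it or an independent sample.

```python
def kvamme_gain(area_fraction, site_fraction):
    """Gain = 1 - (fraction of area flagged / fraction of sites captured)."""
    if site_fraction <= 0:
        raise ValueError("site_fraction must be positive")
    return 1.0 - area_fraction / site_fraction


# Hypothetical figures, for illustration only.
area_flagged = 0.30                # share of the study area predicted 'high probability'
training_sites_captured = 0.85     # share of the sites used to build the model
independent_sites_captured = 0.50  # share of sites from a new, unbiased survey

# Computed from the model's own data: a measure of internal consistency.
internal_gain = kvamme_gain(area_flagged, training_sites_captured)

# Computed from independent data: the only figure that speaks to performance.
independent_gain = kvamme_gain(area_flagged, independent_sites_captured)

print(f"gain against training data:    {internal_gain:.2f}")
print(f"gain against independent data: {independent_gain:.2f}")
```

Only the second figure can be obtained without circularity, and obtaining it requires exactly the additional fieldwork that the model was meant to avoid.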
The use of these statistics, and attempts to 'pass them off' as performance measures, also cannot hide the fact that the gain of most published predictive models is, by any rational estimation, not very good. Regression models typically produce correlation coefficients of 25-30%, or gain statistics around 60-70%. In short, models simply do not perform at a level that is very useful for either explanation or management purposes.
There is little point to developing a model that is not connected to some consequential management action and, in this respect, there are to date very few instances in which development plans or archaeological mitigations have actually been altered on the basis of a statistical prediction of archaeological characteristics. In the case of development control, there is often a need (and sometimes a legal requirement) to look for archaeology on the ground whether the model predicts archaeology or not. This, of course, provides for a strangely biased sample of the archaeological record because we are only looking for archaeological materials where development takes place. It is still probably better than the alternative, which is to actually use the model to decide how we should look for archaeological resources.
If models were actually used — in other words resource management proceeded by (i) generating a predictive model and then (ii) using it to influence where we look for undiscovered archaeology — then we would effectively have created a self-fulfilling sampling strategy. To understand why this is, we need only realise that any model that is based on the known distribution of archaeological sites is actually an embodiment of the visibility, bias and historical accidents that have formed that record. Such a model is therefore predicting the bias in the known record. Using such a model effectively means that we are systematically looking harder for undiscovered sites where we expect to find them (this is shown diagrammatically in Fig. 1). Some practitioners might argue that it is necessary to look in the places where the model does not predict archaeology as well as where it does, but it remains true that any management outcome that leads archaeologists to look harder or more frequently in those locations where the model predicts archaeology is a self-fulfilling feedback system that will lead to an increasingly unrepresentative archaeological record (Fig. 2).
Figure 1: A positive feedback in which a biased model causes archaeologists to look more closely at areas already identified as having more sites, thus reinforcing the bias in the original model
In the most extreme case, archaeologists would no longer bother to look for new archaeological sites in those locations where the model predicted zero probability of undiscovered archaeology, effectively creating a model with no potential to revise itself.
Figure 2: The feedback loop repeats itself through time, ensuring that each iteration of a predictive model leads to an even more unrepresentative database of archaeological materials
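The feedback described above can be illustrated with a deliberately simple toy simulation. Everything in it is an assumption made for the sake of the sketch: two zones ('A' and 'B') that in reality contain the same number of sites, an initial record that happens to favour zone A, a fixed detection rate, and a hypothetical allocation rule in which 80% of survey effort goes to whichever zone currently has more known sites. It is compared with a strategy that divides effort evenly, regardless of what is already known.

```python
def simulate(known, true_total, effort_for, rounds=5, detection=0.1):
    """Return zone A's share of all known sites after each survey round."""
    known = dict(known)  # copy so the caller's record is not modified
    shares = []
    for _ in range(rounds):
        effort = effort_for(known)  # effort is set from the current record
        for zone in known:
            remaining = true_total[zone] - known[zone]
            # Expected discoveries scale with effort and with what is left to find.
            known[zone] += effort[zone] * detection * remaining
        shares.append(known["A"] / sum(known.values()))
    return shares


def model_led(known, concentration=0.8):
    """Hypothetical rule: most effort goes where most sites are already known."""
    favoured = max(known, key=known.get)
    return {z: concentration if z == favoured else 1 - concentration
            for z in known}


def even_sampling(known):
    """Effort is split equally between zones, ignoring the existing record."""
    return {z: 1 / len(known) for z in known}


true_total = {"A": 1000, "B": 1000}  # reality: sites are evenly distributed
initial = {"A": 30, "B": 10}         # the known record starts out biased (75% in A)

model_shares = simulate(initial, true_total, model_led)
even_shares = simulate(initial, true_total, even_sampling)

print("model-led:", [round(s, 3) for s in model_shares])
print("even:     ", [round(s, 3) for s in even_shares])
```

Under these assumptions the model-led record remains more skewed than the record it started from, even though the real distribution is even, whereas evenly allocated effort moves the record steadily towards the true 50/50 split. Setting `concentration` to 1.0 corresponds to the extreme case in which areas of low predicted probability are never examined at all.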
Archaeology should really face up to the possibility that useful, correlative predictive modelling will never work because archaeological landscapes are too complex or, to put it another way, too interesting. It is obviously unrealistic for financial reasons to expect archaeological investigations to be done everywhere, but generating correlative models that do not work and should not be used is not the answer to the dilemma of how best to deploy scarce archaeological effort. This is undeniably a very negative conclusion to reach and it would be reasonable to expect that some more positive suggestions should accompany it: if predictive modelling is of no value in helping us address a real concern within resource management, then what is?
To answer this, we should consider the functional requirement for building a model in archaeological resource management. The reason most often cited for its use is that there are insufficient financial resources to conduct detailed archaeological work everywhere and, given this, predictive modelling is an attractive solution. However, it has been argued above that correlative predictive modelling does not actually work very well and, more significantly, will lead to an increasingly unrepresentative archaeological record. If resource management requires a methodology that does work and will lead to a more representative record, then it follows that archaeology would be better served by a focus on well-designed and properly implemented sampling strategies than by correlative predictive models.
© Internet Archaeology
URL: http://intarch.ac.uk/journal/issue15/10/dw5.html
Last updated: Wed 28 Jan 2004