Modelling medieval masonry construction: taxa-specific and habitat-contingent Bayesian techniques for the interpretation of radiocarbon data from Mortar-Entrapped Relict Limekiln Fuels

Using data from simulated and actual case studies, this paper assesses the accuracy and precision of Bayesian estimates for the constructional date of medieval masonry buildings, generated from the radiocarbon evidence returned by different assemblages of wood-charcoal mortar-entrapped relict limekiln fuel (MERLF). The results from two theoretical studies demonstrate how Bayesian model specifications can be varied to generate a chronologically continuous spectrum of distributions from radiocarbon datasets subject Inbuilt Age (IA). Further analysis suggests that the potential for these distributions to contain the date of the constructional event depends largely upon the accuracy of the latest radiocarbon determination within each dataset, while precision is predicated on dataset age range, dataset size and model specification. These theoretical studies inform revised approaches to the radiocarbon evidence emerging from six culturally important Scottish medieval masonry buildings, each of which is associated with a wood-charcoal MERLF assemblage of different botanical character. The Bayesian estimates generated from these radiocarbon datasets are remarkably consistent with the historical and archaeological evidence currently associated with these sites, while age range distributions suggest the IA of each MERLF assemblage has been constrained by the taxa-specific and environmentally contingent lifespans and post-mortem durabilities of the limekiln fuel source. These studies provide further evidence that Bayesian techniques can generate consistently accurate chronological estimates for the construction of medieval masonry buildings from MERLF radiocarbon data, whatever the ecological provenance of the limekiln fuel source. Estimate precision is contingent upon source ecology and craft technique but can be increased by a more informed approach to materials analysis and interpretation.


Introduction
Scholars across the world often face significant challenges in ascribing constructional dates to masonry buildings with enough precision to enable meaningful interdisciplinary environmental or historical discourse.
The independent evidence returned by radiocarbon analysis of mortar-entrapped relict limekiln fuel (MERLF) fragments can usefully inform these chronological ascriptions, and the sedimentary context within which these materials survive presents some valuable characteristics to facilitate interpretation of that data. Firstly, the durability of lime mortars can allow them to survive in-situ for hundreds and sometimes thousands of years in upstanding masonry contexts, even if the walls within Open Access *Correspondence: mark.thacker@stir.ac.uk University of Stirling, Stirling, UK which they were deposited have been incorporated in later buildings or the structure has become ruinous [1][2][3]. Secondly, mortar compositions are historically and environmentally contingent, and material contrasts noted through field survey and lab-based analyses can inform relative phasing interpretations, even where direct stratigraphic relationships are absent (e. g. [4,5]). And thirdly, the widespread use of wood fuel in pre-industrial limeburning practices has often resulted high concentrations of wood-charcoal MERLF fragment inclusions, and this material has well-recognised radiocarbon dating potential [6,7]. These characteristics of durability, distinctiveness, relative ubiquity, and radiocarbon dating potential are all underpinned by a rapid post-depositional mortar set, which not only allows an unequivocal archaeological association between the mortar's constituent components and the surrounding masonry fabric [8], but also precludes infiltration by other materials in later periods. It is crucial for wider interpretation that mortar phasing is correctly understood, and all radiocarbon measurements are accurate [9], but thereafter the determinations returned by all wood-charcoal relict limekiln fuel fragments are expected to calibrate to dates which are no later than the initial deposition of the masonry mortar within which they were entrapped [10][11][12][13].
The extent to which these radiocarbon determinations might calibrate to periods which pre-date the construction of the identified masonry phase will be defined by various interrelated factors. The size of the laboratory measurement error margin and the section of the atmospheric calibration curve to which the determination relates are important factors in defining chronological precision, but an allowance must also be made for any 'inbuilt age' (IA) which might separate the average age of the annual tree rings in the MERLF sample (the cells of which rapidly cease exchanging carbon with the surrounding environment after formation) from the constructional event of interest [14][15][16][17][18]. The 'bridging periods' [14] which might contribute to the IA of a sample must be considered on a context-by-context basis and might include post-limekiln factors such as lime transport, mortar maturation and building construction times. But in the absence of historic evidence for the use of pre-prepared charcoal to fuel limekilns [19], the most significant bridging periods are likely to be associated with the loss of outer tree rings from the sample during post-mortem rot, wood conversion, combustion, and/or mortar mixing: periods limited by the pre-kiln growth and storage ages of the selected wood fuels and essentially defined by ecological parameters [7].
The potential IA of every surviving MERLF sample is influenced by its taxa-specific and habitat-contingent lifespan (constraining growth age), habit (influencing trunk and branch wood availability), and post-mortem resilience (constraining storage age). These factors are closely interrelated, since higher resource environments and faster growing taxa are almost universally associated with relatively short lifespan plants characterised by low density wood with poor post-mortem resistance to wood-destroying fungi, and low resource environments and slower growing taxa are widely associated with longer lifespan plants characterised by higher density wood with better post-mortem resistance to wooddestroying fungi [20][21][22][23][24][25]. In general, therefore, an inverse correlation between the growth-rate of the wood fuel source and the potential IA of the MERLF assemblage generally pertains, and this is determined by taxon and habitat.
These relationships have implications for the archaeological resource at different ecological scales. At the broadest scale, IA potential may be determined by the class of the parent tree and the character of the surrounding biome, since gymnosperms are often characterised by extremely long lifespans [26] and boreal forests can be associated with (post-mortem) coarse woody debris many hundreds of years old [27,28]. Cold, sheltered and phytogeographically marginal environments are also associated with increased tree longevity in temperate woodlands although, with maximum tree lifespans of 3-400 years pertaining across many old-growth northern hemisphere deciduous forests [29], dendroecological meta-data suggests this is generally more limited than records of particularly (probably very slow-growing) ancient individuals might suggest. Even within this reduced range, however, inter-species contrasts in lifespan and post-mortem resilience can be considerable. In the Atlantic-influenced environments of the UK, dense and comparatively slow-growing angiosperms such as Quercus sp. mature at around 150 years old, can live to over 500 years, and demonstrate high post-mortem durability; while faster-growing shorter-lived genera such as Betula sp. generally mature at around 60 years old, rarely reach 100 years old, and are very rapidly destroyed by fungal attack after death [30][31][32]. At a finer scale, field reports suggest Betula pendula has a much longer lifespan in the colder climates of central Scotland than elsewhere in these islands [31], and yet the lifespan of 'self-coppicing' Corylus avellana stems (which generally extend to 30-50 years in England) [30], can be limited to 12-15 years on some thin western Scottish soils [33]. Consideration of woodland ecologies in the immediate locality is often useful for evaluation of IA potential, therefore, and particularly in a country such Scotland which is crossed by numerous Northern European phytogeographical boundaries [34,35].
Scale of analysis is also important for our interpretations of how the carbon might be distributed within a potential IA range. The distribution of carbon within a woodland exploited for lime-burning fuel will be determined by a complex mix of historically-contingent ecological processes, but although the IA potential of an even-aged stand will tend to increase over time as the population matures and productivity will naturally decline in later years as mortality increases, meta-data gathered from long-term monitoring projects across the globe suggests that aboveground mass growth rates (and so carbon gain) in individual trees generally increases continuously throughout their lifespans [24,[36][37][38]. Indeed, it has been reported that only 6% of old-growth forest in the western USA is comprised of trees with trunks of over 100 cm diameter, and yet this small population contributes a remarkably high '33% of the annual forest mass growth' [36]. This allometric evidence is consistent with reports that more than 50% of a tree stem's volume typically derives from the outer 30% of its annual rings [39], and its average radiocarbon age is likely to be approximately a third of its overall lifespan [10] (Fig. 1), while the destruction of old wood during trunk hollow formation [40] and replacement of older branches [41] is likely to sharpen carbon distributions in mature trees still further. Providing the assemblage is representative of the woodland source, therefore, ecological parameters suggest most wood-charcoal samples are likely to calibrate to dates which are equal to or only slightly earlier than their date of deposition [13,17,42].
Quantifying the relationship between these radiocarbon determinations and the constructional date of the building from which they were removed is particularly important in historic contexts where wider political, cultural, and environmental processes are understood with reasonably high chronological precision. That the IA of a wood-charcoal fragment is contingent on its botanical character has long been recognised, and half a century ago Tjalling Waterbolk (1971) suggested samples could be usefully divided into three groups whereby: Group A materials deriving from twigs and outermost tree rings would present a radiocarbon 'time difference' which was negligible (< c. 20 years); Group B materials deriving from short-lifespan wood species would present time differences measurable in decades (20-100 years); while Group C materials deriving from longer lifespan species might result in time differences of over a century [10. See also 11]. 1 Waterbolk (1971) also acknowledged that the archaeological context from which a sample was recovered might suggest an association with a lower IA Group, and proposed that charcoal fragments from hearths or ovens were likely to be narrowly distributed since 'firewood for daily consumption would have consisted mostly of very young wood' [10], 2 although the historiography of MERLF materials research reveals contrasting approaches to this issue. Rainer Berger's (1992; analysis of materials from various pre-Romanesque chapel sites in Ireland, for example, began from a premise that short-lived wood was deliberately selected for limeburning fuel and the calibrated radiocarbon determinations returned by single (probably bulk) MERLF samples were thereby reported as direct standalone constructional estimates [43,44]. In clear contrast, a re-interpretation of large radiocarbon datasets from masonry buildings in Africa and Asia began from a premise that storage age and re-use in individual wood-charcoal fragments were 'indeterminable' chronological factors, whose interpretation required analysis of multiple samples from each phase [17,12,45]. 3,4 Notably, this latter assertion followed an article by Patrick Ashmore (1999) which highlighted the dangers of bulking multiple charcoal fragments in a single sample and the challenges of establishing charcoal residence times on archaeological Fig. 1 Stem analysis and radiocarbon dating of annual rings from a theoretical 100-year-old tree displaying exponential growth and felled in c.1250 AD. Background image after Biondi 2020 [25] sites [18]. But in both instances highlighted above, the radiocarbon distribution of each sample assemblage is not expected to reflect the ecology of a natural woodland source (both are essentially defined by anthropological or technical processes), and there is a concomitant lack of botanical information on the character of the materials under consideration (see also [46][47][48]).
Ultimately, where the radiocarbon determination returned by a MERLF sample might be associated with significant IA, then circumscription of the chronological period within which the constructional event took place requires a comparative approach. Indeed, multidisciplinary approaches to MERLF radiocarbon data can very usefully bracket the period of building construction where historical or archaeological evidence post-dating that event is also available, since these are effective terminus post quem (TPQ) and terminus ante quem (TAQ) dates, and where the constructional date is already known with some precision then the IA of the MERLF sample or assemblage can be closely quantified [49]. Where multiple radiocarbon determinations are available, however, then Bayesian and other statistical techniques can be used to generate a comparative 'standalone' constructional estimate from the radiocarbon and phasing evidence alone -complete with upper limits which are independent of historical evidence [7]. Once again, such statistical interpretations are generally predicated on a binary distinction between short or long-lived organic materials, although that is most often defined by the distribution of the determinations within the radiocarbon dataset, rather than the botanical character of the samples within the charcoal assemblage. Where samples deposited in a single event have returned determinations so narrowly distributed that a chi-square type test suggests the dataset is statistically consistent at 5% significance, then the samples can be regarded as effectively contemporaneous, and a more precise Combine average can be generated which is assumed to directly represent their date of deposition [50]. Where determinations from a particular depositional context are not statistically consistent at 5% significance, however, then a higher level of IA must be suspected, and a different approach is required. In the OxCal calibration programme used throughout this paper, measurements from datasets subject to IA can be grouped into model phases framed by probability distributions known as Start and End Boundaries; and the position of this latter event at the end of the phase (or between phases in a multiphase scenario) may be accepted as a reasonable estimate for the completion of the constructional event [51]. Generating a Last distribution will also provide a probability estimate for the last determination within the series, however, and these measurements can be further constrained to reflect our prior belief that the dataset should be exponentially distributed-in line with allometric data. OxCal offers two main methods by which this might be achieved: a Tau Start Boundary can be selected to impose an exponential 'prior' distribution on the whole phase; or each individual determination can be tagged with an Outlier Probability linked to a separate exponentially distributed Charcoal Outlier Model [13]. The default Charcoal Outlier Model in OxCal is specified with a 1000 year time-constant to encompass the mean lifespan of an extremely long-lived assemblage, but the logarithmic scale of the model is defined by the actual distribution of the radiocarbon determinations in each dataset and, providing sufficient independent determinations are available, the lowest IA materials are expected to be steeply distributed very close to the exponential asymptote [13,17].
The assumptions which underpin these interpretive schemes and their general application for MERLF analysis are open to challenge. Ashmore's [18] thesis demonstrated that very short-lived charcoal materials retrieved from various excavated contexts have sometimes returned problematically early radiocarbon determinations, but the storage age potential of MERLF materials is likely to be more limited where wood (rather than charcoal) has routinely been used as a limekiln fuel, and long residence times are largely irrelevant in a mortared masonry context where intrusion is effectively precluded. Recent studies have reported evidence that bark evidence does occasionally survive on wood-charcoal MERLF fragments, and some of the assemblages associated with these fragments have also returned radiocarbon determinations which are statistically consistent at 5% significance [52], but historical, archaeological, and radiocarbon evidence from across northern Europe and elsewhere suggests that limekilns were often charged with mixtures of different wood taxa which can return statistically inconsistent radiocarbon datasets [7]. In an important piece of work evaluating the accuracy and precision of 'standalone' constructional estimates generated using Bayesian techniques in OxCal, Michael Dee and Christopher Bronk Ramsey (2014) concluded that the Charcoal Outlier Model approach generates the most consistently accurate End Boundary estimates from wood-charcoal datasets, whilst the exponential prior approach generated more precise but occasionally inaccurate distributions [17]. Where the number of determinations from a particular phase is more limited, however, then the Charcoal Outlier approach tends to generate very broad positively skewed End Boundary distributions, which can seem incongruous against the comparatively low mean lifespans of most temperate woodland taxa. The TPQ role performed by wood-charcoal MERLF radiocarbon determinations can be of huge value for multidisciplinary interpretation and the effect of model selection on upper limits may be much less important where an early and convincing TAQ is available to truncate these distributions ( Fig. 2a-f ). Standalone estimate precision, however, is vital for increased interdisciplinary (rather than multidisciplinary) discourse. And while there is also some evidence that reducing the exponential time-constant of the Charcoal Outlier Model to reflect the more limited source material lifespans can constrain End Boundary distributions [49], persistent contrasts in precision with estimates using the exponential prior approach and binary approaches to statistical consistency do not appear to reflect a continuous spectrum of potential IAs predicated on variation in woodland ecologies. Importantly for this paper, however, the accuracy and precision of constructional estimates generated using different Bayesian approaches can also be evaluated using simulated datasets in theoretical models, without initial reference to architectural or historical evidence.
This paper describes a re-evaluation of these Bayesian frameworks, with a concern to further characterise statistical relationships which might pertain between the different limekiln fuel resources exploited during the construction of medieval masonry buildings, and the archaeological potential of any surviving MERLF materials. Following the approach developed by Dee and Bronk Ramsey [17], this work is predicated on Bayesian analysis of simulated and actual radiocarbon datasets subject to varying levels of IA, although a broader range of simulated single-phase datasets, model specifications and generated estimates are considered here. Indeed, two theoretical studies centred on a single date in the medieval period will demonstrate how a chronologically continuous spectrum of distributions can be generated from radiocarbon datasets of varying lifespans and sizes, and different Bayesian model specifications can therefore be employed to maximise constructional estimate precision whilst retaining accuracy. These theoretical results will then inform the modelling approaches applied to the published radiocarbon datasets from six culturally important Scottish medieval buildings (CS1-6), each of which is associated with a MERLF assemblage of contrasting botanical character. Highlighting remarkable levels of consistency between the radiocarbon, historical and ecological data, the paper will conclude that the age range of an in-situ MERLF assemblage does often appear to be constrained by the taxa-specific and environmentally contingent lifespans and post-mortem durabilities of the limekiln fuel source.

Method
The methodologies presented below detail the processes by which data were generated in two theoretical studies and six case studies. Following previous authors [17], the theoretical studies are essentially circular and pragmatic. Exponentially distributed sets of calibrated radiocarbon dates associated with different IA mean lifespans (hereafter IAτ) have been simulated from a particular 'true event' calendar date, and these datasets have then been constrained within various single-phase Bayesian models to assess how the accuracy and precision of newly generated distributions are affected by model specification. In contrast to that previous work, however, a calendar date of 1250 AD has been selected for the true event in each theoretical study, since this correlates with a relatively monotonic section of the radiocarbon calibration curve and occupies a central chronological position relative to the case studies presented later in the paper. The error margin of ± 35 years on each simulated date was also selected to more closely reflect the data associated with these case studies. An increased range of dataset IAτ and model specifications has also been employed, while End Boundary and Last distributions are evaluated for constructional estimate accuracy and precision. All datasets and models have been generated using OxCal 4.4 [51] and are calibrated with the IntCal20 atmospheric calibration curve [53].

Theoretical studies
In Theoretical Study 1 (TS1), multiple sets of twenty simulated calibrated dates subject to varying levels of IA were generated from a theoretical true event date of 700 BP ± 35 years (1250 AD). Twenty independent datasets were generated in TS1, with four separate datasets each subject to IAτ specified at 10, 50, 100, 200 and 500 years. The number of dates in each dataset which included the true event date has been counted, dataset age ranges have been estimated using the OxCal Difference function to compare the earliest and latest simulated dates [51], and the actual mean lifespan of each dataset has been calculated by finding the sum of the mean values from each individual simulated date. Each dataset has then been situated in a single-phase Bayesian model framed by Start and End Boundaries and subject to a range of different specifications. Run at default Oxcal settings to allow a reasonably fast turnaround of results, these include: (i) a Combine model; (ii) an exponential prior/no outlier model; (iii) an exponential prior/modified Charcoal Outlier Model (with a time-constant modified to the same scale as the IAτ specified for the simulated dataset); (iv) an exponential prior/default Charcoal Outlier Model; (v) a uniform prior/no outlier model; (vi) a uniform prior/

Fig. 2 a-f
Comparative End Boundary distributions generated from single-phase MERLF radiocarbon data associated with a recent study of Achanduin Castle [49] highlighting the effect of different outlier models on standalone and multidisciplinary constructional estimate precision. In standalone model 1a, an exponential prior was applied, and all three available radiocarbon determinations were tagged with a 5% Outlier Probability in the default OxCal General Outlier Model (subfigure a); in standalone model 2a, a uniform prior was applied, and all three available radiocarbon determinations were tagged with a 100% Outlier Probability in a Charcoal Outlier Model with a time-constant modified to 100 years (subfigure c); and in standalone model 3a, a uniform prior was applied, and all three radiocarbon determinations were tagged with a 100% Outlier Probability in the default Charcoal Outlier Model (subfigure e). Multidisciplinary models 1b, 2b and 3b are a development from 1a, 2a and 3a, and each includes a 1310 AD documentary TAQ (subfigures b, d and f). All models have been updated using OxCal v4.4 [51] and the IntCal20 atmospheric curve [53], with results rounded out to 5 years. X axis scales in these plots have been standardised to facilitate visual comparison, but note also the different y-axis probability density scales, as well as the contrasting End Boundary distributions and medians modified Charcoal Outlier Model (with a time-constant modified to the same scale as the specified IAτ of the simulated dataset); and (vii) a uniform prior/default Charcoal Outlier Model. In the third and fourth run of each dataset/modelling approach combination, a Last distribution was also generated. All models associated with TS1 are presented in Additional file 1.
In Theoretical Study 2 (TS2), multiple simulated datasets of varying size and subject to varying levels of IA were generated from a theoretical true event date of 700BP ± 35 years. Forty-five independent datasets were generated in TS2 with three separate datasets associated with an IAτ specified to 10, 50, 100, 200 and 500 years and including fifteen, ten and five simulated dates. As in TS1, the number of dates in each TS2 dataset which included the true event date has been counted, the age range has been estimated using the OxCal Difference function to compare the earliest and latest simulated dates, and the actual mean lifespan of each dataset has been calculated by finding the sum of the mean values from each simulated date. These datasets were included in single-phase models framed by Start and End Boundaries and subject to three model specifications including: (i) an exponential prior/no outlier model, (ii) an exponential prior/modified Charcoal Outlier Model (with a time-constant modified to match the IAτ specified for the simulated dataset), and (iii) a uniform prior/default Charcoal Outlier Model. All these TS2 models have been run at default settings to allow a reasonably fast turnaround of results, and all include a Last distribution. All models associated with TS2 are presented in Additional file 2.
In Case Sudies 1-6 (CS1-6), the radiocarbon data from six Scottish medieval buildings with wood-charcoal MERLF assemblages comprised of contrasting taxa are re-evaluated. This includes Castle Fincharn main block (CS1), Aros Castle north-west block (CS2), Castle Roy enclosure and tower (CS3), Lochindorb Castle primary enclosure (CS4), Achanduin castle enclosure and hall (CS5), and Lismore Cathedral nave (CS6). The MERLF assemblages associated with the first five of these studies are dominated by charcoal fragments which displayed no surviving terminal ring, bark, or sapwood boundary evidence, and each has been published elsewhere in some detail. Full details of CS6 await publication but is included here since the MERLF assemblage included a Corylus sp. fragment with some terminal ring evidence. The distributions of each of these radiocarbon datasets was investigated using the Ward and Wilson (1978) chisquare type test [50], and an age range was calculated using the Difference function to compare the earliest and latest dates available [51]. The data from each site were then included in a series of single-phase Bayesian models. Run at 1 year resolution and 20,000 Kiterations, these include: a Combine model; an exponential prior/no outlier model; an exponential prior/modified Charcoal Outlier Model; an exponential prior/default Charcoal Outlier Model; and a uniform prior/default Charcoal Outlier Model. The modified Charcoal Outlier Model IA timeconstants specified in CS1-6 have been estimated from published data regarding tree mean lifespan and wood post-mortem resilience data (Table 1). Mean lifespans are rarely reported so working values have been calculated at 33% of reported maximum lifespans, in line with allometric data, added to which an estimate of post-mortem resilience has been derived from the resistance to wooddestroying fungi according to the (1-5) durability scale applied by British and European Standards [54] and other published reports (Table 1). Where the datasets returned by mixed-taxa assemblages are statistically consistent at 5% significance (CS5) then these are tagged to a single Charcoal Outlier Model with a time-constant modified to reflect the lowest IAτ samples, and where these datasets are not statistically consistent at 5% significance (CS2 and CS6) then the highest IAτ data is used. A Last distribution has been generated in all case study models, and the Last and End Boundary distributions compared with various potential TPQ and TAQ dates (from other types of historical, archaeological or architectural evidence) using the Order function [51]. All models associated with these case studies are presented in Additional file 3. In the interest of brevity, presentation of wider evidence relating to these buildings is kept to a minimum, and readers are encouraged to follow the cited references for more detailed information.
Calibrated date ranges in each theoretical and case study radiocarbon dataset are expressed as cal AD or cal BC at 95% and 68% confidence using upright text and have been rounded out to 10 years [12]. 5 Modelled age range, End Boundary, and Last distributions are reported as Highest Posterior Density (HPD) interval date ranges at both 68% and 95% probability with median values, and these estimates have been rounded out to 5 years and are presented in italics. Generated date ranges in both theoretical studies are regarded as accurate when they include the true event date from which each simulated dataset was generated (i.e. 1250 cal AD). The agreement indices returned by each model are considered [51,55], but individual measurements which fall below the accepted 60% threshold of compatibility have not been removed from theoretical models since to do so would bias results [56].

Theoretical Study 1 (TS1)
The twenty simulated datasets generated in TS1 are all exponentially distributed (e.g. Figure 3). There is relatively little variation between the latest simulated date in each dataset, and all include the 1250 AD true event at 95% confidence (Table 2). Increased dataset IAτ is associated with some decrease in latest date age, however, and with a decrease in the number of dates containing the true event at 95% confidence (from all 20 dates in 10 years IAτ dataset M1a run 1, to only a single date in 500 years IAτ dataset M1e run 4). An average of 19 simulated dates include the true event in 10 year IAτ datasets, 13 in 50 year IAτ datasets, 11.5 in 100 year IAτ datasets, 5 in 200 year IAτ datasets (M1d), and 2.5 in 500 year IAτ datasets.
Age range variation in the TS1 datasets is mostly predicated on variation in the earliest simulated dates within each dataset (Table 2). Increased IAτ is generally associated with an increase in earliest date age, an increase in date range, and an increase in date range variation. The earliest calibrated dates in all the datasets specified with 10 years IAτ are situated in the second millennium AD and age range variation in this group is limited to between -35-225 years (M1a run 1) and 20-250 years (M1a run 2). This 10 years IAτ group includes the only dataset in TS1 which falls into minus values, and this is the same dataset in which all simulated dates include the 1250 AD true event. In contrast, all earliest dates in the datasets specified with 500 years IAτ are situated in the cal BC period, with age range variation between 1315-1640 years (M1e run 3) and 3205-3465 years (M1e run 1). Dataset age ranges in these two lowest and highest IAτ groups are distinctive, and the datasets throughout the study are generally consistent with this trend, but there is considerable overlap between individual datasets in adjacent 50, 100 and 200 years IAτ groups.
All dataset mean lifespans are earlier than the 1250 AD true event date in TS1 and these values also generally increase in age and variation with increased IAτ specification ( Table 2). The mean lifespans of the datasets specified with 10 years IAτ are narrowly distributed between 1224 and 1242 cal AD, whilst the datasets specified with 500 years IAτ present mean lifespans situated in the first millennium AD between 491 and 780 cal AD. The average lifespans within each group are all consistent with this trend and close to expected values-1229 cal AD (10 years IAτ specified), 1183 cal AD (50 years IAτ specified), 1156 cal AD (100 years IAτ specified), 1039 cal AD (200 years IAτ specified), and 670 cal AD (500 years IAτ specified)-although there is some overlap in the lifespans of individual datasets in adjacent groups specified to 50, 100 and 200 years IAτ.
There is a clear relationship between the IAτ specified, the distribution of simulated dates, and the date range of each dataset generated in TS1. All four datasets in the 10 years IAτ group (M1a runs 1-4) pass the Ward and Wilson (1978) chi-square type test and have generated accurate Combined dates, although three of these models contain four individual dates with low agreement indices  3 Selected multiple plots from TS1, illustrating exponentially distributed simulated datasets with specified IAτ of 10 years and 500 years. Subfigure a presents distribution plots from a dataset specified with 10 years IAτ (M1a run 2) with simulated dates ranging from 1040-1220 cal AD (M3a) to 1220-1380 cal AD (M4a) at 95% confidence, nineteen of which include the true event date of 1250 AD at 95% confidence. Subfigure b presents distribution plots from a dataset specified with 500 years IAτ (M1e run 1), with simulated dates ranging from 2270 to 1980 cal BC (M14e) to 1160-1280 cal AD (M12e) at 95% confidence, two of which include the 1250 AD true event date at 95% confidence. Small circles represent the mean average of each distribution (Ai ≤ 60%) and present low overall Combined Agreement Indices (Acomb) ( Table 3). The exception is the 10 years IAτ dataset in which all 20 simulated dates include the true event date (M1a run 1), which has a mean lifespan of 1242 cal AD (the highest in the study and very close to the 1250 AD true event) and has only returned one low Ai. All sixteen simulated datasets specified with 50, 100, 200 and 500 years IAτ fail the Ward and Wilson (1978) test. The four datasets specified to 50 years IAτ (M1b runs 1-4) and one specified to 100 years IAτ (M1c run 1) have generated Combined date ranges, although these are all too early. The three remaining models associated with datasets specified to 100 years IAτ (M1c runs 2-4) and all models specified with 200 and 500 years IAτ datasets have failed to generate a Combined distribution at all.
Out-with the Combine models, all TS1 models except one present overall agreement indices which are above the 60% threshold ( Table 4). The exception is associated with a uniform prior/no outlier modelling approach to a 50-year IAτ dataset (M1b, run 4), which presents an Overall Agreement Index of 58.9%. The number of individual dates with low Agreement Indices decreases with increasing IAτ, and with models employing an exponential prior specification. All three uniform prior approaches generate higher numbers of individual low Agreement Index distributions than their exponential counterparts from 10 and 50 years IAτ datasets. The uniform prior/no outlier approach contains the most low Agreement Index distributions in TS1, with a maximum of three presented from a 10 year IAτ dataset (M1a run 3). Individual low Agreement Indices are limited to a single dates in all exponential prior modelling approaches ( Table 4).
98% of all TS1 models (118/120) have generated End Boundary HPD intervals which are accurate at 95% probability, and 85% (102/120) are also accurate at 68% probability ( Table 5). Consistency of End Boundary accuracy at 68% probability is inversely proportional to dataset IAτ; decreasing from 96% of models associated with datasets specified with 10 years IAτ to 58% of models associated with datasets specified with 500 years IAτ. Consistency of End Boundary accuracy also varies with model specification and a broad correlation with prior distribution Notably, all models employing an exponential prior have generated accurate End Boundary HPD intervals at both 95% and 68% probability from all datasets specified with 10, 50 and 100 years IAτ (Table 5). Inaccurate End Boundary distributions at 68% probability in TS1 can be either earlier or later than the true event ( Table 4). All inaccurate estimates generated by uniform prior models are late, including both of those which are inaccurate at 95% probability, whilst exponential prior models have generated End Boundary HPD intervals which are both too early and too late at 68% probability. The exponential prior/no outlier modelling approach to 50 years IAτ and 200 years IAτ datasets are the only two modelling-dataset combinations in TS1 which have generated End Boundary average medians that are earlier than the true event date of 1250 AD.
All Last HPD intervals generated in TS1 are accurate at 95% probability (Table 6), although this would not have been the case had Last distributions been generated in runs 1 and 2. Last HPD interval accuracy at 68% probability is inversely proportional to the specified dataset IAτ; and falls sharply from 100% accuracy in models associated with 10 years and 50 years IAτ datasets, to 58% accuracy in those generated from datasets specified to 500 years IAτ. Most model specifications have generated accurate Last HPD intervals at 68% probability from 90% of all datasets. This is reduced to 80% for the uniform prior/default Charcoal Outlier Model approach, although this difference relates to one extra inaccurate date only.
End Boundary precision in TS1 is inversely proportional to specified dataset IAτ for all modelling approaches (Table 7). Overall, End Boundary median   age is also inversely proportional to dataset IAτ across the study, although that trend is much clearer in models with uniform priors (Table 8). End Boundary precision and median age also vary with model specification; such that the imposition of an exponential prior distribution generally increases relative End Boundary precision and   (Tables 7 and 8). These effects are cumulative; such that variation in model specification has a much more significant on End Boundary precision and median age when associated with higher IAτ datasets. Last HPD interval precision in TS1 is also inversely proportional to dataset IAτ for all modelling approaches (Table 9). Last precision also varies according to model 10 years   (Table 10).

Theoretical Study 2 (TS2)
Most TS2 datasets present convincingly exponential distributions, and 91% (41/45) contain at least one simulated date which includes the true event (Table 11). Increased IAτ specification in this study has generally resulted in datasets with earlier latest dates, a decreased number of dates which include the true event, increased age ranges, increased age range variation, earlier earliest dates, and earlier mean lifespans. No latest dates in TS2 are later than the true event at 95% probability. Decreased dataset size has also resulted in earlier latest dates, however, and at very high IAτ this can result in no true dates at all. There is no clear correlation between dataset size and percentage of true dates or mean lifespans for a given IAτ across the study, but earliest dates tend to decrease in age with decreased dataset size and thereby age range and age variation also decrease (since this is mostly driven by the early dates) (Fig. 4a-c). The inverse correlation between dataset IAτ and the age of the latest date is exaggerated by dataset size. There is a strong correlation in TS2 between the specified IAτ and the number of accurate dates in each dataset, with 94-98% of all determinations including the true event date at 95% confidence in datasets specified with 10 years IAτ, and 67% (6/9) of these 10 years IAτ datasets are completely dominated by such accurate simulated dates in all three dataset sizes. At the other end of the IAτ spectrum, only 6-13% of datasets specified with 500 years IAτ are dates which include the true event at    Age range and age range variation across TS2 generally increases with increased IAτ, and with dataset size within each IAτ group (Table 11). Narrowly distributed averages of − 5-225 years (MRRR1a) to 10-240 years (MR1a) are presented in datasets specified to 10 years IAτ, and much more widely distributed averages of 1040-1350 years (MRRR1e) to 2225-2490 years (MR1e) are presented in datasets specified to 500 years IAτ. The average age ranges presented by the 50, 100 and 200 years IAτ datasets are consistent with this trend, although there is some overlap between individual datasets from all these adjacent groups. The 5 date 100 years IAτ group is particularly notable since this contains two datasets with extraordinarily narrow age ranges of − 10-225 years (MRRR1c run 2) and 10-225 years (MRRR1c run 1), but also extends up to 450-625 years (MR1c run 2). The narrowest age ranges are presented by 5 date datasets in four of the five IAτ groups, whilst the widest age ranges are presented by 15 date datasets in three of the five IAτ groups. Age range variation is mostly predicated on the earliest simulated date in each dataset, which generally increase in age with increased IAτ specification and dataset size; varying from 1053-1273 cal AD in the 10 years IAτ dataset (MRR1a, run 2), to 1416-1224 cal BC in the 500 years IAτ dataset (MR1e, run 2). Where datasets include very low numbers of dates, and few which include the true event date, then the exponential distribution becomes less visible and more sigmoidal distributions are apparent (Fig. 4a-c).
Mean lifespans are all earlier than the true event date and generally increase in age with the increased IAτ specification and age range in TS2 (Table 11). Without rounding out these are all close to expected values. Depending on dataset size, the study presents average mean lifespans of: 1211-1231 cal AD in the group specified with 10 years IAτ (average 1223 rather than 1240 AD); 1185-1199 cal AD in the group specified with 50 years IAτ (1191 rather than 1200 AD); 1136-1150 cal AD in the group specified with 100 years IAτ (1141 rather than 1150 AD); 1000-1076 cal AD in the group specified with 200 years IAτ (1041 rather than 1050 AD); and 699-790 cal AD in the group specified with 500 years IAτ (717 rather than 750 AD). Average mean lifespans for each level of specified IAτ are all distinct from adjacent groups although, apart from the 500 years IAτ, there is some overlap between All TS2 models have returned Overall Agreement Indices that are above the 60% threshold (Table 12). 12% (16/135) of these models contain at least one simulated date with a low Agreement Index (Ai) and 4% (6/135) contain more than one such date. Individual dates with low Agreement Indices are more strongly associated with lower IAτ and larger datasets, with no instances in the 500 years IAτ or 5 date dataset groups. All six models associated with more than a single low Ai date are associated with two 15 date datasets specified with 10 years IAτ (MR1a, runs 2 and 3, all three model specifications).
99% of all models in TS2 (133/135) have generated accurate End Boundary HPD intervals at 95% probability, and 84% (113/135) are also accurate at 68% probability (Table 13). Consistency of End Boundary accuracy varies across the study with dataset IAτ, model specification, and dataset size although these relationships are not straightforward. The lowest IAτ datasets have produced the most consistently accurate estimates at both 95% and 68%, but there is no clear overall trend in the relationship between these variables in the rest of the study: all End Boundary HPD intervals generated from datasets specified with 10 years IAτ are accurate at both 68% and 95% probability; all models associated with the 50 year IAτ datasets are also accurate at 95% probability, but only 7O% of these estimates are accurate at 68% probability; 78% of models associated with the 100 years and 200 years IAτ groups have returned accurate estimates at 68% probability, but accuracy at 95% probability is down to 96% in both cases; while the 500 years IAτ group has returned 100% accuracy at 95% probability and 93% accuracy at 68% probability. The exponential prior/modified Charcoal Outlier Model approach has presented the most consistently accurate End Boundary HPD intervals across the study, with all models (45/45) generating accurate estimates at 95% probability and 87% (39/45) generating accurate estimates at 68% probability; the uniform prior/default Charcoal Outlier Model approach is also 100% accurate at 95% probability, and 80% (36/45) of these models are accurate at 68% probability; the exponential prior/ no outlier approach has presented accurate estimates in 84% of models at 68% probability but is the only approach to present inaccurate End Boundary HPD intervals at 95% probability. Overall consistency of End Boundary accuracy in TS2 is inversely proportional to dataset size: 100% (45/45) of all modelled End Boundaries associated with 5 date datasets are accurate at 95%, and 89% (40/45) are also accurate at 68% probability; 98% (44/45) of all modelled End Boundaries associated with 10 date datasets are also accurate at 95%, and 84% (38/45) are also accurate at 68% probability; and 98% (44/45) of all modelled End Boundaries associated with 15 date datasets are accurate at 95%, and 78% (35/45) are also accurate at 68% probability.
95% of all models in TS2 (128/135) have generated accurate Last distributions at 95% probability, and 81% (109/135) are also accurate at 68% probability (Table 13). Consistency of Last distribution accuracy varies across the study with dataset IAτ, model specification, and dataset size although these relationships are not straightforward. The lowest IAτ datasets have produced the most consistently accurate estimates at both 95% and 68%, but there is no clear overall trend in the relationship between these variables in the rest of the study. All Last distributions generated from datasets specified with 10 years.
IAτ are accurate at both 68% and 95% probability; all models associated with the 50-year and 100-year IAτ datasets are also accurate at 95% probability, but accuracy at 68% probability decreases to 81% and 89% respectively. 81% of the 200 years IAτ groups have returned accurate estimates at 95% probability, and 75% of these distributions are also accurate at 68%. 93% of the Last distributions generated from the 500-year IAτ datasets are accurate at 95% probability, but only 59% of this group remains accurate at 68%. The uniform prior/ default Charcoal Outlier Model approach has generated the most consistently accurate Last distributions across TS2 and is the only model specification to generate accurate Last distributions for all (45/45) TS2 datasets at 95% probability. 87% (39/45) of these estimates are also accurate at 68% probability. Consistent accuracy is slightly lower for distributions generated using the exponential prior/modified Charcoal Outlier Model approach, with 96% of Last distributions accurate at 95% probability and 82% (37/45) at 68% probability; whilst the exponential prior/no Outlier approach has generated the least consistently accurate Last distributions overall, with 82% (37/45) of models at 95% probability and 73% (33/45) at 68% probability. In direct contrast with the End Boundary data, overall consistency of Last Distribution accuracy in TS2 is proportional to dataset size: with 98% (44/45) of all generated Last distributions generated from 15 date datasets are accurate at 95%, and 78% (35/45) also accurate at 68% probability; and 96% (43/45) of all modelled End Boundaries associated with 10 date datasets are accurate at 95%, and 78% (35/45) are also accurate at 68% probability; and 91% (41/45) of all modelled End Boundaries associated with 5 date datasets are accurate at 95%, and 80% (36/45) are also accurate at 68% probability.
End Boundary precision in TS2 relates strongly to dataset IAτ, dataset size and model specification (Table 14). End Boundary precision is inversely proportional to dataset IAτ, and the continuous spectra presented by all      6/9 (9/9) 7/9 (9/9) 59% 93% 8/9 (9/9) 9/9 (9/9) three dataset sizes for this parameter is notable. In the 15 date models there is a continuous spectrum in End Boundary distributions at 68% probability from 33 years (10 years IAτ) to 277 years (500 years IAτ), and at 95% probability from 53 years (10 years IAτ) to 627 years (500 years IAτ). In the 5 date models there is a continuous spectrum at 68% probability from 52 years (10 years IAτ) to 537 years (500 years IAτ), and at 95% probability from 113 years (10 years IAτ) to 1593 years (500 years IAτ). End Boundary precision in TS2 is proportional dataset size; and reducing datasets from 15 to 5 dates more than doubles the age range of End Boundary HPD intervals in all models at 95% probability (whatever the specified dataset IAτ), although the contrast between 10 and 5 date datasets accounts for much of this increase. End Boundary precision also varies according to model specification; with the exponential prior/no outlier approach almost invariably presenting the most precise End Boundary HPD intervals at both 68% and 95% probabilities (the exception here being the models generated from 10-year IAτ datasets) and the uniform prior/default Charcoal Outlier Model approach has always presented the least precise End Boundary date ranges from each dataset. The exponential prior/modified Charcoal Outlier Model approach is generally situated between these two ranges, but closer to the other more precise exponential modelling approach. Each of these effects is cumulative, such that the effects of decreased dataset size and broader model specification on End Boundary precision increases with dataset IAτ. Last distribution precision in TS2 relates strongly to dataset IAτ, dataset size and model specification (Table 14). Last distribution precision is inversely proportional to dataset IAτ (generally at least doubling between 10-year and 200-year IAτ datasets at both 68% and 95% probability), and directly proportional to dataset size (Last distributions generated from 5 date datasets are generally 1.5 times broader than 15 date datasets). Again, the difference in Last distribution precision between 10 and 5 dates is considerable and accounts for much of this overall contrast. In general, exponential prior approaches are associated with increased relative precision and Charcoal Outlier Model approaches with decreased precision. All three of these parameters have a cumulative effect on Last distribution precision, such that the 15 date 10 years IAτ datasets modelled using the exponential prior/ no outlier approach have generated Last distributions with an average precision of 50 years at 95% probability and 30 years at 68% probability; whilst the 5 date 500 years IAτ datasets modelled with using the uniform prior/default Charcoal Outlier Model approach have  (Table 15). These median values relate directly to model specification; whereby exponential prior distributions are generally associated with higher median ages and the Charcoal Outlier Model approaches with lower median ages. Last median values generally increase in age with reduced dataset size, however, and (although more complex) with increasing dataset IAτ. This is clearest in the smaller datasets.

Case Study 1 (CS1)-Castle Fincharn main block.
Documentary evidence suggests some kind of secular building was constructed in the Mid-Argyll settlement of Fincharn between 1240 and 1296 AD, or more certainly 1240-1308 AD [7,66,67]. Castle Fincharn has never been excavated, however, and architectural interpretations of the upstanding but fragmentary 2-3 storey structure surviving on the site have varied from the 13 th to the sixteenth century. An assemblage of MERLF fragments removed from this building included Quercus sp., Betula sp. and Corylus sp., consistent with regional vegetational histories, and radiocarbon analysis of five widely spaced single entity Corylus sp. samples returned determinations which calibrate to dates ranging from 1050-1270 cal AD (95% confidence; SUERC-54793) to 1220-1380 cal AD (95% confidence; SUERC-54796) ( Table 16; Fig. 5).
This 5 date MERLF radiocarbon dataset is statistically consistent at 5% significance level (T′ = 3.5, T′(5%) = 9.5, ν = 4); generating a Combine distribution of 1220-1275 cal AD (95% probability; Fincharn Castle; Additional file 3: Sect. 3.2) and an age range Difference of − 40 to 220 years (95% probability; Fincharn Range; Additional file 3: Sect. 3.1) ( Table 16). The Last and End Boundary distributions generated from the dataset range vary between 1230-1285 cal AD (95% probability) probably  6). There is little variation in lower limits of all these End Boundary and Last ranges but decreased upper limit ages in End Boundary HPD intervals generated by models which include Charcoal Outlier Models at 95% probability reduces precision and consistency with historical evidence. All Last ranges are more precise than the latest calibrated date (SUERC-54796) and, with an estimate-TPQ/TAQ probability sum of 195%, the Last distribution generated by the exponential prior/modified Charcoal Outlier Model approach is the most consistent with currently available historical evidence (Table 17; Fig. 6).

Case Study 2 (CS2)-Aros Castle north-west block
Documentary evidence suggests some kind of castle building was constructed on the Dun Aros site before 1385, and current art-historical typologies suggest the bar traceried arcuate windows in the upstanding 2-3 storey north-west block first emerged in Scotland at Elgin Cathedral after 1270 AD [68]. The site has never been excavated and the relationship between the north-west-block and the adjacent enclosure is currently unknown, but architectural comparanda also suggests a late 13th-14th date for this former building is likely. An assemblage of MERLF samples removed from this structure during a wider programme of buildings and materials analysis included fragments of Betula sp., Corylus sp., Fraxinus sp. and Quercus sp. These taxa are consistent with regional vegetational histories, and radiocarbon analysis of five widely spaced single entity Betula and Corylus samples returned determinations which calibrate to dates      Fig. 8).
All estimates generated from this dataset are consistent with the art-historical, architectural, and documentary evidence relating to the building and site, although the Combine distribution is very early. Variation in the lower limit of all other generated distributions is limited to five years at 95% probability (1295-1300 cal AD), and hence all are 100% after TPQ. Decreases in the upper limit age (and median) of these Last and End Boundary ranges decreases precision and consistency with historical evidence. With an estimate-TPQ/TAQ probability sum of 193%, the Last distribution generated using the exponential prior/no Outlier approach is most consistent with the available art-historical and historical evidence, presenting a date range very similar to latest dataset date (SUERC-62566; 1290-1410 cal AD at 95% probability) ( Table 19; Fig. 8).

Case Study 3 (CS3)-Castle Roy enclosure and tower
The Speyside lordship of Abernethy emerges into the surviving documentary record in 1226 AD and the castle enclosure currently surviving on the site of the Castle of Abernethy (now known as Castle Roy) displays an arcuate entrance which is unlikely to have been constructed before 1150 AD [69]. Excavation suggests this substantially upstanding masonry structure is the earliest building on the site, and an extensive assemblage of in-situ MERLF fragments removed from the upstanding castle enclosure included Quercus sp., Betula sp. and Pinus sp. This is broadly consistent with the vegetational history of the region and five widely distributed single entity fragments of Betula and Pinus returned radiocarbon determinations which calibrate to dates ranging between 990 and1160 cal AD (95% confidence; SUERC-75745) and 1040-1260 cal AD (95% confidence; SUERC-75746) ( Table 20; Fig. 9).
The extreme upper end of the Combine distribution is consistent with the architectural evidence but is very early. Variation in the lower limits of the generated End Boundary and Last distributions is limited to five years between 1050 and 1055 cal AD at 95% probability, and all End Boundary and Last distributions are consistent with the available architectural and historical evidence relating to the building and site.  With an estimate-TPQ/TAQ probability sum of 173%, the Last distribution generated using the exponential prior/no outlier approach is most consistent with this evidence, presenting a date range very similar to latest date (SUERC-75746) at 68% probability, but considerably more precise at 95% probability (Table 21; Fig. 10).

Case Study 4 (CS4)-Lochindorb Castle enclosure
A very narrow 1258-1279 AD constructional date has been widely accepted for initial construction of the upland Moray castle of Lochindorb on the basis that the building was constructed by John Comyn before a 1279 reference to 'Robert of Lochindorb' [7,   70]. A limited assemblage of in-situ MERLF samples removed from the earliest upstanding phase of the enclosure was completely dominated by fragments of Quercus sp., although this genus is not consistent with relict semi-natural woodland populations of Pinus-Betula surviving locally. Five widely spaced single entity Quercus sp. samples with no terminal ring evidence were selected from this phase for radiocarbon analysis, and these returned a wide distribution of calibrated dates ranging between 550 and 380 cal BC (95% confidence; SUERC-75752) and 1160-1270 cal AD (95% confidence; SUERC-75747) (Table 22; Fig. 11). This 5 date dataset is not statistically consistent at 5% significance (T′ = 2014, T′(5%) = 9.5, ν = 4), generating an age range of 1555 to 1925 years (95% probability; Lochindorb Range; Additional file 3: Sect. 3.19) and failing to generate a Combine distribution (Additional file 3: Sect. 3.20) (Table 22). The Last and End Boundary distributions generated from this dataset range between 1175-1270 cal AD (95% probability) probably 1195-1260 cal AD (68% probability; Lochindorb Lowest IA MERLF 1; Additional file 3: Sect. 3.21), and 1200-3010 cal AD (95% probability) probably 1235-1890 cal AD (68% probability; Construction Lochindorb Castle 4; Additional file 3: Sect. 3.24) (Table 23). This includes an exponential prior/modified Charcoal Outlier Model specified with a timeconstant of 300 years (Additional file 3: Sect. 3.22) consistent with the Quercus sp. dominated character of the MERLF assemblage (Table 1).
All generated Last and End Boundary estimates are consistent with the available historical evidence. With an estimate-TPQ/TAQ probability sum of 114%, the Last distribution generated using the exponential prior/modified Charcoal Outlier Model approach is the most consistent with this other evidence, and this distribution is later and somewhat broader than the latest dataset date (SUERC-75747) ( Table 23; Fig. 12).

Case Study 5 (CS5)-Achanduin Castle enclosure and hall
Surviving charter evidence suggests the upstanding castle at Achanduin on Lismore was constructed between 1240 and 1310 AD, whilst a Balliol coin recovered during excavation beneath the castle courtyard has been highlighted to suggest this constructional period may be constrained to a very narrow 1292-1310 AD period [49,71]. A very limited assemblage of in-situ MERLF fragments removed from the upstanding essentially single-phase building was comprised of Quercus sp. and Betula sp., consistent with regional vegetational histories, and radiocarbon analysis of one Quercus and two Betula fragments returned determinations which calibrate to between 1180-1290 cal AD (SUERC-62547) and 1260-1390 cal AD (SUERC-62546) at 95% confidence (Table 24; Fig. 13).
This 3 date dataset is statistically consistent at the 5% significance level (T′ = 4.0, T′(5%) = 6.0, ν = 2), generating a Combine distribution of 1265-1295 cal AD (95% probability; Achanduin Castle; Additional file 3: Sect. 3.26) and an age range of 0 to 155 years (95% probability; Achanduin Range; Additional file 3: Sect. 3.25) ( Table 24). The Last and End Boundary distributions generated from the dataset range between 1270-1385 cal AD (95% probability) probably 1275-1305 cal AD (68% probability; Achanduin Lowest IA MERLF 1; Additional file 3: Sect. 3.27), and 1275-1820 cal AD (95% probability) probably 1280-1450 cal AD (68% probability; Achanduin Castle Construction Completed 4; Additional file 3: Sect. 3.30) (Table 25), and this includes an exponential prior/modified Charcoal Outlier Model specified   Table 1). The generated Combine distribution is not consistent with the archaeological evidence at 68% probability. All Last and End Boundary distributions generated are consistent with the available archaeological and historical evidence, with lower limits varying from 1265 to 1275 cal AD and precision and median age decreasing with uniform prior and Charcoal Outlier Model specifications (Table 25). With an estimate-TPQ/TAQ probability sum of 141%, the Last distribution generated using the exponential prior/modified Charcoal Outlier Model approach is the most consistent with this other evidence, and this distribution is similar to the latest dataset date (SUERC-62546) at  95% probability but much more precise at 68% probability (Table 25; Fig. 14).

Case Study 6 (CS6)-Lismore Cathedral nave
The earliest surviving contemporary reference to a church building which can be reasonably related to the site of Lismore cathedral dates to 1314 AD, although other historical evidence suggests the diocese was formally erected between 1192 and 1214 AD [72][73][74]. An upstanding medieval church chancel on the site has been ascribed to a range of 13 th to fourteenth century dates and highlighted to illustrate the challenges faced by architectural historians in ascribing more precise dates to western Scottish masonry buildings of this period [75]. An assemblage of MERLF samples removed during excavation of the more fragmentary nave and western tower included fragments of Alnus sp., Betula sp., Corylus sp. and Quercus sp. consistent with local vegetational histories, and 3 samples of Corylus and Quercus from the earlier nave returned a range of radiocarbon determinations calibrating to between 1030-1219 cal AD (95% confidence; SUERC-75732) and 1290-1400 (95% confidence; SUERC-75727) (Table 26; Fig. 15). The Corylus MERLF sample (SUERC-75727) within this small assemblage retained some probable terminal ring evidence. This 3 date dataset is not statistically consistent at 5% significance (T′ = 48.9, T′(5%) = 6.0, ν = 2) but has generated a Combine distribution with poor agreement of 1250-1275 cal AD (95% probability; Lismore Cathedral Nave; Additional file 3: Sect. 3.32), and an age range of 125 to 345 years (95% probability; Lismore Range; 4.31) ( Table 26) Table 1).
The late 13 th -century Combine date is consistent with available historical evidence. All Last and End Boundary distributions are later than the historical TPQ (100% probability), but the extent to which these distributions pre-date the documentary TAQ varies between 24% (Lismore Nave Lowest IA MERLF 1; Additional file 3: Sect. 3.33) and 0% (Lismore Cathedral Nave Complete 4; Additional file 3: Sect. 3.36). With an estimate-TPQ/TAQ probability sum of 124%, the Last distribution generated using the exponential prior/no outlier approach is the most consistent with this other evidence, and is very similar to the latest dataset date (SUERC-75727) ( Table 27; Fig. 16).

The theoretical studies
Variation in the datasets generated from the same model parameters during these theoretical studies highlights that radiocarbon date simulation is a random probabilistic process, and multiple datasets are therefore required to examine how this variability affects the estimates generated using different modelling approaches. Sixty-five exponentially distributed datasets of between five and twenty simulated dates were randomly generated from a true event of 1250 AD for the two main theoretical studies considered in this paper-TS1 and TS2.  None of these TS1 or TS2 datasets contain dates which are later than the true event at 95% probability, and the number of dates in a single dataset which contain the true event at 95% probability varies from twenty to zero (Tables 2 and 11). Almost all datasets contain at least one date which includes the true event, and this includes all 15 date and 20 date datasets and all datasets with a specified IAτ of 100 years or less. Increasing dataset IAτ has increased the age of the latest date in both studies and thereby resulted in datasets with a lower fraction of dates which include the true event (Table 28). There is no convincing relationship between fraction of accurate dates and dataset size in these studies, although a drop off is apparent between 10 and 5 date datasets (Table 28) and some small very high IAτ datasets are completely dominated by inaccurately early dates (Table 11).
Dataset age ranges in these studies are proportional to IAτ and size (Table 29). The only 20 date dataset to   (Table 3). 251 (98%) of the 255 models in TS1 and TS2 have generated accurate End Boundary HPD intervals at 95% probability, and 215 (84%) of these are also accurate at 68% probability (Table 30). Accurate End Boundary estimates have been generated with almost identical consistency in TS1 and TS2, and by the three main models included in both studies (Table 31). End Boundary and Last distributions generated from datasets with very low IA lifespans (10 years IAτ) are more consistently accurate in both studies, but no general relationship between accuracy and dataset IA or dataset size was noted elsewhere (Tables 13, 31 and 32). The most consistently accurate End Boundary estimates in both theoretical studies were generated by models specified with both exponential priors and Charcoal Outlier Models (Table 32). This was the only model specification to generate accurate End Boundary HPD intervals at 95% probability from all datasets in both studies, and in both studies 87-90% of these estimates were also accurate at 68% probability. No change in End Boundary accuracy resulted from modifying the Charcoal Outlier Model time-constant (to match that of the specified dataset lifespan) in 20 date models with exponential priors, but an increase in accuracy is evident in the less precise uniform prior approaches ( Table 5).
All Last distributions are slightly earlier than the End Boundaries generated from the same datasets in both theoretical studies. These contrasts are more marked in 95% probability distributions and increase with increasing dataset IAτ, decreasing dataset size, and uniform prior and Charcoal Outlier Model specifications. The Last distributions generated in the 5 date to 15 date datasets of TS2 are slightly less consistently accurate than the corresponding End Boundaries (Table 30) and, in further contrast, there is some minor evidence that Last accuracy is proportional to dataset size. Overall, the most consistently accurate Last distributions in TS2 were generated by the uniform prior/default Charcoal Outlier Model approach; and this was the only model specification to generate accurate Last distributions at 95% probability from all TS2 datasets, with 87% of these also accurate at 68% probability (Table 13). That these accuracy percentages are identical to those reported for exponential prior/Charcoal Outlier Model End Boundary distributions is salient and will be returned to below.
Relative precision in End Boundary and Last distributions across both theoretical studies consistently decreases with increasing dataset IAτ, decreasing dataset size, and the imposition of a uniform prior distribution or Charcoal Outlier Model, and each of these factors has a cumulative effect. This is illustrated by the average precision of Last and End Boundary distributions generated using the exponential Prior/modified Charcoal Outlier Model approach in TS1 and TS2 where increasing dataset IAτ above 100 years or reducing dataset size below 10 dates has considerable impact, even though these datasets     were generated independently (Tables 33 and 34).
Last distributions are generally more precise than End Boundary estimates at both 68% and 95% probability, and thereby variations in dataset IAτ, dataset size, and model specification have a reduced impact. Conversely, given that reducing the Charcoal Outlier Model time-constant appears to have increased precision in broader 20 date TS1 End Boundary distributions subject to uniform priors and high dataset IAτ, it is reasonable to expect that time-constant variation would also have a greater effect where dataset size is very reduced and generated estimates broad (see below). The older latest dates associated with smaller and higher IAτ datasets in TS1 and TS2 (Tables 2 and 11) generally result in End Boundaries with older lower limits, and the implications of this relationship for Fig. 16 Probability distributions from a Lismore Cathedral nave 'standalone' model, generated using the exponential prior/no outlier approach. All three radiocarbon determinations have been situated within a single exponentially distributed phase  and thereby the End Boundary and Last distributions generated by all models at 68% probability and the Last distributions generated by both exponential prior approaches at 95% probability are also too early (Table 12). No latest dataset dates are inaccurately late   at 95% probability in these theoretical studies, but in TS1 dataset M1e run 3 the latest date (M5e) is too late at 68% confidence (1260-1300 cal AD), and the End Boundary distributions generated by all models from this dataset are also too late at 68% probability (Table 4). Indeed, the End Boundaries generated by both uniform prior/Charcoal Outlier Models from this dataset are also too late at 95% probability, and these are the only two inaccurate End Boundary HPD intervals at 95% probability in TS1. The data associated with these examples also illustrate how model specification can mitigate against latest date variation, since the earlier distributions generated by models with exponential priors are generally more accurate where the latest dataset date is relatively late, whilst the older estimates generated by models incorporating the Charcoal Outlier Model are generally more accurate where the latest dataset date is relatively early. In TS2 dataset MRRR1c run 3, for example, the latest simulated date contains the true event at 95% confidence (1220-1380; MRRR1c) but is too late at 68% confidence (1260-1300 cal AD; MRRR1c; Additional file 2: Sect. 2.9.10); and in this instance both Charcoal Outlier modelling approaches have generated inaccurately late End Boundaries at 68% probability, and only the exponential/no outlier approach has generated an accurate End Boundary at 68% probability. In contrast, the only two inaccurate End Boundaries at 95% probability in TS2 are too early (MR1c run 3 and MRR1d run 1) and, predictably therefore, these distributions are associated with comparatively early dataset latest dates and exponential prior/no outlier modelling approaches.
These processes also have implications for distribution selection; since the earlier Last distributions are often more accurate than End Boundaries where a comparatively late latest date pertains (e. g. TS2, MR1b run 2, exponential prior/no outlier) while End Boundaries are more accurate where an early latest date pertains. Indeed, given the preponderance of relatively early dates, this might explain the comparatively greater consistency of End Boundary distributions overall. The estimates generated from the TS2 dataset MRR1d run 1 illustrate how these processes affect both distribution and model selection, since the latest date (MRR3d) contains the true date at 95% confidence (1040-1270 cal AD) but is too early at 68% confidence (1050-1230 cal AD) (Table 11), and thereby: the End Boundary generated by the exponential prior/no outlier modelling approach is too early at 95% probability; the End Boundaries generated by both exponential prior models are too early at 68% probability; the Last distributions generated by both exponential prior models are also too early at 95% probability; and the Last distributions generated by all three modelling approaches are too early at 68% probability (Table 12). The uniform prior/Charcoal Outlier Model approach, however, has generated accurate End Boundary estimates at both 95% and 68% probability from this high IAτ dataset, as well as an accurate Last distribution at 95% probability.
Ultimately, the Last and End Boundary distributions generated by different model specifications form a continuous chronological spectrum: from the very early and precise Last distributions generated from large low IAτ datasets by the exponential prior/ no outlier modelling approach; to the later and more imprecise End Boundary distributions generated from small high IAτ datasets by the uniform prior/Charcoal Outlier Model approach. The evidence presented in TS2 also suggests this spectrum correlates with the accuracy of estimates generated from datasets subject to different IAτ/size characteristics: wherein the exponential prior/no outlier approach has generated the most consistently accurate Last distributions from datasets with an IAτ which is lower than 100 years (where latest dates are likely to be relatively late), and  the uniform prior/Charcoal Outlier approach has generated the most consistently accurate Last distributions from datasets of 100 years IAτ and above (where latest dates are likely to be relatively early) (Table 13). Unsurprisingly, given their precision, the proximity of the Last distribution median values to the true event date follows this same pattern (Table 15), while the accuracy threshold between these contrasting approaches is slightly broader in the End Boundary distribution evidence. Both exponential prior approaches present the most consistently accurate End Boundary distributions from datasets with an IAτ lower than 100 years in TS1 and TS2, and approaches which include a Charcoal Outlier Model are more consistently accurate from 200 years IAτ and above (although this breaks down at 500 years IAτ in TS1) ( Table 31). That a considerable overlap between these theoretical Last and End Boundary spectra also pertains is clearly illustrated in the TS2 results, wherein the most consistently accurate End Boundaries have been generated by the exponential prior/Charcoal Outlier Model approach, whilst the most consistently accurate Last distributions have been generated by the uniform prior/Charcoal Outlier Model approach. Indeed, both approaches have generated accurate estimates from all datasets at 95% probability and 87% of all models at 68% probability, while closely comparable average precision and median values between these different distributions confirms they overlap considerably (Tables 13, 15 and 17). These results are consistent with those presented by previous authors. If the 500yrs IAτ datasets are disregarded, then the uniform prior/Charcoal Outlier Model approach does indeed generate more consistently accurate End Boundary distributions than the exponential prior/no outlier approach [17], with this latter approach once again returning some precise but inaccurate End estimates at 95% probability (Table 16). These inaccuracies are limited to datasets above 50 years IAτ, however, and the End Boundary estimates generated by the exponential prior/Charcoal Outlier Model approach are more consistently accurate overall. In the above studies this is even evident where dataset IAτ range is much reduced, and most particularly so with lower IAτ datasets where the less accurate and less precise uniform prior approach has been specified. These data, therefore, also support previous MERLF analysis protocols which had promoted a binary short-lived (exponential prior) and long-lived (Charcoal Outlier Model) approach [7]; but allows an increased role for exponential prior model specifications and further understanding of how these relate to Combine and Charcoal Outlier Model approaches. The accuracy of the Combine distributions generated from 10 years IAτ datasets is also resonant of Waterbolk's (1971) Group A samples which, he suggested, would extend up to 20 years [10]. Ultimately, these theoretical results provide a less binary Bayesian framework which can inform our interpretations of the datasets returned by MERLF materials from the six Scottish medieval case study buildings.

The case studies
The compositions presented by these case study MERLF assemblages confirm that a range of different locally available tree taxa were exploited for limekiln fuel in Scotland during this period, and the maximum age ranges presented by the resulting radiocarbon datasets are generally consistent with the taxa-specific and habitat-contingent IAτ of those woodland  It is salient that the datasets associated with both of these sites are high IA and statistically inconsistent at 5% probability, however, and therefore the End Boundary distributions generated from these data are relatively broad. Recognising how we might retain accuracy while maximising precision through sample selection and model specification requires further comparison with the theoretical and ecological data. Variation in dataset age ranges and levels of statistical consistency across these studies suggests that the assemblages did include materials subject to some IA, and only two of these case study datasets have presented good Combine agreement indices (even including the reduced Lochindorb Castle CS4* dataset) (Table 35). That Combine distributions could be generated at all suggests the IA associated with five of these assemblages is reasonably limited, however, and comparison with the theoretical data from TS1 and TS2 suggests four of the six case studies (or five if the reduced Lochindorb Castle is included) present dataset age ranges consistent with mean lifespans associated with less than 50 years IAτ. Importantly, this situates these four-five studies below the considerable reduction in estimate precision associated with theoretical datasets subject to IAτs of over 100 years in TS1 and TS2 (Tables 32  and 34), and all (100%) End Boundary and Last distributions generated from such narrowly distributed 5 date datasets were consistently accurate in TS2 at both 95% and 68% probability ( Table 13). The estimate/TPQ-TAQ sum percentages also illustrate that interdisciplinary consistency generally increases with precision, and the most consistent estimates across all six case studies were Last distributions generated by exponential prior modelling specifications. Indeed, in most cases these Last distributions are very similar to each available latest dataset date, and constructional dates close to the lower limits of these distributions are often most convincing.
The association of higher IAτ and small datasets with earlier latest dates and decreased fractions of accurate dates in TS1 and TS2 (Table 28) suggests that it would be prudent to include a Charcoal Outlier Model within model specifications where radiocarbon datasets are less narrowly distributed, and yet the botany suggests different approaches to the case study evidence are probably required. Both Quercus sp. dominated case study assemblages (Lochindorb Castle enclosure and Lismore Cathedral nave) are statistically inconsistent at 5% significance, but the Lochindorb dataset contains an extremely early determination, high age range, and low fraction of consistent dates. This assemblage is completely dominated by Quercus sp. samples, and although manual exclusion of the early determination has decreased dataset IAτ considerably, this still contains only two dates which are consistent with historical evidence at 95% probability and the latest date does not extend beyond the documentary TAQ at either 68% or 95% probability (Appendix 1: Table 37). Notably, this reduced dataset is also the only example to generate End Boundary estimates which are more consistent with the historical evidence than the earlier and more precise Last distributions, and the End Boundary generated by the exponential prior/modified Charcoal Outlier Model approach to the dataset is the most consistent overall (Appendix 1: Table 38). Archaeobotanical and statistical evidence suggests this approach is less relevant to the Lismore Cathedral nave study since, although the high age range associated with this small dataset is largely predicated on residuality in two early Quercus sp. fragments, the latest date is associated with a fragment of Corylus. Since the uniform prior/Charcoal Outlier Model approach to this dataset has generated the only inconsistent estimate in the data from all six studies (Table 27), it is entirely possible that the calibrated radiocarbon date associated with this latter fragment is relatively late. This latest determination allows the 200 years or less IAτ of the wider Quercusdominated radiocarbon dataset and its relationship to the constructional event to be interpreted with greater confidence. Notably, this latest radiocarbon determination also calibrates to a date almost identical to the Last distribution generated from the wider dataset using the (most precise) exponential prior/no Outlier approach, and this currently represents the most convincing estimate for the construction of this fabric. The correlation between dataset IAτ and latest date does not hold for the radiocarbon evidence returned by this mixed-taxa assemblage, and the application of a Charcoal Outlier Model approach to this very small dataset is inappropriate.

Conclusion
This paper has presented further evidence that a range of different tree taxa were exploited for limekiln fuel in Scotland during the medieval period, and the range of MERLF taxa surviving from this process are generally consistent with regional phytogeographic distributions. The samples selected from these assemblages for radiocarbon analysis have returned datasets characterised by an array of different age range distributions and most of these are associated with some level of IA. At this stage in the research cycle, however, the IAτ of these materials does appear to be constrained by the taxa-specific and habitat-contingent lifespans and post-mortem durabilities of the parent wood fuels. Given that shorter lifespan taxa were generally selected for radiocarbon analysis where possible, it seems probable that the carbon distributions in these assemblages are generally equivalent to the available woodland source. MERLF assemblages can therefore be considered an excellent source of palaeoenvironmental information, with a research potential again underscored by the stratigraphically secure mortar material within which these materials have been entrapped.
The MERLF assemblages considered in this paper are dominated by wood-charcoal fragments without surviving terminal ring evidence, and the range of radiocarbon determinations returned by selected samples suggests that calibrated dates from single determinations, unweighted mean averages of multiple determinations and/or bulk samples, cannot be accepted as direct evidence for the construction of masonry buildings without other forms of evidence. The TPQ role performed by such determinations can be of considerable value for multidisciplinary interpretation, particularly where the radiocarbon evidence is relatively late and documentary evidence is convincing and early. Increasing the potential for these buildings and materials to inform interdisciplinary (rather than multidisciplinary) discourse, however, requires accurate standalone constructional estimates of greater precision.
In the absence of non-residual intrusive materials, the generation of accurate Last and End Boundary distributions from MERLF radiocarbon datasets subject to significant IA relies more on the accuracy of the latest available determination, than on dataset IAτ, dataset size, or model specification. Last and End Boundary precision, however, is very closely related to all three of these parameters; decreasing with increasing dataset IAτ, decreasing dataset size, and model specifications which include uniform priors or Charcoal Outlier Models. These factors are interrelated and can be cumulative, and the Last and End Boundary distributions generated by different model specifications thereby present continuous and overlapping spectra. These range from the early and precise Last distributions generated from large low-IAτ datasets by models specified with an exponential prior/ no outlier approach; to the later and less precise End Boundary distributions generated from small high-IAτ datasets by models specified with uniform priors and the default Charcoal Outlier Model.
Different Bayesian model specifications can therefore be imposed on MERLF radiocarbon data to maximise precision whilst retaining accuracy. The data relating to the 13th-century events presented in this paper suggests that: (i) Where the IAτ of a dataset is limited in 10 years or less, then determinations are likely to be statistical consistent at 5% significance and a Combine average distribution is likely to represent an accurate and very precise constructional estimate; (ii) Where the IAτ of a dataset is limited to 50 years or less, then determinations are unlikely to be statistical consistent at 5% significance and Combine agreement indices will be poor, but the Last distribution generated by a model specified with exponential priors is likely to represent an accurate and reasonably precise constructional estimate; and (iii) Where the IAτ of a dataset is greater than 100 years, then a Last distribution generated by a model with a Charcoal Outlier Model is likely to generate an accurate but imprecise constructional estimate, while modification of the outlier time-constant is likely to increase precision where dataset size is limited.
The studies considered in this paper provide further evidence that Bayesian techniques can generate consistently accurate constructional estimates for medieval masonry buildings from MERLF radiocarbon data, whatever the ecological provenance of the limekiln fuel source. Estimate precision is contingent upon source ecology but can be increased by a more informed approach to materials analysis and interpretation. The radiocarbon evidence considered here and elsewhere [7,76] is biased by the selection of single entity MERLF fragments from shorter lifespan tree taxa, where possible, and most of these have returned determinations consistent with (i) and (ii) above. It seems likely this has enabled the generation of more precise constructional estimates, although in many cases precision might be further increased by expanding these radiocarbon datasets to include higher precision (reduced error margin) analysis of short lifespan MERLF fragments.   74.5 0 The data reconsidered in this paper was generated during PhD research undertaken at the University of Edinburgh and post-doctoral research at the University of Stirling. Within these projects, charcoal identification was undertaken with Dr Mike Cressey (then at CFA Archaeology) and radiocarbon analysis was funded by Historic Environment Scotland. The author is grateful for the comments provided by two anonymous peers and the journal editor, which improved this paper significantly, and for their patience when the original manuscript was withdrawn in 2020 following the emergence of the new calibration curve.

Authors' contributions
This paper is the work of MT, who is sole author.

Funding
Not applicable.

Availability of data and materials
All the data relating to this paper are included in the main text and additional file.