Data and experimental setup
The workflow to create a neural network with an appropriate training dataset and to produce labeled pigment maps of paintings is outlined in Fig. 1 and consists of four steps: 1. collect a sufficiently large spectral training dataset in which the pigments for each spectrum are labeled; 2. create a neural network to predict the pigments present in the input RIS spectra; 3. validate the accuracy of the network (predictions of pigments present) with a hold-out sample (10% of the training data); and 4. test the network's prediction of pigments present on two well-characterized paintings that were not part of the training dataset.
In order to build a reasonable pigment labeled reflectance spectral training dataset for a given artistic school, paintings from which training data are selected must meet several constraints. They must be painted using a similar suite of materials, and generally with similar painting methods with respect to ground application (or absence thereof), degree of layering, degree of pigment mixing, etc. as described above. They need not, necessarily, be painted by the same artist, so long as these general criteria are met. Having reflectance data from the work of several artists who paint using similar methods may make the training data more robust. Manuscript illuminations (the painted images found within early books) have been widely analyzed by RIS [32,33,34] and provide an ideal test case for the approach used here. We have therefore selected paintings from a single book likely executed by a small number of artists, all with access to similar pigments, and following similar painting techniques with respect to pigment mixtures and glazes (that is, operating in the same general school of artistic practice).
Additionally, the set of pigments used in manuscript illumination is relatively limited, and well-studied, making it possible to confidently identify examples of the most commonly encountered pigments, pigment mixtures, and painting techniques [35,36,37,38,39]. For example, purple pigments can be derived from natural materials such as mollusks, lichens or dye plants, or by using mixtures of blue pigments (e.g. azurite, ultramarine, indigo) with red lake pigments (such as carminic acid or brazilwood) to create purple hues. Similarly, blue pigments were often mixed with yellow pigments (lead tin yellow or yellow dyes precipitated onto substrates) to expand the range of copper-based green materials available to an illuminator. The possible combinations of materials could create variation even within a single object in the painting. To model the three-dimensional form of a blue azurite robe, for example, lead white could be mixed in larger amounts to achieve highlights on the robe, or a transparent red lake could be layered on top of the blue to define purplish shadows. Both mixing and layering can contribute to the non-linear mixing effects evident in reflectance spectra from such areas.
The reflectance training dataset created for the 1D-CNN consisted of spectra collected from four well-characterized paintings from an illuminated manuscript containing many of these commonly encountered materials and mixtures. The manuscript chosen for this work was the Laudario of Sant’Agnese (c. 1340), one of only three surviving illuminated books of this type (a laudario is a collection of hymns of praise), and which has individual illuminations (described as paintings throughout this paper for clarity) by at least two artists, which are now dispersed in several collections around the world [40,41,42]. The paintings used to build the training set (Additional file 1: Figure S1) include:
1. The Martyrdom of Saint Lawrence, Pacino di Bonaguida, about 1340, Tempera and gold leaf on parchment. Getty Museum, Los Angeles, Ms. 80b (2006.13), verso
2. The Ascension of Christ, Pacino di Bonaguida, about 1340, Tempera and gold leaf on parchment. Getty Museum, Los Angeles, Ms. 80a (2005.26), verso
3. The Nativity with the Annunciation to the Shepherds, Master of the Dominican Effigies, c. 1340, miniature on vellum, National Gallery of Art, Washington, D.C., Rosenwald Collection, 1949.5.87
4. Christ and the Virgin Enthroned with Forty Saints, Master of the Dominican Effigies, c. 1340, miniature on vellum, National Gallery of Art, Washington, D.C., Rosenwald Collection, 1959.16.2
These paintings from the Laudario have been studied in great detail to determine the pigments and paint mixtures used as well as the artists’ working methods. The illuminations in the collection of the J. Paul Getty Museum were extensively studied for the 2012-2013 exhibition Florence at the Dawn of the Renaissance: Painting and Illumination 1300–1350 using point-based analysis techniques (XRF, Raman spectroscopy and microscopic examination), broadband infrared imaging (900–1700 nm) and ultraviolet light induced visible fluorescence photography [34]. More recently these folios have been re-examined by RIS, XRF mapping (also referred to as scanning macro-XRF spectroscopy or MA-XRF), as well as point-based fiber optic reflectance spectroscopy (350–2500 nm) for this work. The point analysis data was combined with the RIS data and XRF maps to define regions of the data cubes where similar pigments are present. The results of all of these studies have been summarized in the Additional file 1: Table S2. The two works in the collection of the National Gallery of Art have also been previously studied for the Colour Manuscripts in the Making: Art and Science conference (2016, University of Cambridge) and the RIS image cubes have been classified and labeled with the pigments determined to be present either from the RIS spectra and/or from the results of site-specific XRF and fiber optic reflectance spectroscopy (350–2500 nm) [25, 33].
In constructing the training spectral dataset, regions in the RIS cubes having the same spectral shape and known pigment composition were selected both within a given painting as well as among all four paintings. The labels of the training dataset represent the pigment(s) whose spectral signature(s) dominate(s) the spectra (i.e., with the effects of the substrate and presence of ad-mixed white pigments included). Thus, an area containing mostly azurite will be described as belonging to the pigment category “azurite” (even if there is a small quantity of, for example, a white, black, or other-colored pigment), while an area containing a fairly equal mixture of azurite and lead white might be described as “azurite/white” when the amount of white present begins to noticeably alter the spectrum. As a result, the training dataset incorporates the effects of variations in paint layer thicknesses and mixtures that incorporate white pigments (lead white, chalk, etc). The only paint mixture excluded in the training dataset is that of the flesh. The omission of the flesh tones was done purposely as they represent a small area of the paintings and their composition is known to differ among the artists who painted each painting used for the training [34].
Figure 2a displays a representative image indicating the locations from which reflectance spectra were extracted from one of the paintings, The Nativity with the Annunciation to the Shepherds. Selected areas were not averaged; each spectrum was treated as an individual feature. In total, 25 classes (paints) were identified. These classes consisted of both pure pigments (where “pure” is used to describe paints where spectra are dominated by one pigment) or “mixed” pigments (where there are two pigments contributing to the spectral signature). The mean spectra of all classes can be seen in the Additional file 1: Figure S2, and represent the diversity of pigment and pigment mixtures observed in these paintings. A total of more than 300,000 individual spectra were collected across all four paintings.
Since not all pigments or mixtures are used as abundantly as others, there were several classes for which only a limited number of samples was collected (e.g. 40 green earth vs. 61,092 azurite samples per class). For the model to formulate general rules and not over-train on the larger classes, the training data were reduced to 16,683 spectra with the number of samples per class more evenly distributed. This was accomplished by iteratively removing similar spectra (based on Euclidean distance as a measure of similarity) in order to conserve the variability in the training spectra. Even though other distance measures (e.g. Mahalanobis, Hausdorff, spectral angle, etc.) could have been used, a simple Euclidean distance is a common metric for assessing spectral similarity in hyperspectral imagery and is used here. Visualization of the reduced number of spectra vs. all the spectra for a given class showed sufficient variability to justify this approach. However, for other paintings or sets of pigments these other distance measures could be considered to improve separability and will be addressed in future work. Thus, for each class with more than 1000 spectra, a spectrum was selected at random, and the 100 spectra most similar to the chosen spectrum were removed from the class. This was repeated until each large class was reduced significantly. Class sizes, model accuracy and labels can be seen in the Additional file 1: Table S1. Note the lower accuracy of “Ochre yellow”. This is probably due to the similarity in spectra between yellow and orange ochre (see average spectra of both in the top left plot in the Additional file 1: Figure S2). Figure 2b displays the reduced number of spectra of brown ochre; the dotted line shows the average of all plotted spectra. The spectral variability within this pigment can clearly be seen in the plot.
The one distinct outlier visible, with higher reflectance from 400 to 550 nm, was probably mislabeled in the originally collected training spectra. Cases such as this one, where one or more spectra in the training data may be incorrectly assigned to a given pigment category, arise from the method used to extract spectra for the training data, wherein spectra from related areas were given the same pigment category label. The mean spectrum of each pigment category is plotted in Additional file 1: Figure S2, and corresponds well to the expected reflectance curve of the pigment(s) named in the category label.
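The class-thinning procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `thin_class`, the choice to keep the randomly chosen spectrum, and the stopping rule (stop once a class holds no more than `max_size` spectra) are assumptions, since the text only states that the loop was repeated until each large class was "reduced significantly".

```python
import numpy as np

def thin_class(spectra, max_size=1000, n_remove=100, rng=None):
    """Iteratively thin one class of reflectance spectra.

    spectra: array of shape (n_spectra, n_bands).
    While more than `max_size` spectra remain (assumed stopping rule),
    pick one spectrum at random and drop its `n_remove` nearest
    neighbours by Euclidean distance, keeping the picked spectrum.
    """
    rng = np.random.default_rng(rng)
    kept = np.asarray(spectra, dtype=float)
    while len(kept) > max_size:
        i = rng.integers(len(kept))
        # Euclidean distance from the chosen spectrum to every spectrum.
        dist = np.linalg.norm(kept - kept[i], axis=1)
        dist[i] = np.inf  # never remove the chosen spectrum itself
        k = min(n_remove, len(kept) - 1)
        # Indices of the k smallest distances (order within them is irrelevant).
        drop = np.argpartition(dist, k - 1)[:k]
        kept = np.delete(kept, drop, axis=0)
    return kept
```

Because each pass removes only near-duplicates of one spectrum, dense regions of spectral space are thinned while isolated spectra tend to survive, which is the property the text relies on to preserve within-class variability. Classes already at or below `max_size` are returned unchanged.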
Performance evaluation of the 1D-CNN model
The degree of success of the 1D-CNN model was evaluated in two ways. The first was a quantitative model performance evaluation that examined the robustness of the neural network itself. The second provided insight into how well the 1D-CNN model produces accurate labeled pigment maps. This was done by comparing the resulting maps with those generated using the more traditional method (i.e., classification of the same RIS cube using ENVI-SHW followed by labeling the classes in terms of pigments, either from RIS spectral features or by fusing the class maps with other data), described in this paper as truth maps.
Quantitative model performance evaluation
The first method validated the performance of the neural network on the training set created from the four paintings, using 10-fold cross-validation to estimate model performance. The k-fold cross-validation is a method used to evaluate machine learning models in which the training data are split into k groups. The 1D-CNN is then trained on k-1 groups and tested on the hold-out group. This is repeated for all k groups and the results averaged to produce a less biased estimate of the model’s performance [43]. To calculate the result of each of the k models, mean per-class accuracy was used. This metric, used when the training data are unbalanced (classes with different amounts of training data), reports the average of the accuracies obtained for each class, thus giving equal weight to each class and preventing larger classes from dominating the results. The mean per-class accuracies of the 10 models created using cross-validation were then averaged to calculate the final model performance.
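To make the evaluation procedure concrete, the following sketch computes mean per-class accuracy and averages it over k folds. It is an illustration under stated assumptions rather than the evaluation code used in this work: the helper names (`mean_per_class_accuracy`, `kfold_indices`, `cross_validate`) are hypothetical, the folds are formed by a simple random permutation, and `fit_predict` stands in for training the 1D-CNN on one split and predicting labels for the hold-out group.

```python
import numpy as np

def mean_per_class_accuracy(y_true, y_pred):
    """Average of the per-class accuracies, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(y_true)
    per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(per_class))

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index arrays for k roughly equal random folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

def cross_validate(fit_predict, X, y, k=10, seed=0):
    """Average mean per-class accuracy over k folds.

    fit_predict(X_train, y_train, X_test) -> predicted labels for X_test.
    """
    scores = []
    for train, test in kfold_indices(len(y), k, seed):
        y_pred = fit_predict(X[train], y[train], X[test])
        scores.append(mean_per_class_accuracy(y[test], y_pred))
    return float(np.mean(scores))
```

As a small worked case, predicting the majority class for true labels `[0, 0, 0, 0, 1]` gives an overall accuracy of 0.8 but a mean per-class accuracy of 0.5, which shows why the latter is the more informative figure when class sizes differ widely.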
The overall mean per-class accuracy (averaged across the 10-fold cross validation results) for the 1D-CNN was 98.7%. Results for each pigment or mixture class can be seen in the Additional file 1: Table S1. Model performance based on this metric shows very good results for all classes.
Comparison of 1D-CNN pigment labeled maps versus truth maps
After training, the 1D-CNN model was applied first to the Pentecost (Fig. 3a), another painting from the Laudario of Sant’Agnese, the same illuminated book from which the paintings used to create the training dataset were obtained. The output of the 1D-CNN consists of 25 maps, one for each of the pigment classes in the training dataset. The intensity at each pixel in a given map is the probability of a match between the RIS spectrum at that spatial pixel and the pigment class as determined by the 1D-CNN model. Each of the labeled pigment maps was thresholded at a probability of 0.99 or greater to construct the composite pigment-labeled map in Fig. 3d. This reduced the number of pigment-labeled classes from the possible 25 to 13. A high threshold of 0.99 was chosen to reduce the number of false-positive assignments. In the final composite pigment-labeled map, the classes are color coded and labeled as shown in Fig. 3b. The black background represents spatial pixels where none of the 25 labeled pigment classes had a probability at or above 0.99. Inspection of the composite map and color image reveals that not all of the pixels were assigned to a pigment class. Decreasing the threshold from 0.99 to 0.85, as shown in the Additional file 1: Figure S3, did assign unclassified areas to the correct pigments, but at the expense of increased false-positive identifications (e.g. parchment classified as lead tin yellow). As noted, the areas of flesh were not included in the training dataset, thus no labels were assigned to the flesh. Nevertheless, the majority of the painted areas have been assigned to a labeled pigment class.
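The thresholding step that turns the per-class probability maps into a single composite map can be sketched as follows. This is a hedged illustration: the function name `composite_label_map`, the array layout (rows, columns, classes) and the use of -1 for unlabeled pixels are assumptions. It also assumes the class probabilities at each pixel sum to one (as with a softmax output), in which case at most one class can reach 0.99 and taking the most probable class before thresholding is equivalent to thresholding each map separately.

```python
import numpy as np

def composite_label_map(probs, threshold=0.99, unlabeled=-1):
    """Build a composite class map from per-class probability maps.

    probs: array of shape (rows, cols, n_classes) with the class
    probabilities for each spatial pixel.
    Returns an integer map of shape (rows, cols) holding the index of the
    most probable class where its probability is >= threshold, and
    `unlabeled` elsewhere (the black background in the composite map).
    """
    probs = np.asarray(probs, dtype=float)
    best_class = probs.argmax(axis=-1)
    best_prob = probs.max(axis=-1)
    return np.where(best_prob >= threshold, best_class, unlabeled)
```

Calling `np.unique` on the result (ignoring the unlabeled value) gives the set of classes that survive the threshold, and lowering `threshold` to 0.85 reproduces the trade-off described above: more pixels labeled, at the cost of more false positives.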
The composite color-coded pigment-labeled map of the Pentecost obtained using the traditional methods, the truth map, is shown in Fig. 3c, and the labeled pigments found in these classes are given in the first column of Fig. 3b. A detailed table summarizing the information used to identify the pigments in the spectral classes found using ENVI-SHW is given in the Additional file 1: Table S2. The colors of the labeled classes were chosen to roughly represent the color of the actual paint. The 1D-CNN model’s color composite map, displayed in Fig. 3d, uses the same color as the truth map wherever the pigments are the same, as can also be seen in the second column of Fig. 3b. Comparing Fig. 3c, d (or the two columns of Fig. 3b) shows that the 1D-CNN model correctly labeled the pigments in most of the paints. For example, the paints dominated by a single pigment (azurite, lead tin yellow, gold, ochres, red lead, vermilion, green earth and red lake) were all correctly labeled.
For mixed pigments the 1D-CNN model provided both correct and some incorrect assignments. The 1D-CNN model correctly labeled pixels when the degree of saturation of a color varied over a fairly large range, for example the high and medium saturated blue robes. In both colors, the same primary pigment, azurite, was used but mixed with varying amounts of lead white. Of the two areas where ultramarine and azurite were used together, the lighter portion of the dome directly above Mary and the lighter blue robe of the apostle in the bottom right, the 1D-CNN model correctly labeled only the lighter portion above Mary, and not the very pale (unsaturated) robe. Interestingly, within the light blue robe of the apostle at the bottom right of Fig. 3d, the model assigned a small feature represented by only a handful of spatial pixels to the “Red lake” pigment category (shown in pink in Fig. 3d), which at first glance appears as though it might represent a misclassification. However, after further visual investigation, this allocation was confirmed: in the areas classified as “Red lake,” reflectance spectra do indeed indicate that an organic red colorant may be present as a layer over the blue and lead white mixture to render the shadow folds in the robe.
The green paints of the robes proved the most challenging for the 1D-CNN model. The truth map, as well as magnified examination of the painting, shows a yellow-green base layer onto which a deeper green paint was layered, which helps define the three-dimensional shape of the green-robed figure at bottom center. The yellow-green base paint was found to be a mixture of lead tin yellow (type II), ultramarine, and likely a copper-containing green pigment (see Additional file 1: Table S2), and the deeper green a mixture of lead tin yellow with an unknown copper green. Neither of these mixtures is present in the training dataset; however, visual inspection of the mean spectra of the yellow-green paints in the dataset indicates that the best spectral match would be with lead tin yellow mixed with azurite, due to the weak reflectance maximum at \(\sim 730\,\mathrm{nm}\).
There are two other small details where the 1D-CNN provided pigment labels which prompted further investigation. These are illustrated in Fig. 4. The first concerns the left vertical portion of the red border. The top, right, and bottom part of the red outer border show a sharp inflection point at 564 nm, indicative of red lead. The RIS spectrum of the left vertical border (as pointed out by the green bifurcated arrow in Fig. 4a) shows a sharp inflection at 558 nm consistent with red lead, although blue shifted, but it also shows a weak reflectance peak at approximately 740 nm and rising reflectance starting at 850 nm.
These results suggest the presence of a second pigment along the left portion of the red outer border, although assignment by RIS alone is not possible. The 1D-CNN model recognized a difference between the left edge and the other sides of the red outer border, although it labeled the left edge as ochre rather than as red lead with azurite. Inspection of the copper (Cu) elemental distribution map obtained from XRF mapping shows that copper is associated with the blue azurite inner border, as shown in Fig. 4c. On the border’s left edge, copper is present in a wider line than what is currently visible in the color image, which indicates that azurite is present below the left portion of the red outer border. Visual inspection of the color image shows that some blue paint is just visible at the top edge of the border (green arrow, Fig. 4b). Thus, while not correctly assigning the pigments (since this combination of red lead and azurite was not in the training dataset), the 1D-CNN model did assign the most logical pigment based on the RIS features, and correctly noted the distinction between this area and the remainder of the red lead border.
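The inflection-point positions quoted for the red border (564 nm versus 558 nm) are the kind of feature that can be read from a reflectance spectrum as the wavelength of maximum first derivative on the rising edge. The sketch below shows one common way to estimate it; it is an assumption about how such a value could be obtained, not a description of the procedure used in this work, and the function name and the 500 to 650 nm search window are hypothetical choices.

```python
import numpy as np

def inflection_wavelength(wavelengths, reflectance, lo=500.0, hi=650.0):
    """Estimate the inflection point of a reflectance edge.

    Returns the wavelength (same units as `wavelengths`) at which the
    first derivative of reflectance is largest inside the search window
    [lo, hi]. The window limits are illustrative assumptions.
    """
    wl = np.asarray(wavelengths, dtype=float)
    refl = np.asarray(reflectance, dtype=float)
    # Numerical first derivative dR/d(lambda), valid for uneven spacing too.
    deriv = np.gradient(refl, wl)
    idx = np.flatnonzero((wl >= lo) & (wl <= hi))
    return float(wl[idx[np.argmax(deriv[idx])]])
```

On measured spectra, light smoothing before differentiation is usually advisable because the derivative amplifies noise. A shift of a few nanometers between two areas, as reported above, is only meaningful if it exceeds the spectral sampling interval of the instrument.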
The second detail of interest is the shadowed side of the white square spire (Fig. 4d–f), which appears as a light gray-blue in the color image and was labeled as “indigo” by the 1D-CNN model, shown in teal in the detail in Fig. 4d. This area appears to actually contain a small amount of a copper-containing pigment (likely azurite, since the area has a blue-gray cast), as suggested by the copper distribution obtained from XRF mapping (Fig. 4f). This shadowed area was missed in the classification step for the truth map. Spectra from this area have an overall lower reflectance (by a factor of 2) and weak absorption features that suggest a small amount of earth pigment was also added to the white. Taken together, the RIS and XRF data suggest that the area may actually be a complex mixture of lead white, ochre, and trace amounts of azurite. This three-part mixture is not in the training set, so although the shadowed side of the spire was incorrectly ascribed to the indigo class, the 1D-CNN model did distinguish this area from the rest of the white spire.
To further test the robustness of the 1D-CNN model, a second painting was analyzed: Saint Peter Enthroned, c. 1345/1350, from a Choir Book (Gradual) series painted by Lippo Vanni. Vanni, while from Siena rather than Florence, is likely to have been familiar with the painting techniques and pigments used by the Florentine artists who made the paintings for the Laudario of Sant’Agnese.
As in the case of the Pentecost, a pigment-labeled truth map was constructed by first creating classification maps based on RIS spectra (400 to 950 nm) using the ENVI-SHW algorithm and then fusing results from point analysis methods in order to turn the classification maps into labeled pigment maps (see Additional file 1: Table S3 for details). The 1D-CNN model was applied to Saint Peter Enthroned to determine the model’s generalizability to a painting not in the Laudario, but which is expected to contain similar materials. The reference color image, the truth and 1D-CNN composite maps, and the color-coded pigment labels are given in Fig. 5. The data demonstrate that the paints dominated by a single pigment were correctly identified even when lead white was present. Specifically, the areas containing azurite, lead white, vermilion, and red lake were all correctly labeled. The areas of gold leaf, and the areas of exposed bole where the gold leaf is gone, were also correctly identified as gold and ochre (the primary coloring material of the clay bole underneath the gold), respectively. The 1D-CNN model incorrectly labeled the yellow as lead tin yellow, whereas the truth pigment map indicates that a yellow lake is present; yellow lakes, however, are not present in the training dataset. The truth map shows that the dark modeling of the richly decorated red cloth over St. Peter’s throne was painted with vermilion, while the lighter parts were painted with a mixture of vermilion and red lead. The 1D-CNN model correctly labeled the vermilion. However, because the mixture was not in the training set, it was labeled as containing only red lead: these areas had sufficient red lead character to differentiate them from pure vermilion.
There are three sets of mixed pigments in Saint Peter Enthroned: two greens and an orange-red. As shown in the truth map (Fig. 5c), the lighter green paint is made from a yellow lake with azurite, and the cooler, darker green from a yellow lake, azurite, and indigo. The 1D-CNN model (Fig. 5d) returned two greens, each composed of a yellow mixed with a blue pigment, and identified the correct blue pigment in both cases. However, since no mixture of a yellow lake with these two blue pigments existed in the training data, the model gave as the best match lead tin yellow mixed with the specific blue pigment. This is not surprising, as the spectral shape is dominated by the blue pigment present. The labeled composite truth map shows that the red border contains a mixture of red lead and vermilion, just like the lighter red portion of the cloth over the throne. The 1D-CNN model correctly identified these two pigments individually in the border, identifying primarily red lead on the right side of the image and vermilion on the far left. The model did not classify them as a mixture since there was no mixed red lead and vermilion class in the training data. This result reinforces the notion that identification from the model can only be as exact as the training data. As such, these results will always need to be presented with some indication of the limits of interpretability. However, as more paintings are studied, the training set can be augmented to develop a more robust solution.