Indigo and Prussian blue are the two constituents of the blue ink used for the key-block. One particularity of this mixture is that the two colorants are not equally stable: indigo is known to be less lightfast than Prussian blue [22,23,24]. It is therefore important to make sure that the prints analyzed do not differ significantly in light-induced degradation, which would alter the indigo. Consequently, all prints presenting apparent signs of discoloration (such as fading and yellowing) or degradation were removed from the statistical model. Orpiment, a yellow pigment commonly used in Japanese woodblock prints [8,9,10], is also well known for its light sensitivity, which leads to the formation of arsenic oxide [25, 26]. A previous study showed that, prior to forming arsenic oxide, the orpiment crystals undergo structural modification leading to subtle but characteristic changes in the Raman signal [27]. This characteristic was also used to assess light-induced damage to the prints. If significant light-induced damage was identified during the Raman study, the affected prints were removed from the study. After careful observation, 141 prints from the five different collections were included in the statistical model.
In addition to the blue outlines, the paper support and all other colorants, including red, yellow, green, blue, and orange, were analyzed. This was important because, just as no major differences in the ink composition of the outline are expected within a single batch (while notable differences are expected between batches), no pigment variations are expected among similar prints produced as part of the same batch. It is, however, difficult to examine all colors for each of the 141 prints separately. Therefore, PCA was applied to the FORS data set. This extensive data set includes all FORS spectra collected on the various colors in the full set of prints. The PCA plot for this full data set is given in Fig. 2. PCA is a data reduction technique that allows for easier visualization and understanding of a complex data set. This technique has been successfully used in the field of cultural heritage [28,29,30,31]. PCA plots are divided into four quadrants, which express how the variables are correlated. Variables that are inversely correlated are positioned on opposite sides of the plot origin. Furthermore, the distance to the origin also conveys information: the further away from the plot origin a variable lies, the stronger the impact that variable has on the model. The principal components (PCs) depend on the data set and correspond to the new variables that explain most of the variation in the data set [32]. Figure 2, obtained on the entire data set, clearly shows trends based on the colors. Yellow pigments, orange pigments, and the paper each create overlapping but clearly identifiable groupings in the lower right quadrant, indicating that little variation is observed throughout the entire print run. Loadings plots associated with PC-1 and PC-2 (Additional file 1: Fig. S37) show that these groupings share similarities in the 450–650 nm spectral range.
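The data reduction described above can be illustrated with a minimal sketch. It assumes that each spectrum is a row of reflectance values sampled on a common wavelength grid and that PCA is computed by mean-centering followed by a singular value decomposition; the function name `pca_scores`, the synthetic sigmoid spectra, and the noise level are all hypothetical and do not reproduce the software or preprocessing used in this study.

```python
import numpy as np

def pca_scores(spectra, n_components=2):
    """Return PC scores, loadings and explained-variance ratios.

    spectra: 2-D array, one row per measurement, one column per wavelength.
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                      # mean-center each wavelength
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # fraction of variance per PC
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]                 # one loading vector per PC
    return scores, loadings, explained[:n_components]

# Synthetic stand-ins for FORS spectra (assumption, not measured data):
# two families of sigmoid-shaped reflectance curves with different
# inflection points, plus a little noise.
wavelengths = np.linspace(400, 900, 51)
rng = np.random.default_rng(0)

def sigmoid_spectrum(inflection_nm):
    curve = 1.0 / (1.0 + np.exp(-(wavelengths - inflection_nm) / 15.0))
    return curve + rng.normal(0.0, 0.01, wavelengths.size)

spectra = np.vstack(
    [sigmoid_spectrum(500) for _ in range(5)]    # "yellow-like" curves
    + [sigmoid_spectrum(600) for _ in range(5)]  # "red-like" curves
)

scores, loadings, explained = pca_scores(spectra)
print("Explained variance (PC-1, PC-2):", explained)
print("PC-1 scores:", np.round(scores[:, 0], 2))
```

In this toy case the two families fall on opposite sides of the origin along PC-1, and the PC-1 loading vector peaks between the two inflection points, which is the kind of information read from the score and loadings plots.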
Red pigments, while clustered mostly along the yellow/orange/paper quadrant, also present a smaller sub-grouping in the lower left quadrant. This shows that several types of pigments were most likely used. They were identified as iron oxide red, vermilion, and minium using Raman spectroscopy and FORS. Minium and vermilion, due to their sigmoid-shaped spectra with inflection points at 565 and 600 nm, respectively, are most likely grouped together along with the yellows, oranges, and the paper, which are also characterized by similarly sigmoid-shaped reflectance curves. Greens, blues, and the outlines are the colors presenting the most extensive range, as suggested by their scattering throughout three quadrants of the PCA plot (upper right, upper left, and lower left). Loadings plots (Additional file 1: Fig. S37) suggest a significant influence of the 700–740 nm range, most likely related to the indigo. Nonetheless, some of the blue found along the outlines could actually correspond to dark blue hues, in which both Prussian blue and indigo are found in admixture to create darker shades of blue, as previously demonstrated in a recent publication [9, 10]. Blues and greens appear mostly scattered in the upper right quadrant, without clear sub-groupings. The outlines, even though scattered through the lower left quadrant, do seem to arrange themselves in smaller groups. To investigate the sub-grouping of the outline data, PCA was further applied to the outline data set only, discarding all data from the other colored areas.
The PCA plot for the outlines-only data set is given in Fig. 3.
According to Fig. 3, three primary clusters can be identified: one composed of six impressions in the upper right quadrant, one with three impressions in the lower right quadrant, and another one with most of the prints spreading over the four quadrants. The associated loadings plots are given in Additional file 1: Fig. S38 and show the strong influence of the 720–900 nm range for the negative values of PC-1. This range includes the inflection point associated with the indigo and the slope associated with both indigo and Prussian blue. It is important to note that PC-1 represents 91% of the overall variance of the data set. This is easily explained by the relative similarity of the entire data set. Indeed, while the data set used for the PCA in Fig. 2 included all pigments found in the prints, the data set used for Fig. 3 includes only the data for the Prussian blue/indigo mixture. Therefore, most data here present very similar features (maximum absorption at ca. 660 nm, inflection point at ca. 720 nm, and a slope in the 700–900 nm range). The main difference within the data set is the steepness of the slope in the 700–900 nm range, due to the variation in indigo/Prussian blue mixtures. For this reason, PCA may not be the method of choice to cluster the data obtained on the outlines. HCA using the bottom-up approach was then considered as a more suitable alternative. This approach starts with each print in its own cluster and successively pairs clusters based on their similarities. As a result, prints presenting the highest similarities in maximum absorbance, inflection point, and slope are clustered together, while larger dissimilarities, such as the variations in slope, prevent different clusters from being merged. The resulting dendrogram is given in Fig. 4. This dendrogram clearly shows that the corpus of 141 prints divides into 9 different clusters. Two of these clusters (clusters 6 and 7) are the same as the two smaller clusters observed in the PCA (Fig. 3), while the main PCA cluster is divided into 7 sub-clusters. The cluster to which each print belongs is given in Additional file 1: Table S1.
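The bottom-up clustering described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not specify the distance metric or the linkage criterion, so Euclidean distance and average linkage are assumed here; the function name `agglomerative` and the two-value feature vectors (standing in for quantities such as inflection point and slope) are hypothetical and are not data from the prints.

```python
from itertools import combinations
from math import dist

def agglomerative(points, n_clusters):
    """Bottom-up clustering with average linkage (linkage is an assumption).

    points: list of equal-length numeric tuples, one per print.
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]  # each print starts alone

    def average_distance(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the two most similar clusters and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return [sorted(c) for c in clusters]

# Hypothetical, already-normalized feature vectors for six prints.
features = [
    (0.10, 0.20), (0.12, 0.22), (0.11, 0.19),  # three similar outlines
    (0.50, 0.60), (0.52, 0.61),                # a second, distinct mixture
    (0.90, 0.10),                              # an isolated print
]

groups = agglomerative(features, n_clusters=3)
print(groups)  # [[0, 1, 2], [3, 4], [5]]
```

Stopping at a fixed number of clusters is a simplification; with a dendrogram, the equivalent choice is the height at which the tree is cut.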
Based on our hypothesis that the indigo/Prussian blue mixture used for the outlines is characteristic of each printing session, these 9 clusters represent 9 different production batches. Representative reflectance spectra for each cluster are given in Fig. 5. No major variations of unmixed pigments or colorants are expected within the clusters of prints produced at the same time in the same place outside of the specific individual mixture of Prussian blue and indigo used for the outlines. To test the hypothesis, pigments and colorants found in impressions of the same print belonging to different clusters were compared. As an example, a UV-induced fluorescent yellow colorant was used in JP2957 and JP2555, both in cluster 4, while JP1330, a print from the same design found in cluster 9, features a non-fluorescing yellow for the same details (Fig. 6). This seems to indicate that the computer-aided clustering is accurate. Nonetheless, this confirmation was extended further.
Similar observations were made for the use of safflower red along with vermilion in the lightning bolt in JP2961 and JP11, both in cluster 3, while only vermilion was used in the lightning bolt for JP2567, which is found in cluster 9. Vermilion was also used alone for the red areas of the kimono in JP1330 in cluster 9, similarly to JP2567. Apart from the red and yellow organic colorants, arsenic sulfide pigments, namely natural orpiment and amorphous (or semi-amorphous) arsenic sulfide, were also used to verify the validity of the different clusters. In these analyses, no variations were observed between prints within a single cluster, which supports the validity of the clustering.
When looking at the actual reflectance spectrum of the outlines, it is interesting to note that all prints found in cluster 6 correspond to the spectral signal of the two best Met impressions of the Great Wave. These prints are characterized by impeccable line quality and minimal or no wear in the key-block printed areas and are indicative of an early production process. In fact, these two prints are widely considered to be among the earliest impressions of the Great Wave. In contrast, cluster 7 includes prints for which the outline appears to be composed only of Prussian blue. These prints are considered early twentieth century reproductions and were not created during the original production period of 1830–1832. Significant variations in the composition, including the intensity of the pink sky as well as the execution of the carved lines of the breaking wave, also indicate that these prints were created from newly carved blocks. These observations are part of the connoisseurship approach, which was applied to sort out the chronology of the various clusters created through multivariate analysis of the FORS data.
Once the clustering proved valid, the connoisseurship approach was applied to a selection of impressions in order to determine the relative order of the production process and to compare the various clusters. This approach was previously applied to the three impressions of Mishima Pass in Kai Province from the Met collection (JP2556, JP2970 and JP18), leading to the conclusion that JP2556 was created prior to JP2970 and JP18, the latter having been created later, along with the additional 10 views ordered around 1835 [9]. This allowed for the conclusion that cluster 3 was printed later than cluster 8. A similar approach was applied to other impressions in the Met collection.
As suggested by Fig. 7, close observation of the impressions of Enoshima in Sagami Province (JP22 and JP2977) shows that JP22 (cluster 4) presents less wear to the key-block than JP2977 (cluster 1), indicating that cluster 4 was produced prior to cluster 1. A similar procedure was applied to other print series from the Met collection. Consequently, based on the impressions of Honganji at Asakusa in Edo (JP1323 and JP2996), cluster 4 was created before cluster 3; cluster 9 was created before cluster 5 based on the impressions of In the Mountains of Tōtomi Province (JP25 and JP2966); cluster 5 was created before cluster 4 based on the impressions of Lake Suwa in Shinano Province (JP2564 and JP2965); cluster 9 was created before cluster 4 according to the impressions of Morning after the Snow at Koishikawa in Edo (JP1330, JP2550 and JP2957); cluster 9 was created before cluster 3 based on the impressions of Storm below Mount Fuji (JP11, JP2567 and JP2961); cluster 5 was created before cluster 3 according to the impressions of Surugadai in Edo (JP1287 and JP2995); cluster 8 was created before cluster 2 according to the impressions of Lake at Hakone in Sagami Province (JP17 and JP2980); cluster 5 was created before cluster 1, itself created before cluster 8, based on the observations of Tsukudajima in Musashi Province (JP23-8, JP26-5, JP2563-5 and JP2990-1); cluster 5 was created before cluster 4 according to the impressions of Ushibori in Hitachi Province (JP27-4, JP2565-5 and JP2964-4); and cluster 6 preceded all other clusters based on the impressions of Under the Wave off Kanagawa (JP10, JP2569 and JP2972).
These various observations allow a chronological order of production to be established for the various clusters as follows, from earliest to latest: cluster 6, cluster 9, cluster 5, cluster 4, cluster 1, cluster 8, cluster 2 and cluster 7, this last cluster containing prints identified as early twentieth-century reprints from historic or newly carved blocks. Cluster 3 is difficult to place in the chronology due to the lack of multiple comparisons. It appears to have been created after cluster 4 (and therefore also after clusters 6, 9 and 5), but it is difficult to know whether it was created before or after clusters 1, 8 and 2. However, pure Prussian blue outlines as observed in cluster 7 are a significant indication of early twentieth-century reprints. Therefore, the Prussian blue/indigo nature of the outlines in cluster 3 represents a clear indication that the cluster was created prior to cluster 7. The final order of the print production process for the various clusters, excluding cluster 3, is given in Fig. 8.
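The way the pairwise observations combine into a single sequence can be checked with a topological sort. The sketch below encodes the "earlier than" relations stated in the text for the clusters shown in Fig. 8 (cluster 3 is left out, as in that figure) and uses Python's standard `graphlib` module; the variable names are hypothetical, and this is an illustration of the reasoning rather than the procedure actually followed.

```python
from graphlib import TopologicalSorter

# (earlier, later) cluster pairs taken from the comparisons in the text.
pairs = [
    (4, 1),  # Enoshima in Sagami Province
    (9, 5),  # In the Mountains of Totomi Province
    (5, 4),  # Lake Suwa in Shinano Province; Ushibori in Hitachi Province
    (9, 4),  # Morning after the Snow at Koishikawa in Edo
    (8, 2),  # Lake at Hakone in Sagami Province
    (5, 1),  # Tsukudajima in Musashi Province
    (1, 8),  # Tsukudajima in Musashi Province
]

clusters = {1, 2, 4, 5, 6, 7, 8, 9}
# Cluster 6 (earliest Great Wave impressions) precedes all other clusters.
pairs += [(6, c) for c in clusters - {6}]
# Cluster 7 (early twentieth-century reprints) follows all other clusters.
pairs += [(c, 7) for c in clusters - {7}]

# graphlib expects a mapping from each node to the set of its predecessors.
predecessors = {c: set() for c in clusters}
for earlier, later in pairs:
    predecessors[later].add(earlier)

order = list(TopologicalSorter(predecessors).static_order())
print(order)  # [6, 9, 5, 4, 1, 8, 2, 7]
```

With these relations only one ordering is possible, and it matches the sequence given above. `TopologicalSorter` also raises a `CycleError` if two observations contradict each other, which makes it a convenient consistency check when further comparisons are added.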
While not included in the statistical model, the FORS signal obtained from blue outlines in superb impressions of the Great Wave held by several other institutions corresponds almost perfectly to that of the Metropolitan Museum's finest impressions of the same print. This would suggest that these additional impressions belong to the cluster of early works highlighted by the statistical study. While it appears easy to add impressions to the two extreme clusters (cluster 6, the early impressions of the Great Wave, and cluster 7, the late nineteenth/early twentieth century reproductions using only Prussian blue in the outlines), the statistical study could be expanded, and thereby refined, using an identical FORS instrument, as consistent wavelength spacing is key to the statistical treatment.