A novel methodology for writer (hand) identification: establishing Rigas Feraios wrote two important Greek documents discovered in Romania

Mamatsis, Athanasios Rafail; Mamatsi, Eirini; Chalatsis, Constantinos; Arabadjis, Dimitris; Kampouri, Pandora; Papaodysseus, Constantin

doi:10.1186/s40494-023-00873-z

Research
Open access
Published: 23 February 2023

Athanasios Rafail Mamatsis¹,
Eirini Mamatsi¹,
Constantinos Chalatsis¹,
Dimitris Arabadjis²,
Pandora Kampouri¹ &
Constantin Papaodysseus¹

Heritage Science volume 11, Article number: 38 (2023)

Abstract

The main goal of the present work is to determine the hand that wrote two documents newly discovered in Romania. To answer this question, the authors introduce the notion of the “Ideal Representative”, namely an object that closely represents the ideal alphabet symbol that a writer had in his/her mind when writing a document by hand. Moreover, the authors introduce a novel method for the optimal evaluation of the Ideal Representative of any alphabet symbol in association with any handwritten document. Furthermore, the authors introduce methods for comparing these Ideal Representatives, so that a final decision about the hand that wrote a document may be obtained with high likelihood. The related analysis indicates that the two documents discovered in Romania in 1998 were written by the hand of Rigas Feraios. The presented method of automatic handwriting identification seems to be of general applicability.

Introduction

The aim of the research presented here is to give an answer, as objective and quantitative as possible, to the important question of which of the following five documents were written by the hand of Rigas Feraios. Rigas was one of the greatest personalities who drastically influenced the history not only of Greece but of all Balkan nations in the last three centuries (see “A brief description of the life and the accomplishments of Rigas Feraios” below). In essence, the present work deals with automatic writer identification, in association with the following five (5) handwritten texts:

1.
“Compilation of problems in Physics” (“Φυσικής Απάνθισμα”), henceforth called FYSAP for brevity. We would like to strongly emphasize that FYSAP is the only document for which it is historically certain and unambiguous that it was written by the hand of Rigas Feraios.
2.
“The Greek Constitution” and “the Thourios” (“Σύνταγμα και Θούριος”) appearing in the same document; in all the subsequent work, we shall use the common abbreviated name SYNTAG for both these texts.
3.
A third document, unambiguously written by the hand of another great Greek politician, Eleftherios Venizelos. This text is kept at the National Research Foundation “Eleftherios Venizelos” [1]; we shall refer to it as ELVEN.

The other two documents, which practically triggered the present work, are described more extensively in the subsections “Unpublished Documents Probably Associated with Rigas” and “Scientific Dispute Concerning the Actual Writer of “The Saganaki of Madness” and “The Tested Friendship””, which follow immediately. In the latter subsection we shall state the goal of the present work in full.

Unpublished documents probably associated with Rigas

In 1998, exactly two hundred (200) years after the cowardly assassination of Rigas Feraios, the Romanian scientist Lia Brad Chisacof [2] presented a handwritten document which, in her opinion, had been written by Rigas. This document includes two previously unknown and unpublished literary works:

a.
The comedy “The Saganaki of Madness” (“Το σαγανάκι της τρέλας”), henceforth denoted SAGAN; we note that “saganaki” is a specialty of Greek cuisine based on fried cheese. SAGAN is a satire mocking the governor of Wallachia, Nikolaos Mavrogenes, whose secretary was Rigas. Mavrogenes was an aliterate, brutal, vulgar, gauche, uncultivated and psychologically unstable individual. The author of this play, i.e., of SAGAN, scoffs at all these defects of Mavrogenes, sometimes following the style of the great Molière; however, the author bases his humor and jokes on multilingual expressions, thus going far beyond the style of Molière.
b.
The short novel “The Tested Friendship” (“Δοκιμασμένη φιλία”), to which we will refer by the name FILIA in the following. Both these works were published in Bucharest on behalf of the “Institute of Southeast European Studies”, introduced and translated into Romanian by Lia Brad Chisacof [2].

Scientific dispute concerning the actual writer of “the saganaki of madness” and “the tested friendship”

There is a serious scientific dispute concerning the hand that wrote the aforementioned newly discovered documents, which we have called SAGAN and FILIA [2]. Thus, the automatic identification of the hand that wrote these two documents is, substantially, the main goal of the present work. Equivalently, we will test the hypothesis that documents SYNTAG, SAGAN and FILIA were written by the hand of Rigas Feraios; in doing so, we shall employ the fact that FYSAP and ELVEN were unambiguously written by the hands of Rigas Feraios and Eleftherios Venizelos, respectively. This testing will be accomplished via a novel extension and proper modification of a methodology introduced by the authors in connection with the writer identification of ancient Greek (Hellenic) inscriptions and Byzantine codices [3, 4].

A brief description of the life and the accomplishments of Rigas Feraios

Rigas Feraios-Velestinlis (Ρήγας Φεραίος-Βελεστινλής) was, according to most scholars and many others, the greatest personality, intellectual, author and revolutionary in the Balkans during the late eighteenth and early nineteenth centuries. He, actually, introduced the Enlightenment in the Balkans, especially among the Greek population. Rigas was born in the Greek village of Velestino, the ancient Feres (“Φεραί”), in 1757. His village was then occupied by the Ottoman Empire; nowadays, Velestino-Feres belongs to the modern Greek region of Thessaly.

Rigas was first educated at the schools of Zagora and Ampelakia, which were famous in that era. After his graduation, he became a teacher in the village of Kissos, where he started actively expressing his opposition to the Ottoman Empire. At the age of twenty he killed a prominent Turk, because the latter had treated him as a Greek slave. Due to this action, Rigas was forced to flee to the village of Litohoro on the celebrated Mount Olympus, where he joined the group of rebels led by Spiros Zeras. From there Rigas went to Mount Athos and, more specifically, to the monastery of Vatopedion. This monastery had, and still has, a very rich library; Rigas secured free, unlimited access to this library and, consequently, he drastically improved his knowledge and expertise in various disciplines. In 1785, Prince Alexandros Ypsilantis (the Ambassador of Russia in the Ottoman Empire) invited Rigas to Constantinople (Istanbul) for further studies. Next, in 1788 Feraios went to Wallachia, where he became the secretary of the Ruler Nicholas Mavrogenes. There, he was informed about the French Revolution and very soon conceived the idea that such a revolution could be carried out in the Balkans against the Ottoman Empire by the populous Christian community. In the meantime, the Ottoman Empire lost the war against Russia in 1792 and considered Nicholas Mavrogenes responsible for the defeat of the Turks in Wallachia; as a result, Mavrogenes was decapitated. Thus, Rigas was, once more, forced to flee, this time to Vienna, the Austrian capital, where he started his very important revolutionary actions.

In Vienna, in the printing premises of the Greek brothers Pouliou, Rigas published the following works:

i.
“The School of Delicate Lovers” (“Σχολείο των ντελικάτων εραστών”).
ii.
“Compilation of problems in Physics” (“Φυσικής Απάνθισμα”), a collection of remarks and problems on Physics.
iii.
“Ηθικός Τρίποδας”, consisting of three important documents fully supporting the Enlightenment; two of these texts were written by Marmontel and were translated by Rigas into modern Greek.
iv.
“Ανάχαρσις”, a translation into modern Greek of a famous book by Loukianos written in the second century B.C.; the book refers to proper civic education (Civics), in particular of the young, as well as the role of gymnastics in Physics.
v.
The “Thourios” (“Θούριος”), a kind of march, aiming at giving courage to the enslaved Greek and Balkan Christians, so that they would rise up against the Ottoman Empire. One of the best-known distichs of the Thourios is the following:

“it’s better to live free for one hour,

Than forty years in prison, under slavery”.

vi.
“The constitution of the Greek democracy”. In this book, Rigas makes a proposal very advanced for his era: that the Greeks should follow the steps of the French rebels and build a “Popular Democracy”, very similar to the French democracy.
vii.
“The Revolutionary Declaration” (“Η Επαναστατική Προκήρυξη”). In this declaration, Rigas tries to persuade all Greeks and the Christians of the Balkans to rise up against the Sultan’s tyranny; he also tries to arouse the Turkish population, appealing to their democratic feeling.
viii.
“The Charta”, a map of the Balkan peninsula, which extends to the south of Danube River.

Unfortunately, the content of these publications greatly annoyed the Ottoman Empire and, as a result, Turkish secret agents murdered Rigas Feraios in an abominable manner.

We should emphasize that, in connection with most of the aforementioned publications, there are handwritten documents, too. However, only for the “Compilation of problems in Physics” (“Φυσικής Απάνθισμα”) there is sufficient historical evidence that it has been undeniably written by the hand of Rigas.

State of the art in automatic handwriting identification

In recent years, given the great evolution of computers, identification of the hand that wrote a document may be achieved via the use of computational algorithms. In fact:

Connected-component contours and edge-based features of uppercase Western script are considered for offline automatic writer identification by the authors of [5]. More specifically, the authors propose that, for automatic identification of upper-case Western letters, the combination of the connected-component contour codebook and its probability-density function of shape usage, together with the edge-orientation distribution, offers really satisfactory results. In [6] a technique is proposed which divides a given handwriting into small fragments; then, each such fragment is considered to be a texture. In particular, in all these fragments-textures, three texture descriptors, namely Local Binary Patterns (LBP), Local Ternary Patterns (LTP) and Local Phase Quantization (LPQ) are considered, and each handwritten fragment is represented by the histograms of these descriptors. Comparison-classification of these descriptors is the basis of the writer identification. In [7], the authors propose quantitative methods for the classification of calligraphic style, on a statistical basis. The proposed classification and quantification methods rely on specialized local and global visual style features. The authors of [8] use texture-based schemes for writer identification. More specifically, they combine Delta encoding and oriented Basic Image Feature Columns to achieve writer identification. In [9], the Levenshtein edit distance based on the Wagner-Fischer algorithm is used to calculate the cost of transforming one handwritten word into another. He et al. in [10] apply the “SOTM” method of [11] in order to generate a codebook which contains the temporal information of the handwritten patterns. In this way, historical document dating can be cast as a standard pattern recognition problem. The authors of [12] use two different feature classes in order to identify the writer of handwritten excerpts from their binary images.
In fact, the authors use two curvature-free features for writer identification based on the run-lengths of general patterns and the joint distribution of the relation between orientation and length of a set of line segments extracted from contours of ink traces. In [13], the authors try to determine the minimum number of hands that could have written a set of 16 inscriptions from the “Judahite desert fortress of Arad”; to achieve that, they use statistical inference over the distribution of the distances between mixed feature representations of letter shapes, e.g. angles between strokes and character profiles. In [14], structural and textural features, which express the visual similarity of the handwriting, are extracted and used by experts to interactively cluster the documents with a manually defined feature subset. The authors of [15] use computer-vision algorithms and statistical inference methods in order to identify fragments that might originate from the same codex. The developed system consists of a subsystem that compares handwriting, as well as a subsystem that considers the appropriate cataloging data, as extracted from the images by a suitable method. Wolf et al. [16] look for possible joins between catalogued excerpts, using a mixture of local descriptors and learning techniques. The authors developed a benchmark, the “Genizah” one, in which they have employed various vectorization methods and similarity measures, in order to reach a final statistical inference. In [17], a junction detection method is proposed, applicable in writer identification. In fact, the authors perform skeletonization of the letters, in order to determine the “forks”; at each fork, they evaluate the probability $S(\theta )$ that direction $\theta$ indicates one of the branches of the junction. Finally, they consider the probability distribution of the junctions as a global feature for each writer, based on a learned codebook.
A new system for writer identification of medieval manuscripts is presented in [18]. The system is based on features from layout analysis, such as the way the scribe is distributed in each row of the text according to the ruling, intercolumnar distance, upper and lower margin, interlinear spacing, peak number, etc. In [19], Deep Learning (DL) algorithms and classical machine learning approaches are experimentally compared; the authors deduce that the DL schemes outperform or are equivalent to the latter ones. The author of [20] proposes a new Residual Swin Transformer Classifier for offline word-level writer identification. The work in [21] investigates the effectiveness of textural measures in characterizing the writer of a handwritten document. More specifically, the authors introduce a representation of the local binary patterns (LBP column histogram) and they combine it with the oBIFs column histogram to enrich the representation. Next, they use these local patterns to characterize the writer, via the SVM classifier. The work in [22] employs global-context residual recurrent neural networks for writer identification. The authors combine the global-context information extracted by a convolutional neural network and local fine-grained information extracted by a recurrent neural network. The authors of [23] use deep learning techniques for writer identification. Indeed, FAST key points and the Harris corner detector are employed to identify points of interest in the handwriting. Subsequently, small patches centered around these key points are fed into a deep convolutional neural network; writer identification is based on maximum likelihood considerations. The article [24] investigates the impact on writer identification of four different techniques that entail Directional Hinge feature extraction methods.

In the system described in [25], features that capture the distribution of the strokes' thickness are concatenated with deep features, computed by a succession of convolutional blocks; in this way, shallow softmax nets may approximate the classification of handwriting excerpts. The system has been trained with a 4/1 ratio of the training to test set and for a fixed number of writers. In [26], an analogous practice of concatenating deep global and “restricted” features is employed. The restricted features are computed locally in fragments of the input images, in order to capture graphemes, according to the method introduced in [27] and widely and effectively used thereafter ([28]). The system presented in [29], is also based on restricted deep features. There, the deep convolutional features are computed separately on words and connected components of the handwriting. Then, the optimized feature maps are concatenated and set as the input of a feed-forward network, which, in turn, is trained to be the documents' classifier. In an analogous manner, the system of [30] restricts the deep features to be computed and optimized on the separate lines of text. In turn, the employed feed-forward classifier is not trained separately on the basis of the optimal feature maps. The authors of [31], restrict deep features on very elementary data of the handwriting, i.e. the individual letters. The developed architecture of the feature extraction network combines different deep learning mechanisms, in order to represent the characteristics of structured data, such as the letters’ contours. In [32], the idiosyncrasy evaluation that experts do on handwriting excerpts guides the training of a reinforcement learning model, designed to detect sequences of handwriting excerpts of maximal idiosyncrasy scores. In turn, these excerpts are the data upon which the convolutional features that feed the deep classifier are computed. 
The authors of [33] use path integral signatures and SIFT features to represent local properties of the handwritten shapes in a translation-free, scaling-free and reparameterization-free way. Then, codebook learning techniques are applied to these base features in order to extract global feature vectors of as limited a dimension as possible. In [34], a more diverse and writing-style-specific set of graphometric features is employed to represent the handwriting word-wise. Next, principal component analysis (PCA) is used to affinely map the data to a lower dimensional space and a shallow network is trained to perform the writer classification task. The authors of [35] employ a deep convolutional neural network, based on modified pre-trained networks, that is designed and developed to extract features from raw data hierarchically.

Though the aforementioned approaches vary a lot in the way that graphological and/or geometrical representations of the handwriting are fused within the deep learning prototype, they share a common intrinsic restriction due to the necessity of a training procedure. The classification procedure either is independent of the identity of the selected feature space or imposes restrictions on the induced deep features. To the first class belong methods like [33, 34] and, partially, [25] and [32], which do not involve the target classification in the feature extraction. To the second class belong methods analogous to [26, 29] and [30], where the deep features are trained together so as to optimize the classifier's accuracy.

A summary and the novelties of the introduced methodology

Considering the aforementioned state of the art, we believe that both the methodology presented here as a whole and various individual aspects of it are substantially novel. In particular:

a.
We form proper bundles of optimally fit realizations of every alphabet symbol, separately (see Optimal Matching of Any Two Realizations of the Same Alphabet Symbol Section).
b.
We render all contours of each such bundle equinumerous, in the sense that we guarantee that they all consist of the same number of pixels, without practically influencing their morphology (see “The “Ideal Representative” of an Alphabet Symbol, Concerning a Document of a Specific Writer” Section).
c.
We evaluate the curvature at the center of each pixel of all contours, via proper polynomial approximation of the contour. Next, we define a one-to-one correspondence between pixels with closely matching curvature values (see “The “Ideal Representative” of an Alphabet Symbol, Concerning a Document of a Specific Writer” Section).
d.
Among all pixels of common curvature, we evaluate the mean value of the x and y coordinates, thus obtaining an average curve, which we call “First-Version Representative” (see “The “Ideal Representative” of an Alphabet Symbol, Concerning a Document of a Specific Writer” Section).
e.
By repeating the aforementioned procedure in connection with all these First-Version Representatives, we obtain a final “Ideal Representative”, which we firmly believe best represents the form of the alphabet symbol the writer had in mind when writing the specific document (see “The “Ideal Representative” of an Alphabet Symbol, Concerning a Document of a Specific Writer” Section).
f.
For each alphabet symbol separately, we compare the Ideal Representatives that correspond to two different document parts and then, we apply a novel statistical procedure for deciding if these two documents have been written by the same hand or not (see “Identification of Handwriting Based on the Ideal Representatives” Section and “A Statistical Approach for Identifying Rigas Feraios’ Handwriting” Section).
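Items c and d above rely on a curvature value at each contour pixel. As a minimal sketch of how such a value could be computed, the fragment below fits a quadratic polynomial to each coordinate in a sliding window along a closed contour and evaluates the signed curvature from the fitted derivatives. The function name `contour_curvature`, the quadratic degree and the window size are assumptions made for illustration; the summary above does not fix them.

```python
import numpy as np

def contour_curvature(points, half_window=5):
    """Signed curvature at each point of a closed contour.

    Hypothetical helper: a local quadratic least-squares fit of x(t) and
    y(t) is used as the "polynomial approximation"; degree and window
    size are illustrative assumptions.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    offsets = np.arange(-half_window, half_window + 1)
    t = offsets.astype(float)
    curvature = np.empty(n)
    for k in range(n):
        idx = (k + offsets) % n              # wrap around: the contour is closed
        ax = np.polyfit(t, pts[idx, 0], 2)   # coefficients [a2, a1, a0] of x(t)
        ay = np.polyfit(t, pts[idx, 1], 2)   # coefficients [a2, a1, a0] of y(t)
        dx, ddx = ax[1], 2.0 * ax[0]         # first and second derivative at t = 0
        dy, ddy = ay[1], 2.0 * ay[0]
        curvature[k] = (dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5
    return curvature
```

On a counter-clockwise circle of radius r sampled densely enough, the returned values should be close to 1/r, which is a convenient sanity check before applying the function to letter contours.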

We would like to emphasize that the entire aforementioned procedure is applied to a properly selected set of eleven (11) Greek alphabet symbols. The reason for and the method of this selection are explained at the end of the present Section.

In conclusion, the aforementioned procedure allows for rigorously determining rules for the maximum likelihood classification of documents into writing hands. This formulation of the writer identification task is totally dataset-independent, since the representation of the writing style via the corresponding Ideal Representatives and the statistical classification of the documents apply to any handwriting classification task without any need for training. Finally, there is no need to know in advance the number of independent writing hands that the classification seeks, which is an intrinsic necessity of the supervised approaches based on deep learning.

A first, concise flowchart of the methodology previously described, is given below (Fig. 1).

We would like to emphasize that, although all aforementioned actions, and especially Action 1, may be applied to all alphabet symbols, we have restricted their application to a rather limited number of letters, according to the following reasoning:

i.
The selected alphabet symbols must manifest considerable complexity, so that they may sufficiently characterize the writer. Thus, for example, the Greek “ι” (“iota”), “κ” (“kappa”), “ν” (“nu”), “ο” (“omikron”), “τ” (“tau”) etc. are rather easily reproducible, and so we expect them to be correspondingly similar among different writers.
ii.
At the same time, the chosen letters must have a relatively high frequency of appearance in Greek documents, so that their processing offers statistically significant results. Therefore, for instance, alphabet symbols such as “ζ” (“zeta”), “ξ” (“xi”), “φ” (“phi”), “ψ” (“psi”) etc. have a very promising complexity but, as a rule, are rarely encountered in pages of Greek documents and, consequently, we do not expect them to offer statistically significant results.

We would like to stress that the introduced approach is applicable at least to the Latin alphabet symbols too, as we shall demonstrate in future manuscripts.

In addition, we must make clear that the method presented here by no means gives an answer concerning the authorship of these literary works. Rather, the goal of the present research is to test the hypothesis that two or more documents have been written by the same hand. In order to test the authorship of a certain document, it is necessary to compare its style of writing with other texts that are known to belong to a specific writer; this requires another, substantially different, approach in the disciplines of Mathematics and Computer Engineering. We plan to deal with this problem, in connection with the aforementioned documents, in a future manuscript.

A first stage processing of the realizations of any alphabet symbol

We consider a part $PD$ of any document $D$, where $PD$ includes a sufficient number of realizations of the alphabet symbol, which is each time treated; the term “sufficient number” will be clarified in the analysis that follows. Moreover, we stress that in the subsequent presentation, we shall use alphabet symbol “α” and its realizations as a generic representative of all alphabet symbols and their realizations. Thus, when, in the following, a statement and/or a method is reported in connection with letter “α”, this means that the corresponding approach holds true for an arbitrary alphabet symbol, too.

The steps of this processing are briefly described below, and the obtained results are shown in the corresponding images:

Step-FSP.1

In order to semi-automatically isolate the realization of any, but specific, alphabet symbol, we have proceeded as follows:

First, we have applied a method quite similar to the one introduced in [36–38], so as to determine the different document lines and then to divide each line into connected components. Subsequently, we have determined the points of change of letters’ realizations, following [36] again, and finally we have isolated the desired realization and embedded it alone into a frame, say ${F}_{L}$ (see Fig. 2).

Step-FSP.2

An automatic image segmentation method developed by the authors ([38]), has been applied to each one of the letters’ frames ${F}_{L}$ obtained in Step-FSP.1.

Step-FSP.3

The contour of each isolated letter has been automatically extracted by means of a method specifically developed by the authors (see Figs. 3, 4), always in ${F}_{L}$. We would like to emphasize that the obtained contours always constitute a union of closed simple Jordan curves, which separate the interior of the letter (its body) from its background. Thus, for example, most realizations of alphabet symbols “σ”, “β”, “θ”, “ρ” etc. are delimited by an external simple Jordan curve and at least one internal curve, enclosed by the first borderline (e.g. see Figs. 4, 5, 6).

Optimal matching of any two realizations of the same alphabet symbol

In this Section, we will state a first criterion of optimal matching (OM) of any two realizations of the same alphabet symbol, belonging either to the same document or to two different ones. In fact, this action includes the following steps:

Step-OM.1

We apply all steps referred to in the previous “A First Stage Processing of the Realizations of Any Alphabet Symbol” Section, thus obtaining the contours of all realizations of an arbitrary alphabet symbol “α” in part $PD$ of a studied document. We randomly choose a realization of “α”, which we will call ${a}_{1}$, with a corresponding contour ${C}_{1}^{a}$ and we temporarily let it be a prototype one.

We parallel translate ${C}_{1}^{a}$ so as its center of mass coincides with the center $\left({x}_{n},{y}_{n}\right)$ of the corresponding frame.

Step-OM.2

We let all other realizations of “α” in $PD$ optimally fit ${a}_{1}$, by applying the subsequent affine transformations:

A)
Let ${a}_{i}$ be an arbitrary realization in $PD$ with contour ${C}_{i}^{a}$ consisting of ${N}_{i}$ pixels, each one having center coordinates $\left({x}_{i,j},{y}_{i,j}\right)$, where $j$ is the index of the contour pixel in question. We parallel translate ${C}_{i}^{a}$ so that its center of mass coincides with $\left({x}_{n},{y}_{n}\right)$.
B)
Next, using the well-known rotation matrix $R=\left[\begin{array}{cc}\cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{array}\right]$, we rotate ${C}_{i}^{a}$ by the standard method, after translating its center of mass to the origin $O\left(0,0\right)$. We symbolize the pixels of this rotated version of ${C}_{i}^{a}$ as $\left({x}_{i,j}^{{R}_{O}},{y}_{i,j}^{{R}_{O}}\right)$, where we have employed superscript ${R}_{O}$, in order to indicate that the rotation takes place around the origin $O$.

C)
Subsequently, we apply scaling to ${C}_{i}^{{R}_{O}}$ by factor $\lambda$ via the formula $\left({x}_{i,j}^{S{R}_{O}},{y}_{i,j}^{S{R}_{O}}\right)=\lambda \left({x}_{i,j}^{{R}_{O}},{y}_{i,j}^{{R}_{O}}\right)$.
D)
We re-evaluate the center of mass of pixels $\left({x}_{i,j}^{S{R}_{O}},{y}_{i,j}^{S{R}_{O}}\right)$ and we move it back to $\left({x}_{n},{y}_{n}\right)$, by the proper parallel translation.
E)
We parallel translate all pixels $\left({x}_{i,j}^{S{R}_{O}},{y}_{i,j}^{S{R}_{O}}\right)$ by the vector $\left(dx,dy\right)$.
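The five transformations of Step-OM.2 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `transform_contour` and its argument names are hypothetical, and the contour is assumed to be given as an array of pixel-center coordinates.

```python
import numpy as np

def transform_contour(points, phi, lam, dx, dy, frame_center):
    """Rotate, scale and translate a contour as in Step-OM.2 (sketch).

    points       : (N, 2) array of pixel-center coordinates (assumed input format)
    phi, lam     : rotation angle in radians and scaling factor
    dx, dy       : final translation vector
    frame_center : the frame center (x_n, y_n)
    """
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)                 # center of mass moved to the origin
    c, s = np.cos(phi), np.sin(phi)
    rotation = np.array([[c, -s], [s, c]])
    rotated = centered @ rotation.T                   # rotation around the origin
    scaled = lam * rotated                            # scaling by lambda
    recentered = scaled - scaled.mean(axis=0) + np.asarray(frame_center, dtype=float)
    return recentered + np.array([dx, dy], dtype=float)  # final translation by (dx, dy)
```

Because rotation and scaling are applied about the center of mass, the initial translation to the frame center and the later re-centering collapse into a single step in this sketch; the result is the same set of coordinates.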

Step-OM.3

For each quadruple $\left(\varphi ,\lambda ,dx,dy\right)$, restricted in a properly chosen 4-cube, we let ${C}_{i}^{SRT}$ be the corresponding Scaled, Rotated and Translated version of ${C}_{i}^{a}$, as it has been obtained by the two aforementioned steps OM.1 and OM.2.

Next, we use the following symbolism:

a)
Let $D({C}_{i}^{SRT})$ be the domain enclosed by contour ${C}_{i}^{SRT}$, i.e. its interior.
b)
Similarly, let $D({C}_{1}^{a})$ be the interior of the contour that played the role of the fixed prototype.
c)
For any closed domain, say $D$, in the plane we symbolize its area via $Area(D)$.

Then, we define the subsequent similarity criterion, abbreviated as SC, between “α” realizations ${C}_{1}^{a}$ and ${C}_{i}^{a}$. Actually, this similarity criterion consists in employing ${C}_{i}^{SRT}$ and ${C}_{1}^{a}$ and evaluating quantity

$$\kappa \left(\varphi ,\lambda ,dx,dy\right)=\frac{Area\left(D\left({C}_{i}^{SRT}\right)\cap D\left({C}_{1}^{a}\right)\right)}{Area\left(D\left({C}_{i}^{SRT}\right)\cup D\left({C}_{1}^{a}\right)\right)} \tag{1}$$

Quantity $\kappa \left(\varphi ,\lambda ,dx,dy\right)$ constitutes a reliable measure of similarity between contours ${C}_{i}^{SRT}$ and ${C}_{1}^{a}$.
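The quantity in Eq. (1) is the ratio of the overlap area to the joint area of the two letter bodies. As a minimal sketch, assuming the two interiors have already been rasterized into boolean masks on the same frame (the rasterization step and the function name `kappa` are assumptions, not part of the described method), it can be evaluated as follows:

```python
import numpy as np

def kappa(mask_i, mask_1):
    """Area of intersection over area of union of two filled letter bodies.

    mask_i, mask_1 : boolean arrays of identical shape, True inside the contour
                     (assumed to be rasterized beforehand on a common frame).
    """
    a = np.asarray(mask_i, dtype=bool)
    b = np.asarray(mask_1, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0                      # both bodies empty: define the similarity as 0
    return float(np.logical_and(a, b).sum() / union)
```

The value lies in [0, 1] and equals 1 only when the two bodies coincide pixel for pixel.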

Step-OM.4

Among all computed values $\kappa \left(\varphi ,\lambda ,dx,dy\right)$ in the chosen 4-cube, we select the maximum one, corresponding to the quadruple, say, $\left({\varphi }^{max},{\lambda }^{max},d{x}^{max},d{y}^{max}\right)$. We symbolize this maximum value of κ as ${\kappa }^{max}$; i.e., ${\kappa }^{max}=\kappa \left({\varphi }^{max},{\lambda }^{max},d{x}^{max},d{y}^{max}\right)$.
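The maximization in Step-OM.4 amounts to an exhaustive search over a discretized 4-cube. The sketch below assumes that the cube is sampled on a regular grid and that a callable returning the similarity for a given quadruple is available; the names `best_quadruple` and `score` are hypothetical, and the grid resolution is left to the caller.

```python
from itertools import product

def best_quadruple(score, phis, lams, dxs, dys):
    """Exhaustive search for the quadruple maximizing a similarity score.

    score : callable (phi, lam, dx, dy) -> float, e.g. the similarity of Eq. (1)
    phis, lams, dxs, dys : iterables with the sampled values of each parameter
    """
    best_params, best_value = None, float("-inf")
    for params in product(phis, lams, dxs, dys):
        value = score(*params)
        if value > best_value:          # keep the first maximizer encountered
            best_params, best_value = params, value
    return best_params, best_value
```

The cost grows with the product of the four grid sizes, so in practice one would probably keep the grids coarse at first and refine them around the best quadruple found.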

Step-OM.5

We apply to ${C}_{i}^{a}$ the transformations of Step-OM.2 with the optimal parameters, namely rotation by ${\varphi }^{max}$, scaling by ${\lambda }^{max}$ and parallel translation by $\left(d{x}^{max},d{y}^{max}\right)$, thus obtaining ${C}_{i,OM}^{SRT}$, which Optimally Matches ${C}_{1}^{a}$ (see Figs. 7, 8, 9). Accordingly, ${\kappa }^{max}$ constitutes a very reliable measure of similarity of the two original contours ${C}_{i}^{a}$ and ${C}_{1}^{a}$.

The “ideal representative” of an alphabet symbol, concerning a document of a specific writer

In the present Section, we shall employ the previous analysis, together with a novel approach, in order to evaluate the “ideal representative”, or the “platonic prototype”, of an arbitrary alphabet symbol associated with any document part PD. We assume that PD is written by the same hand; we note that we may test this hypothesis before applying the process that follows. In the subsequent sections, we shall demonstrate that this ideal representative is a very powerful tool for writer identification. An associated flowchart for the Ideal Representative computation is given in Fig. 10 below.

We use the names “ideal” or “platonic” representatives for the following reasons:

a)
We assume that each writer has one or at least a small number of ideal shapes in his/her mind, when writing a document. Evidently, this ideal shape may change during long periods of his/her life; however, we plausibly assume that, in most cases, the process of writing a specific document has a relatively small duration, as far as the letters’ ideal shapes are concerned.
b)
When a person renders a specific alphabet symbol on a writing material, the resulting letter realization is a disturbed version of the corresponding ideal shape the writer has in his/her mind. The reasons that cause this disturbance are usually numerous: e.g., the psychological mood of the writer, her/his fatigue, the interaction of the writing instrument with the writing material, possible defects of the writing materials and surfaces, the writer’s age and perhaps many more.
c)
However, we may classify the discrepancies among the realizations of the same alphabet symbol generated by the same writer in two classes: i) the causal ones and ii) the erratic ones. The class of the causal discrepancies includes the size and the orientation of the realization as well as its different position in PD each time. The erratic discrepancies are mainly due to the reasons described in b) above.
d)
In order to account for the causal discrepancies, we applied the optimal fitting process described in “Optimal Matching of Any Two Realizations of the Same Alphabet Symbol” Section.

The process described in the present Section, aims at reducing the erratic discrepancies. This is achieved by a proper averaging process, given that averaging of $N$ optimally fit curves reduces the average distance of these curves from the prototype one that has “generated” them, by a factor of $\sqrt{N}$, each time.
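The $\sqrt{N}$ reduction mentioned above can be checked numerically under simple assumptions. The sketch below assumes that each realization equals a common prototype curve plus independent, zero-mean Gaussian noise on every point, and that point correspondences are already established; these are idealizations chosen for illustration, and the prototype (a unit circle), the noise level and the number of trials are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
prototype = np.stack([np.cos(t), np.sin(t)], axis=1)  # hypothetical "ideal" contour

def rms_distance(curve, reference):
    """Root-mean-square point-to-point distance between two corresponding curves."""
    return float(np.sqrt(np.mean(np.sum((curve - reference) ** 2, axis=1))))

N, sigma, trials = 16, 0.05, 200
single_err, mean_err = [], []
for _ in range(trials):
    # N noisy realizations of the same prototype.
    copies = prototype + rng.normal(0.0, sigma, size=(N, *prototype.shape))
    single_err.append(rms_distance(copies[0], prototype))
    mean_err.append(rms_distance(copies.mean(axis=0), prototype))

# Ratio of the typical error of one realization to that of the averaged curve.
ratio = np.mean(single_err) / np.mean(mean_err)
```

With $N = 16$ the ratio comes out close to $\sqrt{16} = 4$. Correlated or systematic disturbances would not average out in this way, which is why the causal discrepancies are removed by optimal matching beforehand.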

We shall apply both methods introduced in the present “The “Ideal Representative” of an Alphabet Symbol, Concerning a Document of a Specific Writer” Section, as well as in “Identification of Handwriting Based on the Ideal Representatives” Section, to a selected subset of eleven (11) Greek alphabet symbols, and more specifically to letters “α”, “β”, “γ”, “ε”, “θ”, “λ”, “μ”, “π”, “ρ”, “σ”, “ω”, following the analysis made at the end of “A Summary and the Novelties of the Introduced Methodology” Section. We repeat here the two criteria that guided our choice:

a)
We have chosen to treat alphabet symbols which, we a priori expect, better convey the writing idiosyncrasies of a hand,
b)
We have focused our methodology on letters that have a significant frequency of appearance in all tested documents.

Suppressing the erratic discrepancies among realizations of the same letter

Consider an arbitrary alphabet symbol, symbolized as “α” and its realizations in PD, where we know a priori and/or we have verified that it has been written by a single hand. We optimally suppress both the causal and the erratic differences among the realizations of “α” in PD, so as to evaluate the “Ideal Representative” (IR) of “α” in this document. In order to achieve that, we apply the procedure consisting of the following steps:

Step-IR.1

We arbitrarily choose a first realization of “α”, say “${\mathrm{\alpha }}_{1}$”, with contour ${C}^{\alpha ,1}$, and we perform all pairwise optimal matchings of “${\mathrm{\alpha }}_{1}$” with the other realizations ${\mathrm{\alpha }}_{\mathrm{i}}$ of “α” in PD, with border lines ${C}^{\alpha ,i}$, where $\mathrm{i}=2, 3,..., {\mathrm{N}}^{\mathrm{\alpha }}$. Each such optimal pairwise fitting is accomplished by means of the method presented in “Optimal Matching of Any Two Realizations of the Same Alphabet Symbol” Section, where “${\mathrm{\alpha }}_{1}$” plays the role of the “prototype”-fixed realization, while ${\mathrm{\alpha }}_{\mathrm{i}}$ plays that of the current one. Each such pairwise fitting gives rise to a corresponding scaling factor ${\uplambda }_{\mathrm{i}}^{1}$, where superscript 1 stands for the cardinal number of the fixed letter “${\mathrm{\alpha }}_{1}$” and subscript $i$ indicates the cardinal number of the current letter “${\alpha }_{i}$” optimally fit to “${\mathrm{\alpha }}_{1}$”.

Step-IR.2

We repeat this procedure by successively letting all other ${\mathrm{\alpha }}_{\mathrm{i}}$ realizations in $\mathrm{PD}$, $\mathrm{i}=\mathrm{2,3},\dots ,{\mathrm{N}}^{\mathrm{\alpha }}$, play the role of the fixed realization; in this way, we obtain a double matrix ${\Lambda }_{\mathrm{j}}^{\mathrm{i}}$, having as entries the scaling factors ${\uplambda }_{\mathrm{j}}^{\mathrm{i}}$, which correspond to the optimal fitting of the letter ${\mathrm{\alpha }}_{\mathrm{j}}$ to ${\mathrm{\alpha }}_{\mathrm{i}}$, where $\mathrm{j}=\mathrm{1,2},3,\dots ,{\mathrm{N}}^{\mathrm{\alpha }}$, with $\mathrm{j}\ne \mathrm{i}$.

We confine ourselves to those entries of matrix ${\Lambda }_{\mathrm{j}}^{\mathrm{i}}$, which correspond to scaling factors ${\uplambda }_{\mathrm{j}}^{\mathrm{i}}$ belonging in the interval $\left[\mathrm{0.55,1.8}\right]$; we would like to emphasize that all alphabet symbol realizations appearing in the documents treated in the present work, as well as in Byzantine codices [4], do belong in this class. All the same, if a larger scaling factor interval is necessary to be considered, we firmly believe that the introduced approach remains totally applicable. Eventually, for each realization ${\mathrm{\alpha }}_{\mathrm{i}}$, when it plays the role of the prototype-fixed one, we obtain a bundle ${B}^{\alpha ,i}$, consisting of all letter realizations ${\mathrm{\alpha }}_{\mathrm{j}}$ optimally fit to ${\mathrm{\alpha }}_{\mathrm{i}}$ (see Fig. 11). We repeat with emphasis that the aforementioned optimal matching is achieved using the contours ${C}^{\alpha ,i}$ of each realization ${\mathrm{\alpha }}_{\mathrm{i}}$, as described in “Optimal Matching of Any Two Realizations of the Same Alphabet Symbol” Section.

Step-IR.3

We consider that the border ${C}^{\alpha ,i}$ of an arbitrary realization ${\alpha }_{i}$ is a union of closed simple Jordan curves, always enumerated in the following way: ${\Gamma }_{1}^{\alpha ,i}$ is the closed outer contour of ${\alpha }_{i}$, ${\Gamma }_{2}^{\alpha ,i}$ is a possible first internal contour, if it exists, ${\Gamma }_{3}^{\alpha ,i}$ a possible second internal contour, etc. We stress that, for a specific alphabet symbol, all ${\Gamma }_{\rho }^{\alpha ,i}, \rho =\mathrm{1,2},3,\dots$ curves are topologically analogous.

Consider the arbitrary realization ${\alpha }_{i}$, with contour ${C}^{\alpha ,i}$, which momentarily plays the role of the prototype-fixed curve; for example, in Fig. 11c, the bundle ${B}^{\theta ,i}$ is presented, where the prototype-fixed contour of ${\theta }_{i}$ is shown with a wider line. The ${C}^{\theta ,i}$ contour consists of three (3) closed simple curves ${\Gamma }_{\rho }^{\theta ,i}$, where $\rho =\mathrm{1,2},3$, presented with wider blue, magenta, and green lines respectively. As a consequence, bundle ${B}^{\theta ,i}$ includes three (3) sub-bundles $B{\Gamma }_{\rho ,j}^{\theta ,i}, \rho =\mathrm{1,2},3$, where subscript j expresses the cardinal number of the arbitrary $j-th$ contour ${C}^{\theta ,i}$. Evidently, for each letter α and momentarily prototype realization ${\alpha }_{i}$, bundle ${B}^{\alpha ,i}$ is defined in an analogous manner, and it is the union of sub-bundles $B{\Gamma }_{\rho ,j}^{a,i}$, $j=\mathrm{1,2},\dots ,{N}^{\alpha }$.

For all elements of bundle ${B}^{\alpha ,i}$ and for each contour curve, say $B{\Gamma }_{\rho ,j}^{\alpha ,i},\uprho =\mathrm{1,2},...,$ we evaluate the number of pixels ${\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}$, where subscript j expresses the cardinal number of each “current” realization ${\alpha }_{j}$ which is optimally fit to ${\mathrm{\alpha }}_{\mathrm{i}}$. Among all curves of the considered sub-bundle $B{\Gamma }_{\rho }^{\alpha ,i}$, in connection with each realization ${\mathrm{a}}_{\mathrm{j}} \in {\mathrm{B}}^{\alpha ,i}$, we choose the one which is nearest to the integer part $\left[\frac{1}{3}*\mathrm{median}\left({\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\right)\right],\mathrm{ j}=\mathrm{1,2},\dots ,{\mathrm{N}}^{\mathrm{\alpha }},\mathrm{ i}\ne \mathrm{j}$; we symbolize this letter realization as ${\mathrm{\alpha }}_{\mathrm{i},\mathrm{M}}$, consisting of ${\mathrm{L}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },i}$ pixels (see Fig. 11). Subsequently, and always in connection with each ${\Gamma }_{\rho }^{\alpha ,i}$ separately, we proceed as follows:

i.
we keep those realizations of sub-bundle $B{\Gamma }_{\rho }^{\alpha ,i}$, the number of pixels of which is equal to or greater than ${\mathrm{L}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },i}$; for the collection of these optimally fit sub-contours, we use the symbol ${\mathrm{RB}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },\mathrm{i}}$, including, say ${\mathrm{N}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },i}$ elements, where symbol RB stands for Restricted Bundle.
ii.
For each element (contour) ${\Gamma }_{\rho ,j}^{\alpha ,i}\in {\mathrm{RB}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },\mathrm{i}}, j>M$, we compute the integer part of ${\mathrm{p}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}$, where ${\mathrm{p}}_{\uprho ,\mathrm{j}}^{\mathrm{\alpha },i}=\frac{{\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },\mathrm{i}}}{{\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}-{\mathrm{L}}_{\uprho ,\mathrm{M}}^{\mathrm{\alpha },i}}$.
iii.
In connection with each ${\Gamma }_{\rho ,j}^{\alpha ,i}\in {\mathrm{RB}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },\mathrm{i}}$, we remove the pixel with cardinal number $\left({\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },\mathrm{i}}-{\mathrm{p}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\right)$ from the considered contour, bringing its two adjacent pixels in contact, so as the final digital curve “has no holes”. Next, we remove the pixel with cardinal number $\left({\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },\mathrm{i}}-2{\mathrm{p}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\right)$, from the very same contour ${\Gamma }_{\rho ,j}^{\alpha ,i}$, bridging the resulting gap once more and so on.
iv.
We continue removing pixels $\left({\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },\mathrm{i}}-m*{\mathrm{p}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\right), m=3, 4, \dots$ from ${\Gamma }_{\rho ,j}^{\alpha ,i}$, following the procedure of step iii. The process is terminated at the $\mathrm{r}-\mathrm{th}$ removal, provided that $\left({\mathrm{L}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },\mathrm{i}}-\left(\mathrm{r}+1\right)*{\mathrm{p}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\right)<0$ holds.
v.
We close the digital curve, say ${R\Gamma }_{\rho ,j}^{\alpha ,i}$, obtained by this method, by connecting the last and first pixel of ${R\Gamma }_{\rho ,j}^{\alpha ,i}$.

We emphasize that, by the end of this procedure, all reduced members of ${\mathrm{RB}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },\mathrm{i}}$ have contours ${R\Gamma }_{\rho ,j}^{\alpha ,i}$ with the same number of pixels ${\mathrm{L}}_{\rho }^{\mathrm{\alpha },\mathrm{i}}$; as a rule, the number of pixels ${\mathrm{L}}_{\rho }^{\mathrm{\alpha },\mathrm{i}}$ concerning a specific contour ${R\Gamma }_{\rho ,j}^{\alpha ,i}$ depends on ${\Gamma }_{\rho ,M}^{\alpha ,i}$ (see Fig. 12). For this new version of the optimally fit realizations’ contours, we shall employ the symbol ${\mathrm{RB}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },\mathrm{i}}$, where the additional letter E in this symbolism expresses the fact that all members of bundle ${\mathrm{RB}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },\mathrm{i}}$ consist of digital curves with the same number of pixels. In a straightforward extension, we shall employ the symbol ${\mathrm{\alpha }}_{\rho ,j}^{\mathrm{RE},\mathrm{i}}$ for each member of this restricted bundle.
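The goal of sub-steps i to v is a bundle whose contours all have the same number of pixels. The sketch below illustrates that goal with a simplified stand-in rather than the exact stride-based removal described above: it keeps a prescribed number of pixels spread evenly along each contour, which guarantees the target length. The reference length `ref_len`, the helper names and the toy circular contours are hypothetical.

```python
import numpy as np

def reduce_contour(contour, target_len):
    """Keep target_len pixels spread evenly along the contour, preserving their order."""
    n = len(contour)
    if not 0 < target_len <= n:
        raise ValueError("target length must lie between 1 and the contour length")
    # Integer arithmetic: index k maps to floor(k * n / target_len), strictly increasing.
    keep = (np.arange(target_len) * n) // target_len
    return contour[keep]

def equalize_bundle(bundle, ref_len):
    """Discard contours shorter than ref_len and reduce the others to ref_len pixels."""
    restricted = [c for c in bundle if len(c) >= ref_len]  # the restricted bundle
    return [reduce_contour(c, ref_len) for c in restricted]

def circle(n_pixels, radius=20.0):
    """Toy closed contour sampled at n_pixels points (illustration only)."""
    t = np.linspace(0.0, 2.0 * np.pi, n_pixels, endpoint=False)
    return np.stack([radius * np.cos(t), radius * np.sin(t)], axis=1)

bundle = [circle(n) for n in (80, 96, 100, 131, 150)]
equalized = equalize_bundle(bundle, ref_len=96)
```

Here the contour with 80 pixels is excluded and the remaining four are brought to 96 pixels each. Because the removed pixels are spread along the whole curve, the overall shape is retained.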

Step-IR.4

For each realization ${R\Gamma }_{\rho ,j}^{\alpha ,i}\in {\mathrm{R{\rm B}}}_{\rho ,j}^{\mathrm{\alpha },i}$, we accomplish the following:

a) We evaluate the arc length ${\mathrm{s}}_{\rho ,\mathrm{j}}^{\alpha ,i}\left(\mathrm{k}\right)$ for all pixels $k$ of ${R\Gamma }_{\rho ,j}^{\alpha ,i}$, in connection with the arbitrary-generic alphabet symbol “α”; we emphasize that length ${\mathrm{s}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\left(\mathrm{k}\right)$ is evaluated recursively, starting from the origin of this discrete curve, i.e. from $k=1$. We recall that, so far, the origin has been arbitrarily generated from the process of image segmentation and contour extraction of each symbol realization.
b) We optimally approximate each contour ${R\Gamma }_{\rho ,j}^{\alpha ,i}$ with two polynomials, one for the $x$ and one for the $y$ coordinate of each pixel $k$, as functions of the arc length ${\mathrm{s}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\left(\mathrm{k}\right)$. The degree of the polynomials has been chosen to be twenty-one (21) for the external (or the entire) contour ${R\Gamma }_{1,j}^{\alpha ,i}$ and eleven (11) for the internal contours ${R\Gamma }_{\rho ,j}^{\alpha ,i}, \rho =\mathrm{2,3},\dots$. These degrees have proved to be very efficient and have offered small error-distances and fluctuations, in connection with all studied alphabet symbols (see Fig. 13).
c) We have computed the curvature of the aforementioned approximating polynomials and we have considered it to be the curvature of the actual contour at the specific point (see Fig. 14a).
d) We have optimally fit the obtained sequences of curvatures for all ${R\Gamma }_{\rho ,j}^{\alpha ,i} \in {\mathrm{R{\rm B}}}_{\rho ,j}^{\mathrm{\alpha },i}$, for a specific $\rho$ and $j=1, 2, \dots , {\mathrm{N}}_{\rho ,\mathrm{M}}^{\mathrm{\alpha },i}$. This optimal fit has been realized by the quite standard method of shifting the curvature sequence of each ${R\Gamma }_{\rho ,j}^{\alpha ,i}$ realization, evaluating each distance from the prototype one and keeping the position that offers the minimum error distance (see Fig. 14b).
e) Finally, we have re-enumerated all pixels $k$ of every considered contour ${R\Gamma }_{\rho ,j}^{\alpha ,i}$, so as pixels with the same cardinal number belong to the same class (see Figs. 15, 16).
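Sub-steps a) to d) can be sketched as follows. The code assumes that a contour is an ordered array of $(x, y)$ points, approximates arc length by the cumulative chord length, and fits the polynomials in a Chebyshev basis (`numpy.polynomial.Chebyshev.fit`) purely for numerical stability; the basis and the function names are choices made for this illustration. The curvature is taken from the standard formula $(x'y'' - y'x'')/(x'^2 + y'^2)^{3/2}$, and the alignment tries every cyclic shift of the curvature sequence.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def arc_length(contour):
    """Cumulative chord length along the contour, starting at 0 for the first pixel."""
    steps = np.linalg.norm(np.diff(contour, axis=0), axis=1)
    return np.concatenate([[0.0], np.cumsum(steps)])

def curvature(contour, degree=21):
    """Curvature of the polynomial approximations x(s), y(s) at every contour point."""
    s = arc_length(contour)
    px = Chebyshev.fit(s, contour[:, 0], degree)
    py = Chebyshev.fit(s, contour[:, 1], degree)
    dx, dy = px.deriv(1)(s), py.deriv(1)(s)
    ddx, ddy = px.deriv(2)(s), py.deriv(2)(s)
    return (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5

def best_cyclic_shift(reference, current):
    """Shift k such that np.roll(current, k) is closest to reference in squared error."""
    errors = [np.sum((np.roll(current, k) - reference) ** 2) for k in range(len(current))]
    return int(np.argmin(errors))

# Toy check 1: a circle of radius R traversed counter-clockwise has curvature 1/R.
R = 50.0
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.stack([R * np.cos(t), R * np.sin(t)], axis=1)
kappa = curvature(circle)

# Toy check 2: a sequence rolled by -7 positions is realigned by a shift of 7.
profile = np.random.default_rng(1).normal(size=50)
shift = best_cyclic_shift(profile, np.roll(profile, -7))
```

Once the best shift is known, rolling the pixel sequence of the current contour by the same amount gives the re-enumeration of sub-step e). A non-periodic polynomial does not close up exactly at the seam of a closed contour, so values near the start and end points should be treated with some caution.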

Step-IR.5

After accomplishing re-enumeration of all pixels for each contour of sub-bundle ${\mathrm{RB}}_{\rho ,\mathrm{M}}^{\alpha ,i}$ and for each $\rho$ separately, we have evaluated the mean value of the coordinates $\left({\mathrm{x}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}, {\mathrm{y}}_{\rho ,\mathrm{j}}^{\mathrm{\alpha },i}\right)$, in connection with all pixels sharing the same cardinal number $k$ (see Fig. 17). Namely for each $\rho$-sub-bundle consisting of equinumerous contours, we compute the average value of the coordinates of all pixels that share the same cardinal number. In this way, one obtains an average curve for each contours’ sub-bundle ${R{\rm B}}_{\rho ,{\rm M}}^{\alpha ,i}$, which we symbolize as $F{V}_{\rho }^{\alpha ,i}$ (FV stands for “First Version of the ideal representative”) (see Fig. 17).

Step-IR.6

We repeat steps IR.1 to IR.5 above, by letting all other realizations ${\mathrm{\alpha }}_{\mathrm{j}}, j\ne i$ appearing in PD, play the role of the prototype letter. In this way, we end up with a number ${\mathrm{N}}^{\mathrm{\alpha }}$, of average curves $F{V}^{\alpha ,i}=\cup F{V}_{\rho }^{\alpha ,i}, \rho =\mathrm{1,2},\dots$, which we have already called “first versions of ideal representatives”.

Step-IR.7

We repeat steps IR.1 to IR.5 by letting $F{V}^{\alpha ,i}$ evaluated in step IR.6, play the role of the contours ${C}^{\alpha ,i}$ of all realizations. Among all the resulting bundles of optimally fit digital curves $F{V}^{\alpha ,i}$ we choose the one with the minimum overall fitting error. Let the corresponding average curve be ${\Pi }^{\alpha }=\bigcup {\Pi }_{\rho }^{\alpha }, \rho =\mathrm{1,2},\dots$, where superscript $\alpha$ refers to the letter in hand and subscript $\uprho$ to its contours (the external one and the internals, if any). We assume that this “final” union of curves ${\Pi }^{\alpha }$ is the “ideal” or “platonic” representative of alphabet symbol $\alpha$, for the specific document part $\mathrm{PD}$ (see Fig. 18).

Identification of handwriting based on the ideal representatives

The method previously introduced has been applied to all selected alphabet symbols, namely, “α”, “β”, “γ”, “ε”, “θ”, “λ”, “μ”, “π”, “ρ”, “σ”, “ω”, for the reasons described in “A Summary and the Novelties of the Introduced Methodology” Section. In connection with each one of these symbols, we have extracted the ideal representatives associated with FYSAP, SAGAN, FILIA, SYNTAG and ELVEN. As a next step, we have applied the similarity criterion analytically discussed in “Optimal Matching of Any Two Realizations of the Same Alphabet Symbol” Section, to all pairs of “platonic prototypes”, corresponding to the aforementioned alphabet symbols (see Figs. 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29). The obtained comparison results are excellent as Figs. 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 manifest. We emphasize that in these Figures, we have adopted the following convention:

A)
The contours of the letters of FYSAP are always depicted in blue, the border lines of SAGAN in cyan, the contours of FILIA in green and those of SYNTAG in red.
B)
The pixels of the union of the interiors of two optimally fit ideal representatives are shown in shades of grey, while the points (pixels) of their intersection are always shown in black.

From Figs. 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, we feel that the visual representation of the matching of two ideal representatives of the same alphabet symbol offers a very good indication of whether the letter considered each time comes from the same hand or not. In addition, the similarity criterion introduced in “Optimal Matching of Any Two Realizations of the Same Alphabet Symbol” Section, applied to any two optimally fit platonic prototypes, will be given as a percentage, so as to achieve a clearer, quantitative understanding of the comparison results.

Figures 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 above, together with the corresponding similarity criteria, constitute very strong evidence that documents FYSAP, SAGAN and FILIA have been written by the same hand and in particular by that of Rigas Feraios, while document SYNTAG has not. Given that the content of SYNTAG undoubtedly belongs to Rigas Feraios, it results that the treated document of SYNTAG is a copy made by another hand.

We would like to emphasize that an analogous clear-cut, visual demonstration also holds true in connection with the platonic prototypes’ comparison of FYSAP with ELVEN, as well as of SAGAN and FILIA with ELVEN. In fact, all these optimal matches visually manifest that ELVEN has not been written by Rigas Feraios; actually, the discrepancies between the ideal representatives of FYSAP, SAGAN and FILIA on the one hand and ELVEN on the other are more evident than the corresponding discrepancies concerning SYNTAG.

A statistical approach for identifying Rigas Feraios’ handwriting

A serious problem in verifying whether a number of texts were written by Rigas Feraios is the fact that there is only one document for which we are historically certain that it has been written by his hand. Indeed, the only document that has definitely been written by Rigas’ hand is FYSAP (“Φυσικής Απάνθισμα”, “Compilation of Physics”). As a consequence, we cannot obtain a statistical measure of similarity as reliable as the one that would have resulted from comparing SAGAN and FILIA with two or more documents undoubtedly written by Rigas’ hand.

To circumvent this difficulty as much as possible, we proceeded as follows:

i.
consider any alphabet symbol, say “ε”, a statistically sufficient number of realizations of which, say ${N}^{\varepsilon }$, are found in FYSAP. At a first step, we divide these ${N}^{\varepsilon }$ realizations into groups of nine (9) “ε” realizations randomly chosen from FYSAP. For each one of the selected 9-tuples we apply the method introduced in “The “Ideal Representative” of an Alphabet Symbol, Concerning a Document of a Specific Writer” Section and we extract an ideal representative of the realizations belonging to the first 9-tuple, which we symbolize as “${\upvarepsilon }_{1,i}^{9}$”, where subscript 1 symbolizes the fact that it is the first division in 9-tuples, while the second subscript “i” stands for the cardinal number of the ideal representative of the $(i-th)$ 9-tuple of the current division. We emphasize again that the entire ensemble of “ε” realizations in FYSAP give rise to various, random divisions of them in 9-tuples.
ii.
We optimally match all pairs of these platonic prototypes $\left({\upvarepsilon }_{1,\mathrm{i}}^{9}, {\upvarepsilon }_{1,\mathrm{j}}^{9}\right)$, where $\mathrm{i},\mathrm{ j}=1,\dots , \left[\frac{{\mathrm{N}}^{\upvarepsilon }}{9}\right]$ with $\mathrm{i}\ne \mathrm{j}$. Let $\mathrm{m}= \left[\frac{{\mathrm{N}}^{\upvarepsilon }}{9}\right]$, i.e. the integer part of $\frac{{\mathrm{N}}^{\upvarepsilon }}{9}$; then, the aforementioned optimal matches give rise to $\frac{\mathrm{m}(\mathrm{m}-1)}{2}$ similarity criteria ${\mathrm{SC}}_{1}^{\upvarepsilon 9}\left(\mathrm{i},\mathrm{j}\right)$, with $\mathrm{i}\ne \mathrm{j}$, as evaluated in “A Summary and the Novelties of the Introduced Methodology” Section.
iii.
We repeat steps i and ii above for three (3) more randomly chosen divisions of the ${N}^{\varepsilon }$ “ε” realizations into 9-tuples, thus obtaining three new sets of similarity criteria ${\mathrm{SC}}_{2}^{\upvarepsilon 9}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{3}^{\upvarepsilon 9}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{4}^{\upvarepsilon 9}\left(\mathrm{i},\mathrm{j}\right)$.
iv.
We reapply the three previous steps, dividing all ${N}^{\varepsilon }$ realizations of “ε” into 10-tuples; in this way, we obtain four (4) sets of similarity criteria, ${\mathrm{SC}}_{1}^{\upvarepsilon 10}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{2}^{\upvarepsilon 10}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{3}^{\upvarepsilon 10}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{4}^{\upvarepsilon 10}\left(\mathrm{i},\mathrm{j}\right)$, with $\mathrm{i},\mathrm{j}=1, \dots ,\left[\frac{{\mathrm{N}}^{\upvarepsilon }}{10}\right],\mathrm{ i}\ne \mathrm{j}$.
v.
We repeat step iv after dividing the “ε” realizations into 12-tuples, thus obtaining the sets of similarity criteria ${\mathrm{SC}}_{1}^{\upvarepsilon 12}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{2}^{\upvarepsilon 12}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{3}^{\upvarepsilon 12}\left(\mathrm{i},\mathrm{j}\right)$, ${\mathrm{SC}}_{4}^{\upvarepsilon 12}\left(\mathrm{i},\mathrm{j}\right)$, with $\mathrm{i},\mathrm{j}=1, \dots ,\left[\frac{{\mathrm{N}}^{\upvarepsilon }}{12}\right]$, always with $\mathrm{i}\ne \mathrm{j}$.
vi.
Once more, we apply the aforementioned step, this time for random divisions of the ${\mathrm{N}}^{\upvarepsilon }$ realizations into two (2) practically equinumerous sets of $\left[\frac{{\mathrm{N}}^{\upvarepsilon }}{2}\right]$ distinct elements each. In other words, we randomly choose practically half the realizations of “ε”, we estimate their ideal representative and let it play the role of the prototype letter; we then repeat this process for the other half, whose ideal representative we consider to be the current letter. We apply this approach for three (3) additional random divisions of the ${\mathrm{N}}^{\upvarepsilon }$ realizations into two (2) practical halves; consequently, we obtain four (4) similarity criteria ${\mathrm{SC}}_{\mathrm{k}}^{\mathrm{H\varepsilon }},\mathrm{ k}=1, \dots , 4$.

It is logical to assume that there is an intimate relation between the number of elements of the n-tuple employed each time, on the one hand, and the obtained similarity measures, on the other. For example, one may expect that nine (9) realizations of the same alphabet symbol, say “ε”, convey less information about the overall handwriting style. The 10-tuples convey a slightly greater amount of related information, the 12-tuples slightly more again, while half of the “ε” realizations appearing in the same document convey an even greater amount of information; evidently, the maximum information is obtained when all realizations of the tested alphabet symbol in the studied document are taken into consideration. Thus, for example, in connection with letter “λ”, which has 66 appearances in various pages of FYSAP, a typical sequence of corresponding similarity criteria is presented in Table 1.
vii.
We reapply steps i to vi above for all selected alphabet symbols, presented in “A Summary and the Novelties of the Introduced Methodology” Section.
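The bookkeeping behind steps i to v can be sketched as follows: a random division of the realization indices into n-tuples, followed by the enumeration of all unordered pairs of the resulting groups. The sketch assumes that realizations left over after forming $m = [N/n]$ groups are simply not used in that division, which the text does not specify; the function name `random_division` is hypothetical, and the count of 66 is the number of “λ” appearances in FYSAP quoted above.

```python
import itertools
import random

def random_division(indices, group_size, rng):
    """Randomly split indices into m = len // group_size disjoint groups of equal size."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    m = len(shuffled) // group_size  # integer part of N / n
    return [shuffled[k * group_size:(k + 1) * group_size] for k in range(m)]

rng = random.Random(0)
realizations = range(66)  # indices of the 66 realizations of one letter

# One division into 9-tuples; each group would yield one ideal representative.
groups = random_division(realizations, 9, rng)

# Every unordered pair (i, j), i != j, of groups gives one similarity criterion.
pairs = list(itertools.combinations(range(len(groups)), 2))
```

For 66 realizations and 9-tuples this gives $m = 7$ groups and $7 \cdot 6 / 2 = 21$ pairs per division; repeating the call with a fresh shuffle produces the additional random divisions, and changing `group_size` to 10 or 12 covers steps iv and v.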

Table 1 Manifestation of the way the values of the similarity criterion, between ideal representatives of n-tuples, depend on n


Following this approach, for each selected Greek alphabet symbol separately, we end up with a minimum value $S{C}_{min}^{\alpha }\left(FYSAP\right)$ and a maximum value $S{C}_{max}^{\alpha }\left(FYSAP\right)$ of all the aforementioned similarity criteria associated with the document “Φυσικής Απάνθισμα”, for which it is historically certain that it has been written by Rigas’ hand. We assume that if the similarity criterion of FYSAP with a handwritten document, say PD, lies inside the interval $\left[S{C}_{min}^{\alpha }\left(FYSAP\right),S{C}_{max}^{\alpha }\left(FYSAP\right)\right]$, then this indicates that the unknown document PD has been written by Rigas, as far as the considered alphabet symbol is concerned.

If for all selected alphabet symbols, it holds that $S{C}^{\alpha }(FYSAP,PD)\in \left[S{C}_{min}^{\alpha }\left(FYSAP\right),S{C}_{max}^{\alpha }\left(FYSAP\right)\right]$, then we deduce that PD has been written by Rigas Feraios. Otherwise, we may make a statistical estimation of the likelihood that a document, say PD, has not been written by the hand of Rigas by first evaluating the similarity criterion of each letter’s platonic prototype from the corresponding one of FYSAP and at a second step, by computing the distance of this similarity criterion from the aforementioned intervals (see Table 2).
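The decision rule just described can be sketched as a per-letter interval test. The numbers below are invented placeholders that only show the mechanics; they are not values from Table 2, and the function names are hypothetical.

```python
def distance_from_interval(value, low, high):
    """Zero if value lies in [low, high]; otherwise the gap to the nearest endpoint."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0

def compare_with_reference(sc_by_letter, intervals):
    """Distance of each letter's similarity criterion from its reference interval."""
    return {
        letter: distance_from_interval(sc, *intervals[letter])
        for letter, sc in sc_by_letter.items()
    }

# Hypothetical [SC_min, SC_max] intervals obtained from the reference document.
intervals = {"α": (0.80, 0.92), "ε": (0.78, 0.90)}
# Hypothetical similarity criteria between the reference document and a tested document.
tested = {"α": 0.85, "ε": 0.70}

distances = compare_with_reference(tested, intervals)
same_hand = all(d == 0.0 for d in distances.values())
```

In this made-up case “α” falls inside its interval while “ε” lies about 0.08 below the lower endpoint, so the tested document would not pass the all-letters condition, and the non-zero distances would feed the likelihood estimate mentioned above.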

Table 2 Similarity Criteria of the Ideal Representatives


Conclusion

In the present work, the notion of the “Ideal Representative” (or “Platonic Prototype”) and a method for its evaluation are introduced, in connection with any alphabet symbol appearing in an arbitrary document part, say PD. We have shown that this ensemble of curves very well represents the associated ideal alphabet symbol that a writer had in mind when writing PD by hand. The stimulus for this approach was to answer the question, important for many nations, of whether two handwritten documents discovered in Romania in 1998 by Chisacof, namely the “Saganaki of Madness” (SAGAN) and the “Tested Friendship” (FILIA), have been written by the hand of Rigas Feraios.

Thus, we have explicitly evaluated and compared the Ideal Representatives of eleven (11), properly selected, alphabet symbols, associated with the aforementioned two documents, SAGAN and FILIA, but also with “Compilation of Physics” (FYSAP), which from the historical point of view, has been undoubtedly written by the hand of Rigas Feraios, “Constitution of the Greek State” (SYNTAG) and a document (ELVEN) written by another important Greek Politician, namely Eleftherios Venizelos. Therefore, we have reached the conclusion that the two newly discovered documents SAGAN and FILIA have indeed been written by the hand of Rigas Feraios, while the celebrated “Constitution of the Greek State” has not, although its content undoubtedly belongs to Rigas. Moreover, the Ideal Representatives of the eleven (11) selected letters obtained from document ELVEN manifest serious and statistically important differences from the three documents belonging to Rigas.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

SC: Similarity criterion

References

“Home,” National Research Foundation “Eleftherios K. Venizelos,” 2022. https://www.venizelos-foundation.gr/en/
Mackridge P. Lia Brad Chisacof (ed.), Ρήγας. Ανέκδοτα κείμενα. Athens: Panepistimiakes Ekdoseis Kyprou and Ekdoseis Gutenberg, 2011. Pp. 364. Byzantine Mod Greek Stud. 2016;37(2):295–7. https://doi.org/10.1017/S030701310000687X.
Panagopoulos M, Papaodysseus C, Rousopoulos P, Dafi D, Tracy S. Automatic writer identification of ancient greek inscriptions. IEEE Trans Pattern Anal Mach Intell. 2009;31(8):1404–14. https://doi.org/10.1109/TPAMI.2008.201.
Arabadjis Dd, Giannopoulos F, Panagopoulos M, Exarchos M, Blackwell C, Papaodysseus C. A general methodology for identifying the writer of codices. application to the celebrated ‘twins.’ J Cult Herit. 2019;39:186–201. https://doi.org/10.1016/j.culher.2019.04.002.
Schomaker L, Bulacu M. Automatic writer identification using connected-component contours and edge-based features of uppercase Western script. IEEE Trans Pattern Anal Mach Intell. 2004;26(6):787–98. https://doi.org/10.1109/TPAMI.2004.18.
Hannad Y, Siddiqi I, El Kettani MEY. Writer identification using texture descriptors of handwritten fragments. Expert Syst Appl. 2016;47:14–22. https://doi.org/10.1016/j.eswa.2015.11.002.
Zhang X, Nagy G. Computational method for calligraphic style representation and classification. J Electron Imaging. 2015;24(5):053003. https://doi.org/10.1117/1.JEI.24.5.053003.
Newell AJ, Griffin LD. Writer identification using oriented basic image features and the Delta encoding. Pattern Recognit. 2014;47(6):2255–65. https://doi.org/10.1016/j.patcog.2013.11.029.
Bensefia A, Paquet T. Writer verification based on a single handwriting word samples. EURASIP J Image Video Process. 2016;2016(1):34. https://doi.org/10.1186/s13640-016-0139-0.
He S, Samara P, Burgers J, Schomaker L. Historical manuscript dating based on temporal pattern codebook. Comput Vis Image Underst. 2016;152:167–75. https://doi.org/10.1016/j.cviu.2016.08.008.
Sarlin P. Self-organizing time map: an abstraction of temporal multivariate patterns. Neurocomputing. 2013;99:496–508. https://doi.org/10.1016/j.neucom.2012.07.011.
He S, Schomaker L. Writer identification using curvature-free features. Pattern Recognit. 2017;63:451–64. https://doi.org/10.1016/j.patcog.2016.09.044.
Faigenbaum-Golovin S, et al. Algorithmic handwriting analysis of Judah’s military correspondence sheds light on composition of biblical texts. Proc Natl Acad Sci. 2016;113(17):4664–9. https://doi.org/10.1073/pnas.1522200113.
Diem M, Kleber F, Fiel S, Sablatnig R. Semi-automated document image clustering and retrieval. Document Recognit Retr. 2014;9021:206–15. https://doi.org/10.1117/12.2043010.
Shweka R, Choueka Y, Wolf L, Dershowitz N. Automatic extraction of catalog data from digital images of historical manuscripts. Lit Linguist Comput. 2013;28(2):315–30. https://doi.org/10.1093/llc/fqt007.
Wolf L, et al. Identifying join candidates in the cairo genizah. Int J Comput Vis. 2011;94(1):118–35. https://doi.org/10.1007/s11263-010-0389-8.
He S, Wiering M, Schomaker L. Junction detection in handwritten documents and its application to writer identification. Pattern Recognit. 2015;48(12):4036–48. https://doi.org/10.1016/j.patcog.2015.05.022.
De Stefano C, Maniaci M, Fontanella F, Scotto di Freca A. Reliable writer identification in medieval manuscripts through page layout features: the ‘Avila’ Bible case. Eng Appl Artif Intell. 2018;72:99–110. https://doi.org/10.1016/j.engappai.2018.03.023.
Cilia ND, De Stefano C, Fontanella F, Marrocco C, Molinara M, Scotto di Freca A. An experimental comparison between deep learning and classical machine learning approaches for writer identification in medieval documents. J Imaging. 2020;6(9):89. https://doi.org/10.3390/jimaging6090089.
Zhang P. RSTC: a new residual swin transformer for offline word-level writer identification. IEEE Access. 2022;10:57452–60. https://doi.org/10.1109/ACCESS.2022.3178597.
Abbas F, Gattal A, Djeddi C, Siddiqi I, Bensefia A, Saoudi K. Texture feature column scheme for single- and multi-script writer identification. IET Biom. 2021;10(2):179–93. https://doi.org/10.1049/bme2.12010.
He S, Schomaker L. GR-RNN: global-context residual recurrent neural networks for writer identification. Pattern Recognit. 2021;117:107975. https://doi.org/10.1016/j.patcog.2021.107975.
Semma A, Hannad Y, Siddiqi I, Djeddi C, El Youssfi El Kettani M. Writer identification using deep learning with FAST keypoints and Harris corner detector. Expert Syst Appl. 2021;184:115473. https://doi.org/10.1016/j.eswa.2021.115473.
Diamantatos P, Kavallieratou E, Gritzalis S. Directional hinge features for writer identification: the importance of the skeleton and the effects of character size and pixel intensity. SN Comput Sci. 2021;3(1):56. https://doi.org/10.1007/s42979-021-00950-9.
Javidi M, Jampour M. A deep learning framework for text-independent writer identification. Eng Appl Artif Intell. 2020;95:103912. https://doi.org/10.1016/j.engappai.2020.103912.
He S, Schomaker L. FragNet: writer identification using deep fragment networks. IEEE Trans Inf Forensics Secur. 2020;15:3013–22. https://doi.org/10.1109/TIFS.2020.2981236.
Bulacu M, Schomaker L. Text-independent writer identification and verification using textural and allographic features. IEEE Trans Pattern Anal Mach Intell. 2007;29(4):701–17. https://doi.org/10.1109/TPAMI.2007.1009.
Bensefia A, Djeddi C. Relevance of grapheme’s shape complexity in writer verification task. In: 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI); 2020. https://doi.org/10.1109/IRI49571.2020.00016.
Chahi A, El Merabet Y, Ruichek Y, Touahni R. WriterINet: a multi-path deep CNN for offline text-independent writer identification. Int J Doc Anal Recognit IJDAR. 2022. https://doi.org/10.1007/s10032-022-00418-3.
Cilia ND, De Stefano C, Fontanella F, Marrocco C, Molinara M, Scotto Di Freca A. An end-to-end deep learning system for medieval writer identification. Pattern Recognit Lett. 2020;129:137–43. https://doi.org/10.1016/j.patrec.2019.11.025.
Chen Z, Yu H-X, Wu A, Zheng W-S. Letter-level online writer identification. Int J Comput Vis. 2021;129(5):1394–409. https://doi.org/10.1007/s11263-020-01414-y.
Adak C, Chaudhuri BB, Lin C-T, Blumenstein M. Intra-variable handwriting inspection reinforced with idiosyncrasy analysis. IEEE Trans Inf Forensics Secur. 2020;15:3567–79. https://doi.org/10.1109/TIFS.2020.2991833.
Lai S, Zhu Y, Jin L. Encoding pathlet and SIFT features with bagged VLAD for historical writer identification. IEEE Trans Inf Forensics Secur. 2020;15:3553–66. https://doi.org/10.1109/TIFS.2020.2991880.
Vásquez JL, Ravelo-García AG, Alonso JB, Dutta MK, Travieso CM. Writer identification approach by holistic graphometric features using off-line handwritten words. Neural Comput Appl. 2020;32(20):15733–46. https://doi.org/10.1007/s00521-018-3461-x.
Khosroshahi SNM, Razavi SN, Sangar AB, Majidzadeh K. Deep neural networks-based offline writer identification using heterogeneous handwriting data: an evaluation via a novel standard dataset. J Ambient Intell Humaniz Comput. 2022;13(5):2685–704. https://doi.org/10.1007/s12652-021-03253-2.
Bar-Yosef I, Beckman I, Kedem K, Dinstein I. Binarization, character extraction, and writer identification of historical Hebrew calligraphy documents. Int J Doc Anal Recognit IJDAR. 2007;9(2):89–99. https://doi.org/10.1007/s10032-007-0041-5.
Nomura S, Yamanaka K, Katai O, Kawakami H, Shiose T. A novel adaptive morphological approach for degraded character image segmentation. Pattern Recognit. 2005;38(11):1961–75. https://doi.org/10.1016/j.patcog.2005.01.026.
Papaodysseus C, Exarhos M, Panagopoulos M, Rousopoulos P, Triantafillou C, Panagopoulos T. Image and pattern analysis of 1650 B.C. wall paintings and reconstruction. IEEE Trans Syst Man Cybern Part A Syst Hum. 2008;38(4):4. https://doi.org/10.1109/TSMCA.2008.923078.

Acknowledgements

Not applicable.

Funding

Not applicable.

Author information

Authors and Affiliations

School of Electrical & Computer Engineering, National Technical University of Athens, Iroon Polytechniou 9, Zografou, 15780, Athens, Greece
Athanasios Rafail Mamatsis, Eirini Mamatsi, Constantinos Chalatsis, Pandora Kampouri & Constantin Papaodysseus
School of Engineering, University of West Attica, Petrou Ralli & Thivon 250 Egaleo, 12241, Athens, Greece
Dimitris Arabadjis

Authors

Athanasios Rafail Mamatsis
Eirini Mamatsi
Constantinos Chalatsis
Dimitris Arabadjis
Pandora Kampouri
Constantin Papaodysseus

Contributions

All authors read and approved the final manuscript.

Corresponding author

Correspondence to Constantin Papaodysseus.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Mamatsis, A.R., Mamatsi, E., Chalatsis, C. et al. A novel methodology for writer (hand) identification: establishing Rigas Feraios wrote two important Greek documents discovered in Romania. Herit Sci 11, 38 (2023). https://doi.org/10.1186/s40494-023-00873-z

Received: 14 September 2022
Accepted: 27 January 2023
Published: 23 February 2023
DOI: https://doi.org/10.1186/s40494-023-00873-z