
An intelligent character segmentation system coupled with deep learning based recognition for the digitization of ancient Tamil palm leaf manuscripts

Abstract

Palm-leaf manuscripts, rich with ancient knowledge in areas such as history, art, and medicine, are vital cultural treasures, making their digitization essential for preserving this heritage. Digitization of these organic and fragile manuscripts is required to safeguard the ancient data they hold, which in turn requires optimal character segmentation and recognition algorithms. Only a limited number of studies on Tamil character recognition exist in the literature. Handling row-overlapped characters, noise introduced by lighting issues and dirt, removal of punch holes, auto-cropping of the written content, and filtering out noisy or improper segmentations are the essential concerns addressed in the proposed work. The work is executed as a four-step process: (1) Palm Leaf Manuscript Acquisition, (2) Pre-Processing, (3) Segmentation of Tamil Characters, and (4) Tamil Character Recognition. During acquisition, scanners are used to record palm leaf manuscripts from the Tamil Nadu Oriented Manuscript Library. In the pre-processing step, the Fast Non-Local Means (Fast-NLM) method, paired with median filtering, is used for denoising the scanned image. The pixels that make up the characters and borders (i.e., the foreground) are then identified using Sauvola thresholding. The proposed methodology introduces efficient techniques to remove punch-hole impressions from the pre-processed image and to crop the written content away from the edges. After pre-processing, the segmentation of Tamil characters is performed as a three-step process, (a) manuscript, (b) line, and (c) character segmentation, which addresses conjoined lines and partially or completely empty segmentations not previously handled by existing techniques. This work introduces an Augmented HPP line-splitting algorithm that accurately segments written lines, handling wrong-segmentation cases that existing techniques do not consider. The system achieves an average segmentation accuracy of 98.25%, which far outperforms existing techniques. It also proposes a novel punch hole removal algorithm that can locate and remove the punch-hole impressions in the manuscript image. This algorithm, along with the automated content cropping technique, increases recognition accuracy and eliminates the manual labor otherwise needed. These features make the proposed methodology highly suitable for real-time archaeological and historical research involving manuscripts. All 247 letters and 12 numeric digits are analyzed and separated into 125 distinct writable characters. In our work, characters are segmented and used for recognition of all 247 letters and 12 digits in Tamil using a multi-class CNN with 125 classes, which drastically reduces the complexity of the neural network compared to having 259 output nodes. The model achieved a notable accuracy of 96.04%. Compared with existing Tamil and other character recognition systems, this work is effective in that it considers real-time images and a larger number of characters.

Introduction

Palm leaf manuscripts are vital carriers of ancient knowledge and cultural heritage, especially significant in regions like Tamil Nadu, India [1]. These documents contain extensive information on subjects including history, art, medicine, and literature. Their preservation and digitization are essential to protect and pass down this invaluable legacy to future generations.

From the analysis of ancient manuscripts, people can learn about the culture, lifestyle, medicine, and administrative procedures of the past [2, 3]. In ancient times, sages and wise men in several parts of South East Asia inscribed their knowledge onto dried and processed palm leaves in several languages, and these leaves become fragile over time. Due to the deterioration of these ancient manuscript documents, their digitization is essential [4]. Character recognition from handwritten manuscripts is language-dependent [5]. The lack of digitized databases for Tamil-language palm leaf manuscripts is a major concern addressed in this work. Most handwritten manuscripts are not properly preserved and become damaged over time [6]. Figure 1 shows a sample of a handwritten palm leaf manuscript, highlighting the issue of non-straight written lines, which complicates line segmentation by making it difficult to find points at which to split the lines with a straight cut. The innate texture of the manuscript and the thin lines of plant fibre also pose great difficulty during thresholding.

Fig. 1
figure 1

Sample palm leaf handwritten manuscript

The challenges faced in processing palm-leaf manuscripts for digitization and character recognition are multifaceted. Firstly, image pre-processing is crucial to enhance the readability of characters, as the manuscripts often exhibit poor contrast and clarity due to their age and condition. Feature extraction is another essential step, aiming to accurately capture the unique characteristics of the characters from these ancient scripts. Additionally, advanced learning techniques are necessary to recognize the correct characters, especially given the handwritten nature of the manuscripts.

One of the primary difficulties is that character recognition from handwritten manuscripts is language-dependent, making it necessary to tailor the recognition process to specific languages. In the case of palm-leaf manuscripts, character segmentation is particularly challenging because the text rows are often not horizontally aligned. They may gradually bend upwards or downwards, varying from one manuscript to another, which complicates the segmentation process. Furthermore, many characters are in contact with adjacent characters within the same row or with characters in neighboring rows. Even in areas where the characters do not touch, the convex hulls of entire rows often overlap, making it difficult to find precise points for segmentation. Once segmentation is done, it is important to filter out the wrong segments that previous works have not handled. These include conjoined segments (segments containing more than one line), partially empty segments (where a line is present in one half of the segment while the other half is devoid of characters or contains only unavoidable noise), and completely empty segments (segments with no characters at all, composed entirely of such noise). Figures 2 and 3 give clear examples of how these look. The proposed work handles these cases using a novel Augmented HPP line segmentation technique, which significantly improves accuracy.

Fig. 2
figure 2

Segmented Conjoined (Double) lines as parts of other lines

Fig. 3
figure 3

Partly or Completely Empty segmented lines

The natural texture of palm leaves, which includes fibrous lines, also poses a challenge in distinguishing the foreground text from the background. Additionally, the manuscripts are frequently damaged by dirt, moisture, and general wear and tear, further complicating the digitization process. The presence of punch hole impressions, used traditionally for binding the manuscripts, can interfere with character recognition if not appropriately addressed, yet manually removing these artifacts is highly time-consuming.

Moreover, inconsistencies in lighting during image capture can make it challenging to achieve uniform thresholding, which is essential for isolating text from the background. Finally, manually cropping the contents of these manuscripts is impractical, especially when processing them in real-time, adding another layer of complexity to the digitization effort.

The Tamil language, which is the focus of this study, has a total of 247 letters and 12 numeric digits. Training a Convolutional Neural Network (CNN) to classify images into such a large number of classes typically increases the complexity of the model and requires an enormous amount of training data. Additionally, overfitting is a concern if there are not enough samples for each class, and it is unlikely that enough samples can be found for every class in Tamil manuscripts, as certain characters are used infrequently. To tackle this issue, the proposed system identifies a way to recognize all 259 characters (247 letters and 12 numeric digits) using just 125 classes in the CNN.

Tamil language has 12 vowels (Uyireluttu), 18 consonants (Meyyeluttu), one special character, (figure a) (aytha eluttu), and 216 (12 × 18) compound letters which are combinations of vowels and consonants. The consonants are usually written with a dot over them, named puḷḷi. However, the dots are not included in most manuscripts and are read correctly based on context. This makes the consonants and their compounds with the letter (figure b) look exactly the same. For the compounds with the vowel (figure c), only the extra character (figure d) is written next to them. Each compound of the consonants with the vowels (figure e) has its own distinct written form, while that of (figure f) has the symbol (figure g) preceding the letters. In manuscripts, the rows of (figure h) and (figure i) are both written with the (figure j) symbol in front, even though in modern script the (figure k) row is preceded by the (figure l) symbol. Similarly, the rows of (figure m) are written with the consonants sandwiched between the (figure n) and (figure o) symbols, though in the modern script the (figure p) row is written between (figure q) and (figure r). The final (figure s) row is written as consonants sandwiched between (figure t). In addition to this, there are 12 numeric digits, (figure u) (representing 1 to 10 respectively), along with (figure v). The Tamil script was modified in 1978 to standardize all characters. Before that, certain letters did not follow the rules mentioned above and had their own ways of being written. Table 1 depicts the 13 old characters and the new ways to write them after the simplification of the Tamil script, including the 7 new characters (or ways to write the characters) that can be found in all Tamil manuscripts. Since the manuscripts were written before this change, the old characters must also be taken as separate classes.

Table 1 Characters in Tamil script before and after simplification in 1978

Hence,

$$Total \; number \; of \; distinct \; writable \; characters =125$$

The proposed method therefore trains the network with these 125 character classes, with which all the characters written in the manuscripts are recognized, significantly reducing the complexity of the network.

To preserve and digitize these manuscripts, automated systems are needed. The aim is to develop an automated system that can tackle the aforementioned obstacles and accurately recognize and transcribe these characters inscribed on palm leaves, safeguarding ancient knowledge and making it more accessible for research and analysis.

To attain the objective, a survey is conducted to identify the already existing automated systems. In addition to this, the benefits and limitations of already existing character recognition systems are also identified in the related work section.

Related works

Recently, palm-leaf character recognition has been performed in languages such as Sanskrit [7], the Grantha script [8], Malayalam [9], Tamil [10], the Balinese, Sundanese, and Khmer scripts [11], and Telugu [12]. For handwritten character recognition, Sánchez-DelaCruz et al. conducted a survey and discussed various machine learning algorithms and their challenges [13]. A semi-supervised model for Indian script character recognition was developed by Jindal et al. [14]. However, the performance of deep neural networks on ancient textual restoration is comparatively much higher due to their sensitive reconstruction ability on texts [15], and deep learning offers end-to-end offline handwritten text detection [16].

In the field of character segmentation from manuscripts, various methodologies have been explored. In their work, Surinta et al. [17] utilize horizontal and vertical projection profiles to identify split points, achieving 82.5% accuracy. However, this method struggles with overlapping convex hulls of lines. Chamchong et al. enhance the projection profile technique by considering additional parameters like the minimal histogram value of a line and average lines per manuscript, slightly improving accuracy to 82.57% [18]. Nevertheless, real-time application is impractical due to the difficulty in determining these parameters dynamically. Sathik et al. adopt a text line splicing technique, treating empty vertical spaces between letters as zones with or without obstacles. This method, which cuts obstacles at a fixed length, achieves a significantly higher accuracy of 95.84%, effectively handling overlapping characters [19].

Sabeenian et al. used a CNN for character recognition and achieved 96.21% accuracy; however, only 15 characters from scanned palm leaves were used for recognition [20]. Kumar et al. used a CNN for Tamil character recognition and achieved 91% accuracy [21]. Athisayamani et al. developed B-spline-based Tamil vowel character recognition from palm-leaf manuscripts. This class-wise recognition of Tamil vowel characters achieved accuracies of 92% for (figure w), 78% for (figure x), 91% for (figure y), 97% for (figure z), 67% for (figure aa), and 74% for (figure ab). The accuracy of the B-spline approach is limited by the difficulty of recognizing the connected and overlapped characters that exist in the manuscripts [22].

Panyam et al. modelled a character recognition system for palm leaves using a two-level transform-based method, i.e., 2D-DCT + 2D-DWT. Recognition is performed on three different projection planes: the recognition accuracy is 46.4% for the XY plane, 96.4% for the YZ plane, and 85.7% for the XZ plane. However, this work is limited to 1232 training images and 308 testing images, and the XY plane’s accuracy is insufficient for the development of automated systems [23].

Sudarsan et al. proposed a novel feature extraction and character recognition method for the Malayalam language. For feature extraction, the authors combined a log-Gabor filter with LBP, and for classification, ResNet with LSTM was used. It achieved a performance of 95.57%, and the system works effectively for uneven background colors [24].

Lakshmi et al. analysed Telugu characters based on 3D features obtained from the XY, YZ, and ZX coordinates. Character recognition was performed using the nearest neighbor classifier; the recognition accuracy is 55% for the XY plane, 96% for the YZ plane, and 87% for the XZ plane. However, this work is limited to 3696 training images and 924 testing images. Moreover, only 28 characters were used for recognition, which is impractical [25]. In the same manner, Sastry et al. extracted 3D features from Telugu characters with the Radon transform and used the nearest neighbor classifier for recognition, which also offered comparatively low accuracies of 89% for Test Set-1 and 93% for Test Set-2 [26]. In many works, 3D features are considered for character recognition, yet in a majority of them only features from certain directions can be used to optimally recognize characters.

Other recent advancements in Telugu palm leaf manuscript recognition by Lakshmi et al. have leveraged 3D profiling and advanced feature selection techniques to improve accuracy. Using contact-type profilers addressed the challenges of traditional 2D scanning, achieving 83.2% accuracy in the YZ plane by capturing depth features and eliminating background noise [27]. Further enhancement using depth information (Z-axis) and Differential Evolution achieved a recognition accuracy of 94.68%, effectively handling distortions and stains [28]. Analyzing depth information across multiple planes (XY, YZ, XZ) and optimizing feature dimensionality with Differential Evolution and PSO resulted in 86.9% accuracy using a subset of 85 features [29]. Feature optimization techniques like PSO and DE have shown that reducing feature sets can significantly improve recognition accuracy, with DE achieving up to 89.1% accuracy using an 85-feature subset [30].

Owing to the effective performance of deep learning layers, a modified version of the multi-class CNN proposed by Jyothi et al. for palm leaf character recognition was evaluated using the MNIST and Grantha datasets, achieving recognition efficiencies of 99.79% for MNIST and 96.3% for the Grantha dataset [31]. Similarly, CNNs and their variants have been applied to recognize characters in other languages such as Malayalam, Thai Noi, Tulu, and Sundanese. The CNNs used for Malayalam character recognition achieved accuracies of 86% [32] and 96% [33], while the stacked ResNet with LSTM achieved a higher recognition power of 95.57%. Inception-v3 detected 26 Thai Noi characters with an efficiency of 76.50% [34], and MobileNetV2 achieved an efficacy of 90% for the same characters [35]. CNNs obtained an accuracy of 79.92% for the Tulu language [36] and 73% for the Sundanese language [37].

All existing Tamil language-based character recognition systems consider only a limited number of characters. Challenges such as limited dataset availability, accuracy of existing systems, processing speed, variations in writing style, and document dirt need to be addressed. The architecture of the proposed ancient Tamil palm leaf manuscripts recognition system and its processes is discussed in the next section.

Architecture of ancient Tamil palm leaf manuscripts recognition system

The framework used for the Ancient Tamil Palm Leaf Manuscripts Recognition System is depicted in Fig. 4. It consists of four steps, namely (1) Tamil Palm Leaf Data Acquisition, (2) Tamil Palm Leaf Manuscript Pre-processing, (3) Segmentation of Tamil Characters, and (4) Multi-class CNN-based Tamil Character Recognition. Each of these steps is detailed below.

Fig. 4
figure 4

Framework of proposed Tamil Palm Leaf Manuscripts Recognition System

Data acquisition

Tamil palm leaf manuscript data was collected from the Tamil Nadu Oriented Manuscript Library with the help of flatbed scanners. In particular, the “Agasthiyar Vaithiya Kaviyam 1500” manuscript dataset, which contains 1500 passages written across 502 manuscript pages, was collected. In addition, manuscript images of Ramayanam (213 leaves), Thruvilayadal (183 leaves), and other medical books (180 and 66 leaves) obtained from the Tamil Digital Library are also used in our work to create a combined dataset covering different handwriting and spacing styles. The images of manuscripts added to the dataset vary in dimensionality, but all have a greater width than height, matching the shape of the manuscript. Even manuscript images from the same book do not share the exact dimensions and exhibit minor variations. For example, manuscripts from the first book (Agasthiyar Vaithiya Kaviyam 1500) have a dimension of roughly \(4200\times 400\) pixels (\(width\times height\)) on average. Most other images range between \(5000\times 450\) and \(6000\times 600\) pixels, with some of the smallest measuring only \(3100\times 400\) pixels. The primary source for the dataset (Vaithiya Kaviyam) maintains a consistent resolution of 300 dpi, while the images obtained from the Tamil Digital Library have a resolution of 600 dpi. The images also differ in the number of characters they contain, which varies with each manuscript book and scriber. For example, the Agasthiyar Vaithiya Kaviyam manuscripts contain roughly 600 characters each on average, whereas the fewest characters in a single manuscript were found in one of the medicinal books, with roughly 100 characters per manuscript on average. Figure 5 shows a few sample images from the dataset that were taken for this study.

Fig. 5
figure 5

Samples of various manuscripts collected

Pre-processing of ancient Tamil palm leaf manuscripts

Pre-processing of Ancient Tamil Palm Leaf Manuscripts is a vital module that improves the accuracy of the whole model. It is required for noise removal and also for extracting foreground characters in the manuscripts.

Fast non-local means denoising (Fast-NLMD)

Initially, the manuscript image is blurred using a Gaussian blur filter with a kernel size of \(5\times 5\). This kernel size removes the thin fibre lines that run horizontally along the manuscripts while maintaining a reasonable resolution of the text. After this, Fast-NLMD is applied over the image to remove the noise and redundancy present in the palm leaf images. Equation (1) generates the NLM-filtered image pixel values \({Fast}_{NLM}\left(A\left(m\right)\right)\), which are computed as the weighted average of the remaining pixels in the image. Here, \(0\le w\left({N}_{m}, {N}_{n}\right)\le 1\) and \(\sum_{n\in {N}_{m}}w\left({N}_{m}, {N}_{n}\right)=1\), where \({N}_{m}\) is the neighborhood of the mth pixel and \(w\left({N}_{m}, {N}_{n}\right)\) is the weight computed by evaluating the resemblance between two patches. In this work, the search window size is set to \(21\times 21\) and the patch size to \(7\times 7\). After these initial settings, Eq. (2) computes the weighted Euclidean distance between two patches \({N}_{m}\) and \({N}_{n}\), where \({N}_{i}\) signifies the square-shaped kernel with centre pixel ‘i’ used by the NLM method. The weight between two patches \({N}_{m}\) and \({N}_{n}\) computed by Fast-NLM is given in Eq. (3), in which ‘P’ is the local patch size, \(f\left({N}_{m}+P\right)-f({N}_{n}-P)\) is the intensity difference between pixels, and Y is the normalization constant defined in Eq. (4). In Eq. (3), the image is vectorized into one dimension. The function \({Z}_{i}\left(p\right)\) is evaluated using Eq. (5) and contains a set of kernels ‘λ’ and ‘p’, expressed as \({N}_{n}-{N}_{m}\) and \({N}_{m}+\uplambda \) respectively. The Fast-NLM algorithm has a time complexity of \(O({2}^{dimension})\), which reduces the running time while improving the noise reduction significantly. The processed image is converted to grayscale using the luminosity method given in Eq. (6).

$${Fast}_{NLM}\left(A\left(m\right)\right)=\sum_{n\in {N}_{m}}Fast\_w\left({N}_{m}, {N}_{n}\right)A(n)$$
(1)
$$w\left({N}_{m}, {N}_{n}\right)=\frac{1}{Y\left(m\right)}{e}^{-\frac{{||A\left({N}_{m}\right)-A({N}_{n})||}_{2,sd}^{2}}{{d}^{2}}}$$
(2)
$$Fast\_w\left({N}_{m}, {N}_{n}\right)=\frac{1}{Y\left(m\right)}{Z}_{i}(f\left({N}_{m}+P\right)-f({N}_{n}-P))$$
(3)
$$Y\left(m\right)=\sum_{n}{e}^{-\frac{{||A\left({N}_{m}\right)-A({N}_{n})||}_{2,sd}^{2}}{{d}^{2}}}$$
(4)
$${Z}_{i}\left(p\right)=\sum_{\tau =0}^{p}{e}^{-\frac{{||f\left(\tau \right)-f(\tau -\lambda )||}_{2}^{2}}{{d}^{2}}}$$
(5)
$${Gray}_{luminosity}=0.21\text{ R }+ 0.72\text{ G }+ 0.07\text{ B}$$
(6)
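
As an illustration, the denoising chain described above can be sketched in Python with OpenCV; this is a minimal sketch rather than the authors’ implementation, and the filter strength h = 10 and the input path are assumed values, while the \(7\times 7\) patch and \(21\times 21\) search window match the settings stated above.

import cv2

def denoise(path="manuscript.png"):            # hypothetical input path
    img = cv2.imread(path)
    # 5x5 Gaussian blur suppresses the thin fibre lines (sigma auto-derived)
    blurred = cv2.GaussianBlur(img, (5, 5), 0)
    # Fast non-local means: 7x7 patches, 21x21 search window, as stated above
    den = cv2.fastNlMeansDenoisingColored(blurred, None, h=10, hColor=10,
                                          templateWindowSize=7,
                                          searchWindowSize=21)
    # Eq. (6): luminosity grayscale, 0.21 R + 0.72 G + 0.07 B (OpenCV is BGR)
    b, g, r = cv2.split(den.astype("float32"))
    return (0.21 * r + 0.72 * g + 0.07 * b).astype("uint8")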

Median filtering

The grayscale palm leaf images are still affected by salt-and-pepper noise. To remove it, median filtering with a \(3\times 3\) kernel is used in this work. The median value of the neighboring pixels replaces the centre pixel value; for an even number of neighbors this is depicted in Eq. (7), where S is the sorted array of neighbor values and NE represents the number of neighbors of the pixel. Median filtering also removes any remaining thin lines that run along the manuscripts.

$${Median}_{neighbors}=\frac{S\left[\frac{NE}{2}\right]+S\left[\frac{NE}{2}+1\right]}{2}$$
(7)
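
Continuing the sketch above, this step reduces to a single OpenCV call on the grayscale output of the previous stage:

import cv2

def remove_salt_pepper(gray_img):
    # 3x3 median filter: each pixel is replaced by its neighborhood median
    return cv2.medianBlur(gray_img, 3)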

Sauvola thresholding

Sauvola is an adaptive thresholding method that produces commendable results when employed on palm leaves. Hence, it is used for identifying the maximum dense shape of the letters that exist in the palm leaf manuscripts. It highlights the inner letters of the manuscripts and always returns a binarized image, and it works well even if the manuscripts are captured under poor illumination. The Sauvola threshold \(ST\left(a,b\right)\) for each pixel position is computed over a complete window using Eq. (8), in which ‘m’ is the mean, \(\sigma \) is the standard deviation, the parameter x takes positive values near 0.5, and R is the maximum value of \(\sigma \), i.e., 128 for a grayscale document. Here, the window size highlights the character region of the palm leaf manuscripts.

$$ST\left(a,b\right)=m\left(a,b\right)*\left[1+x\left(\frac{\sigma (a,b)}{R}-1\right)\right]$$
(8)
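
A hedged sketch of this binarization using scikit-image is given below; the window size of 25 is an assumed value, while r = 128 and k = 0.5 follow the parameter descriptions above (k plays the role of x).

from skimage.filters import threshold_sauvola

def binarize(gray_img, window_size=25, k=0.5, r=128):
    # Eq. (8): local threshold from the window mean and standard deviation
    t = threshold_sauvola(gray_img, window_size=window_size, k=k, r=r)
    # Pixels above the local threshold become white background (255),
    # text pixels fall below it and become black (0)
    return (gray_img > t).astype("uint8") * 255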

The results of various existing thresholding techniques on the manuscript data have been depicted in Fig. 6. It is clear that Sauvola produces better foreground vs background separation with minimal noise when compared to other techniques.

Fig. 6
figure 6

Comparison of various thresholding techniques applied on the manuscripts

Contour based image cropping

The binarized images obtained from Sauvola thresholding are considered for cropping because the edges and borders of the palm leaf manuscripts must be removed. These edges are sometimes battered and jagged, rendering traditional document edge detection algorithms useless. Thus, it is necessary to crop the contents of the manuscripts.

For generating the optimal cropping, contour detection is used to identify the outermost object, which is the document’s edge. Each contour is described by two points \((x,y)\) and \((x+w,y+h)\), which form the top-left and bottom-right vertices of the rectangle that encloses the contour. In this convention, the x-coordinate represents the row number and the y-coordinate represents the column number; hence, \(w\) and \(h\) are the height and width of the contour respectively, measured in pixels. After marking the contours of the edge, the contours of characters within this outermost edge are detected and marked individually. They are then iterated from left to right and top to bottom to find the smallest \(x\) and \(y\) values and the largest \((x+w)\) and \((y+h)\) values (which might not come from the same character’s contour).

The characters with the smallest \(x\) and \(y\) values are the ones closest to the top-left edge of the border. Similarly, the character with the largest \((x+w)\) and \((y+h)\) values is the one closest to the bottom-right edge. After identifying them, the system can effectively find the limits within which characters lie in the image. Once these limits are identified, cropping is executed leaving a padding of 5 pixels.
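
A minimal sketch of this cropping follows; note that OpenCV’s boundingRect uses x for columns and y for rows (the transpose of the convention above), and the largest contour is assumed here to be the document edge, which is dropped before taking the extremes.

import cv2

def crop_content(binary, pad=5):
    # Detect blobs on the inverted image (text is black in the binary image)
    contours, _ = cv2.findContours(255 - binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Drop the outermost contour (assumed document edge), keep character boxes
    boxes = [cv2.boundingRect(c)
             for c in sorted(contours, key=cv2.contourArea, reverse=True)[1:]]
    x0 = min(x for x, _, _, _ in boxes)          # closest to the left edge
    y0 = min(y for _, y, _, _ in boxes)          # closest to the top edge
    x1 = max(x + w for x, _, w, _ in boxes)      # closest to the right edge
    y1 = max(y + h for _, y, _, h in boxes)      # closest to the bottom edge
    rows, cols = binary.shape
    return binary[max(y0 - pad, 0):min(y1 + pad, rows),
                  max(x0 - pad, 0):min(x1 + pad, cols)]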

Punch hole removal

Punch holes form the largest blobs located inside the palm-leaf manuscripts. They may get segmented as characters and, during character recognition, mislead the classifier and reduce the recognition rate. The main aim of the punch hole removal algorithm is to locate the punch holes in the binarized image and produce a punch-hole-removed output using a mask. This algorithm also removes parts of manuscript border edges that might have evaded the cropping algorithm. The steps used to remove the punch holes are stated below.

  • Step 1: Identify the contours of all the components in the binarized image and save them in an array.

  • Step 2: Initialise \(num\_punch\_holes = 2\), \(holes\_found=0\), and two lists named \(holes\_list\) and \(edges\_list\).

  • Step 3: Sort the contours array by area in descending order.

  • Step 4: Iterate through the sorted array starting from the largest contour. For each contour:

    • Step 4.1: Identify the coordinates of the contour.

    • Step 4.2: If the aspect ratio of the bounding rectangle falls within the interval (1 − Ɛ, 1 + Ɛ), where Ɛ is a small real number, and the area of the rectangle is greater than the minimum area (80 units), then append the contour to \(holes\_list\) and increment \(holes\_found\) by 1; else, append the contour to \(edges\_list\).

    • Step 4.3: If \(holes\_found == num\_punch\_holes\), exit the loop.

  • Step 5: Create a mask, i.e., a 2-dimensional array of the same size as the original binarized image, and fill it with black (0).

  • Step 6: Map all the coordinates of \(holes\_list\) and \(edges\_list\) onto the mask and fill the pixels within all these rectangles with white (255).

  • Step 7: Perform a bitwise XOR operation between the binary image and the created mask. This returns the punch-hole- and edge-removed image as output.

During execution, the contours are iterated one by one starting from the largest. The contours that satisfy the conditions for a punch hole (aspect ratio close to 1, and size greater than the minimum area) are appended to the holes_list. In the process, the ones with an aspect ratio ≫ 1 or ≪ 1 (horizontally or vertically elongated contours) that fail the condition are added to the edges_list, because elongated contours belong to parts of the edges that might not have been fully removed during cropping. This process continues until num_punch_holes holes have been identified, which is 2 in most cases. Once done, a mask with 255 inside the selected rectangles and 0 elsewhere is created, and a binary XOR is performed between the mask and the binary image, which flips the pixels of edges and holes to the background value, effectively removing both.
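
A hedged sketch of this procedure follows, assuming black (0) foreground on a white (255) background as produced by the thresholding step; Ɛ = 0.2 and the elongation cut-off of 3 are assumed concrete values for the conditions above.

import cv2
import numpy as np

def remove_punch_holes(binary, num_punch_holes=2, eps=0.2, min_area=80):
    # Detect blobs on the inverted image (holes/characters are black)
    contours, _ = cv2.findContours(255 - binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    contours = sorted(contours, key=cv2.contourArea, reverse=True)   # Step 3
    holes, edges, found = [], [], 0
    for c in contours:                                               # Step 4
        x, y, w, h = cv2.boundingRect(c)                             # Step 4.1
        aspect = w / float(h)
        if 1 - eps < aspect < 1 + eps and w * h > min_area:          # Step 4.2
            holes.append((x, y, w, h))
            found += 1
        elif aspect > 3 or aspect < 1 / 3:      # elongated edge remnants
            edges.append((x, y, w, h))
        if found == num_punch_holes:                                 # Step 4.3
            break
    mask = np.zeros_like(binary)                # Step 5: all-black mask
    for x, y, w, h in holes + edges:            # Step 6: fill rectangles white
        mask[y:y + h, x:x + w] = 255
    # Step 7: XOR flips every pixel inside the rectangles, so the blobs are
    # assumed to fill their bounding boxes closely
    return cv2.bitwise_xor(binary, mask)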

Segmentation of Tamil characters

The segmentation of characters is helpful to recognize language-specific characters. It handles the conflict that occurs due to different styles of handwritten characters. This is a three-step process that involves Manuscript Trisection, line segmentation, and character segmentation.

Manuscript Trisection

Often encountered in historical documents and handwritten manuscripts, the curvature of the text lines presents a significant obstacle to conventional segmentation methods, particularly when lines exhibit a drooping or arching pattern. When contours are used to segment characters and order them by their x and y coordinates, characters in the middle of the first line that sit slightly lower than the letters near its ends may be assigned to the second or third line. This is because the ‘x’ coordinate of a middle character of the first line might be comparable with the ‘x’ coordinates of the end characters of the second line, and so on. To address this issue, we trisect the image: the document image is divided into three equal parts along its width. This division strategy alleviates the adverse effects of text curvature by effectively breaking the document into smaller, more manageable segments.
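
Under the assumption that the split is taken along the width, the trisection reduces to a one-line NumPy sketch:

import numpy as np

def trisect(image):
    # Cut the page into three equal-width vertical strips (axis 1 = columns),
    # so the line curvature within each strip is small
    return np.array_split(image, 3, axis=1)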

Line segmentation

The proposed system uses a novel Augmented HPP line segmentation method to effectively segment the lines while handling wrong segmentations. Even after the trisection, contour-based and convex hull-based segmentation might not be appropriate where two or more characters are enclosed within one contour or convex hull. This is because the text is densely packed, exhibiting minimal spacing between consecutive lines. Thus, several letters might be in contact with the letters that are in previous or successive lines. Sometimes, the text could be horizontally crammed up as well. Hence if contour or convex hull-based techniques are to be used, it is a gargantuan task to properly segment these conjoined characters. Therefore, before handling horizontally conjoined characters, the proposed method initially segments the lines to handle the above-addressed situation.

One of the most prevalent line-splitting techniques is to make use of the Horizontal Projection Profile (HPP) of the image. HPP is a 1D array in which the nth element of the array represents the sum of pixel values of the nth row of the image. If the palm leaf segmented manuscript has a pixel \(f\left(i,j\right)\) in the ith row and the jth column, and the whole image is built of ‘a’ rows and ‘b’ columns, then HPP is computed using Eq. (9).

$$HPP\left[i\right]=\sum_{0<j<b}f\left(i,j\right), \quad where\, 0<i<a$$
(9)

If a specific row has no foreground (black or 0) pixels, then the sum of pixels in that row will be 255·b, where ‘b’ is the number of columns in the binary image. But when the convex hulls of the individual lines overlap, no row has this characteristic white-space sum; each region where a gap is supposedly present has a different total sum. Hence, in those cases, it is necessary to find the maximal regions of the HPP. In certain cases, due to the innate nature of the text, numerous maxima could be present very close to each other near the region of white space between lines. To prevent the occurrence of multiple close short splits, a smoothening step is applied to the HPP before finding the maxima. The HPP array is treated as a noisy 1D signal and Gaussian blur is applied to it. The Gaussian filter, defined in Eq. (10), replaces a group of local maxima that lie very close to each other with a single peak located at the largest maximum of that group. In Eq. (10), ‘a’ is the number of rows, ‘b’ is the number of columns, \(\pi =3.14,\) and \(\sigma \) represents the standard deviation.

$$Gauss\left(a,b\right)=\frac{1}{2\pi {\sigma }^{2}}{e}^{-\frac{({a}^{2}+{b}^{2})}{2{\sigma }^{2}}}$$
(10)

Maxima can be identified where the instantaneous slope of the HPP shifts from positive to negative. The major disadvantage of this method is that it fails to take into consideration the conjoined, partially empty, and completely empty segments that were defined earlier.
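
The smoothed-profile maxima search can be sketched as follows with SciPy; the σ of 5 for the one-dimensional Gaussian is an assumed value.

import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

def split_lines(binary):
    hpp = binary.sum(axis=1).astype(float)     # Eq. (9): row-wise pixel sums
    smooth = gaussian_filter1d(hpp, sigma=5)   # merge clusters of maxima
    peaks, _ = find_peaks(smooth)              # slope changes from + to -
    cuts = [0, *peaks.tolist(), binary.shape[0]]
    return [binary[a:b] for a, b in zip(cuts, cuts[1:])]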

In addition, noise such as damage, tears, tiny holes, and blots of ink can be characterized as letters due to the low sums (i.e., black noise) they produce, and this noise is found together with the real lines. Therefore, the first and last lines of the manuscript might be abnormally tall, with either their lower or upper portion devoid of letters. If such noise is segmented as characters, it leads to incorrect character recognition and disrupts the actual order of the segmented characters.

To tackle this, the concept of a 0/1 ratio is introduced: the ratio of the number of black pixels to the number of white pixels in a segment. Height analysis is carried out on every segment produced by the algorithm, and the abnormally tall ones are separated based on a threshold. These segments are split again, and the 0/1 ratio of the individual subsegments is calculated. The ratio is high for segments with text and low for those with just noise. If the subsegments were conjoined lines, both separately pass the criterion. Partially empty lines result in both true (high-ratio) and empty (low-ratio) subsegments, while completely empty lines result in only empty subsegments. Segments with a ratio less than a defined threshold (empty lines) are removed, and the ones that pass the criterion are ordered and saved; this eliminates segments that contain no characters. The same filtering is also carried out on the segments within the permissible height range, to ensure that no empty lines accidentally creep into the saved folder of individual lines. Finally, the segments are saved in the corresponding one of the three separate directories and used for character segmentation in the next step. Figure 7 depicts the workflow of the whole line segmentation process in detail, including the intermediate results of trisection.
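
A minimal sketch of this height analysis and 0/1-ratio filtering is given below; the ratio threshold of 0.02 and the height multiplier of 1.8 are assumed values, not taken from the experiments.

import numpy as np

def zero_one_ratio(segment):
    black = np.count_nonzero(segment == 0)       # foreground (text) pixels
    white = np.count_nonzero(segment == 255)     # background pixels
    return black / max(white, 1)

def filter_segments(segments, ratio_thresh=0.02, tall_factor=1.8):
    median_h = np.median([s.shape[0] for s in segments])
    kept = []
    for seg in segments:
        if seg.shape[0] > tall_factor * median_h:      # abnormally tall
            halves = np.array_split(seg, 2, axis=0)    # split again
            kept += [h for h in halves if zero_one_ratio(h) > ratio_thresh]
        elif zero_one_ratio(seg) > ratio_thresh:       # drop empty segments
            kept.append(seg)
    return kept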

Fig. 7
figure 7

The Augmented HPP line segmentation Algorithm

Character segmentation

After line segmentation, the segments are iteratively fed into the character segmentation module in the same order as they appear in the original palm leaf manuscript. Each output of line segmentation contains a single row of letters, and a contour-based character detection technique is used here. Contours spanning characters from consecutive lines are avoided because the lines are already effectively segmented, but certain letters within a line are in contact with the next letters. To handle this issue, the bounding box method is coupled with non-maximum suppression. By adjusting the overlapping threshold, the algorithm minimizes the number of double-segmented characters. The steps used for character segmentation are stated below, followed by a sketch of the suppression step.

  • Step 1: Read the Line image as input.

  • Step 2: Detect the contours in the binary image, and small noise is filtered out based on the contour area.

  • Step 3: Bounding rectangles are extracted from the filtered contours, and non-maximum suppression is applied to merge overlapping rectangles.

    • Step 3.1: Consider the list of rectangles.

    • Step 3.2: Compute the area of rectangles.

    • Step 3.3: Sort the rectangles based on their x-coordinate.

    • Step 3.4: Iterate through the sorted rectangles and compare them to find overlapping regions.

    • Step 3.5: Rectangles with significant overlap are removed from consideration, ensuring that only non-overlapping rectangles are retained. Finally, the function returns the selected rectangles after non-maximum suppression.

  • Step 4: Segmented characters are saved as individual images in a directory.
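
The suppression in Step 3 can be sketched as follows; the 0.5 overlap threshold is an assumed value standing in for the manually adjusted threshold mentioned above.

def non_max_suppression(rects, overlap_thresh=0.5):
    rects = sorted(rects, key=lambda r: r[0])        # Step 3.3: sort by x
    kept = []
    for x, y, w, h in rects:                         # Step 3.4: compare pairs
        suppressed = False
        for kx, ky, kw, kh in kept:
            ix = max(0, min(x + w, kx + kw) - max(x, kx))   # overlap width
            iy = max(0, min(y + h, ky + kh) - max(y, ky))   # overlap height
            if ix * iy > overlap_thresh * min(w * h, kw * kh):
                suppressed = True                    # Step 3.5: drop overlap
                break
        if not suppressed:
            kept.append((x, y, w, h))
    return kept                                      # non-overlapping boxes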

Recognition of Tamil characters

The character images extracted from each of the lines from each trisection must be fed to the recognition module to be classified. But before that, the character images must be reordered based on the trisection number, line number, and character number, such that they follow the actual sequence as in the manuscript.

Bucketing

In this work, the segmented characters from different manuscript books are used to create our own dataset after bucketing them into the right folders, such that each folder has 600 samples. The ones which fell short of 600 characters were augmented with existing samples, such that all folders contained the same number of samples. This dataset is later used to train the CNN model and used for recognition.

CNN based recognition

The objective of the recognition module is to train the deep CNN model to classify the characters segmented by the system, even from other manuscripts written by different scribers. The character dataset created consists of 125 written classes.

The images of segmented characters are given as input to the CNN as pixel matrices. Then each of the input images is convolved by different filters and kernels of convolution layers. Each convolution layer filter slides over the input image and executes element-wise multiplication. Finally, all the results are summed up together to generate the output feature map. In this proposed work, the input image patch \({I}_{P}\) is convoluted using a number of input maps \({I}_{n}\) with filter size \({A}_{I}\times {A}_{I}\) and generates output maps \({O}_{m}\) with size \({A}_{O}\times {A}_{O}\). The input is adjusted with zero padding, which maintains the dimension of input I. Assume L to be the number of layers, \({Z}^{L}\) to be the output layer, and \({Z}^{L-1}\) to be the input layer. The output of \({Z}^{L-1}\) is given as the input to the \({Z}^{L}\). The xth output feature map FM of output layer L is computed by using Eq. (11) which performs convolution operation, \(\sigma \) performs a non-linear transformation, and the bias value \({b}_{x}^{L}\) is multiplied by each and every element of the matrix \({1}_{{A}_{O}}\).

$${FM}_{x}^{L}=\sigma \left(\sum_{y}{Z}_{y}^{L-1}*{w}_{yx}^{L}+{b}_{x}^{L}{1}_{{A}_{O}}\right)$$
(11)

The pooling layer is implemented after the convolution layer. Downsampling (for example, mean pooling) is executed in the pooling layer to reduce overfitting. The feature map of the pooling layer output (OFM) at element (c, d) of the generated output feature map \(x\) of layer L is denoted in Eq. (12), where \(0\le c,d\le {A}_{I}\).

$${OFM}_{x}^{L}\left(c,d\right)=\frac{{\sum }_{q=0}^{ds-1}\sum_{p=0}^{ds-1}{Z}_{x}^{L-1}(ds\times c+q,ds\times d+p)}{{ds}^{2}}$$
(12)

The feature data points generated from the pooling layer are expanded into single-column features to be fed into deep neural networks. The final layer is connected to a fully connected layer. The output of this final layer, as given in Eq. (13), represents the output map \({D}_{O}^{L}\) of a concatenated layer of size \({D}_{O}^{L}\times {({C}_{O}^{L-1})}^{2}\).

$${FM}_{x}^{L}\left(\text{0,0}\right)={FM}_{y}^{L-1}\left(c,d\right)$$
(13)
$$x=y\times {({C}_{O}^{L-1})}^{2}+\left(d-1\right)\times {C}_{O}^{L-1}+c$$
(14)

The probability of ath training sample of ‘N’ letters belonging to classes \({L}_{1}\), \({L}_{2}\), …, \({L}_{N}\) and parameter matrix \(\{{\theta }_{{L}_{1}}^{T}, {\theta }_{{L}_{2}}^{T},\dots , {\theta }_{{L}_{N}}^{T}\}\) with size \({L}_{N}\times {D}_{O}^{L}\) is given in Eq. (15).

$$P\left(N|{OFM}^{L(a)};\theta \right)=\frac{{e}^{{\theta }_{N}^{T}{d}^{{L}^{a}}}}{\sum_{z=1}^{{L}_{N}}{e}^{{\theta }_{z}^{T}{d}^{{L}^{a}}}}$$
(15)

The SoftMax classifier is used for multi-categorical character recognition. The cost function C for ath character belonging to the class \({L}_{N}\) is given in Eq. (16) and the parameter \(\theta \) is identified using the maximum likelihood estimation approach.

$${C}^{a}=P(N|{OFM}^{{L}^{a}};\theta )$$
(16)

The number of layers used in our proposed work and the configuration of each layer are given in Table 2. Initially, one surrounding layer of zeros is added around the input image for padding. Conv 1 then takes 3 input channels and applies 32 \(3\times 3\) filters to perform convolution and generate 32 output channels. The output of Conv 1 is given as the input to Conv 2 after again zero-padding by one layer; Conv 2 uses 32 input channels and applies 64 filters of size \(3\times 3\), generating 64 output channels. The Conv 2 output is in turn given as the input to Conv 3, once again after one layer of zero padding, which generates 128 output channels by applying 128 filters of size \(3\times 3\) to the 64 input channels. The pooling layer uses a pooling window of size 2 that moves over the input feature map with a stride of 2 and returns the maximum value within the window. In this manner, pooling reduces the input feature map by a factor of 2, retaining only the most significant features suitable for character recognition. Finally, the fully connected layer maps the input to one of the 125 classes based on these significant features. Accuracy and loss are computed to evaluate the performance of the proposed character recognition model.

Table 2 The configuration of CNN layers used for character recognition
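
A hedged PyTorch sketch of this configuration is shown below; the \(64\times 64\) input size and the placement of a pooling stage after every convolution are assumptions, as the text fixes only the filter counts, sizes, and padding.

import torch
import torch.nn as nn

class TamilCharCNN(nn.Module):
    def __init__(self, num_classes=125):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, stride=2),                  # 64x64 -> 32x32
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, stride=2),                  # 32x32 -> 16x16
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, stride=2),                  # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(128 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)       # expand into single-column features
        return self.classifier(x)     # SoftMax is applied inside the loss

logits = TamilCharCNN()(torch.randn(1, 3, 64, 64))   # output shape [1, 125]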

Results and discussion

This section describes the results of Pre-Processing, Character Segmentation, and Character Recognition from Palm leaf manuscripts with necessary examples.

Results of palm leaf pre-processing

The quality of pre-processing determines the overall accuracy of the model. Figure 8(a) shows a sample manuscript image. Figure 8(b) shows the same manuscript after denoising with the Fast-NLM method, while Fig. 8(c) shows the binarized image. Though there might not be much visual difference between (a) and (b), Fig. 8(d) shows the binarized image without Fast-NLMD, and a closer look at the comparison in Fig. 8(e) shows the significance of denoising.

Fig. 8
figure 8

Pre-processing results

Figure 9(a) shows the characters and the outermost contour, each in its respective colour, using the proposed method. Figure 9(b) shows the pre-processed image after cropping with a padding of 5 pixels.

Fig. 9
figure 9

Cropping and Punch hole removal

Figure 10(a) is chosen as the input to the punch hole removal algorithm, as it also has a visible edge part that evaded cropping. The punch holes and edges identified by sorting are marked in red in Fig. 10(b). Figures 10(c) and (d) show the created binary mask and the result of the XOR operation between the input and the mask, respectively. The final image is the desired image with the punch-hole impressions and the evaded edges (if present) removed.

Fig. 10
figure 10

Working on the punch hole removal approach

Similarly, manuscripts from other books considered in this study can also be fed as input to the algorithm which outputs their pre-processed versions. Figure 11 provides a few examples, displaying both the original images and their corresponding pre-processed outputs.

Fig. 11
figure 11

Other manuscripts considered in this study with their corresponding final pre-processed output images

The performance of pre-processing is measured using the Peak Signal-to-Noise Ratio (PSNR) value as per Eq. (17), where \({Intensity}_{Levels}\) denotes the number of intensity levels. MSE is the Mean Square Error computed based on Eq. (18), and RMSE is the root MSE. Here \(O\left(c,d\right)\) is the original image and \(P(c,d)\) is the pre-processed image, while ‘a’, ‘b’, ‘c’, and ‘d’ represent the number of rows, the number of columns, the row index, and the column index of the image, respectively. PSNR compares the original and reconstructed image after pre-processing to examine the quality of the reconstructed image.

$$PSNR={10log}_{10}\left(\frac{{\left({Intensity}_{Levels}-1\right)}^{2}}{MSE}\right)={20log}_{10}\left(\frac{{Intensity}_{Levels}-1}{RMSE}\right)$$
(17)
$$MSE=\frac{1}{ab}\sum_{c=0}^{a-1}\sum_{d=0}^{b-1}{(O\left(c,d\right)-P(c,d))}^{2}$$
(18)
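
For reference, Eqs. (17) and (18) reduce to the following NumPy sketch, assuming 256 intensity levels.

import numpy as np

def psnr(original, processed, levels=256):
    o, p = original.astype(float), processed.astype(float)
    mse = np.mean((o - p) ** 2)                       # Eq. (18)
    return 10 * np.log10((levels - 1) ** 2 / mse)     # Eq. (17)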

The PSNR values of different common denoising methods compared with our work are tabulated in Table 3. As compared with other denoising methods such as Gaussian Blur and Median Filtering, the proposed Fast Non-Local Means denoising coupled with image smoothening using Median filter offered a higher PSNR value of 38.86. It proves that the proposed method generates more effective results during pre-processing.

Table 3 PSNR values computed for different denoising methods

Results of segmentation

This section offers the sample results of segmentations with necessary discussions.

Results of HPP-based line segmentation

After trisection, each segment is individually given as input to the line segmentation stage. Figure 12 shows the HPP graph for a sample trisection. It is clear from the figure that it is not possible to choose a single local maximum within each cluster of local maxima.

Fig. 12
figure 12

Horizontal projection profile of a sample trisection

Figure 13 shows the results of smoothening the HPP of the trisection in Fig. 12. Figure 13(a) shows the HPP after Gaussian smoothening, and Fig. 13(b) superimposes it over the original HPP for better visual comparison. The maxima are identified in the smoothened profile and the lines are segmented at those points.

Fig. 13
figure 13

Smoothening of the HPP

These are further segmented and filtered using the 0/1 ratio as discussed previously. The sample output of line segmentation of a manuscript segment is given in Fig. 14. It has optimally segmented lines, which are used further for accurate character segmentation.

Fig. 14
figure 14

Five sample segmented lines

Results of character segmentation

A sample of the character segmentation process is displayed in Fig. 15, which has single-segmented characters only. To avoid double-segmented characters, overlapping threshold values are manually adjusted in this work.

Fig. 15
figure 15

Sample segmented characters

All the segmented characters were manually bucketed into respective classes to create the character dataset of 125 distinct writable characters as mentioned in Sect. "Bucketing".

Performance of multi-class CNN for Tamil character recognition

Figure 16 analyses the performance of the proposed multi-class model using accuracy in Fig. 16(a) and loss in Fig. 16(b). In both figures, the x-axis represents the number of epochs and the y-axis denotes the performance measure, such as accuracy or loss. In Fig. 16(a), the training, validation, and testing accuracies are plotted for 30 epochs. At the end, the proposed model achieves a training accuracy of 98.25%, a validation accuracy of 96.78%, and a testing accuracy of 96.04%. Similarly, in Fig. 16(b), the training, validation, and testing losses are plotted. Ultimately, the proposed model achieves a training loss of 0.10%, a validation loss of 0.19%, and a testing loss of 0.21%. Figure 17 shows the predicted classes for a few segmented characters.

Fig. 16
figure 16

Performance of multi-class CNN for Tamil character recognition

Fig. 17
figure 17

Classes predicted by CNN

Assessment with state-of-the-art methods

The comparative assessment is helpful to analyse how the performance of our proposed work differs or has improved as compared with existing related works in the literature. Here, the performance comparison is made in two different ways, (1) Comparison with already existing CNN-based approaches in Tamil and other languages, and (2) existing palm leaf character recognition systems.

Assessment of the proposed segmentation technique with other approaches used for manuscripts

The effectiveness of the proposed Augmented HPP technique combined with the character segmentation method is measured in terms of Segmentation accuracy, defined in Eq. (19).

$$Segmentation\; Accuracy=\frac{Count \;of\; correctly\; segmented\; letters\; in\; a\; Manuscript}{Total \;number\; of\; letters\; in\; that\; Manuscript} \times 100$$
(19)

Fifty manuscripts were randomly selected from each of the five books mentioned in Sect. "Data Acquisition". Hence a total of 250 manuscripts were used to determine the segmentation accuracy of the proposed method. The total number of characters in each manuscript was recorded, and the segmentation accuracy was calculated based on the number of characters accurately segmented in each manuscript. The proposed method achieved an impressive average segmentation accuracy of 98.25%. The segmentation accuracy of the proposed technique is compared with existing state-of-the-art approaches in Table 4, and the same has been visualised in Fig. 18 for a more intuitive comparison. The results clearly show that the proposed technique outperforms all of them, demonstrating its superior effectiveness and reliability for character segmentation in manuscripts.

Table 4 Comparison of segmentation accuracy of the proposed model with existing techniques
Fig. 18
figure 18

Visualisation of several techniques comparing their segmentation accuracies

Assessment of existing CNN-based approaches in Tamil and other languages

A comparison of existing multi-class CNN-based palm leaf character recognition systems with our proposed system is presented in Table 5. For this analysis, multi-class CNNs used for recognizing characters from different languages in palm-leaf manuscripts were considered. Works such as that of Sabeenian et al. [20] achieve slightly higher accuracy, but only 50 manuscripts written by the same scriber were considered, and the CNN was trained to classify only 20 characters. In contrast, the proposed work can recognize all 247 letters and 12 numeric digits in the Tamil language and achieved an accuracy of 96.04%. The proposed system is also trained on samples from various scribers, making it suitable for real-life applications where the scriber of a manuscript might be unknown.

Table 5 Comparison of multi-class CNN based existing Palm leaf character recognition systems

Assessment of existing palm leaf character recognition systems other than CNN

The comparative assessment of existing palm leaf character recognition systems other than CNN with our proposed modified CNN is illustrated in Fig. 19. This comparison demonstrates that the features extracted by the CNN are sufficient for the detection of Tamil characters. Methods such as B-spline curve-based optimization [22], 3D features followed by transform-based methods [23], and 3D features followed by the KNN lazy learner [25] are time-consuming because features are extracted with separate methods before classification. The same is true for other methods that extract 3D features and employ further feature extraction over them, including Histogram Profile [28], Cell-wise Count [29], and Histogram of Gradients (HoG) [30]. Using KNN with feature optimization techniques like Differential Evolution (DE) does not produce a comparatively high accuracy either. In contrast, the CNN layers are configured to extract features sufficient for detecting Tamil characters during palm leaf character recognition, achieving a peak accuracy of 96.04%.

Fig. 19
figure 19

Assessment of existing palm leaf character recognition systems

Conclusion and future perspectives

In this work, an intelligent character segmentation method coupled with deep learning-based recognition was proposed. The main focus of this work is to create an automated manuscript reader that can work efficiently for real-time archaeological and historical applications. The first step of acquiring palm leaf manuscripts is performed using scanners to scan the Agasthiyar Vaithiya Kaviyam, Ramayanam, Thruvilayadal, and other medical palm leaf sets. In the second step, pre-processing detected and removed noise, cropped the content, and removed the punch holes. In the third step, the row-overlapped characters are handled by a three-stage segmentation process. The Augmented HPP method proposed in this work can handle segmentation cases that were previously not considered by existing segmentation techniques. The novel punch hole removal algorithm effectively locates and removes the punch-hole impressions in the manuscript images. An automated content cropping algorithm is also introduced to reduce the manual work required in real-time applications. Finally, Tamil character recognition is performed by a modified CNN with 125 classes. A significant feature of this CNN is that it can recognize all 247 letters and 12 numeric digits in the Tamil language with only 125 classes, which significantly reduces the complexity of the network. In the performance analysis, the proposed intelligent Tamil character recognition model attained an average segmentation accuracy of 98.25%, a recognition accuracy of 96.04%, and a loss of 0.21%. Finally, an assessment against existing methodologies was carried out, showing that existing feature extraction methods, machine learning models, and CNN variants have lower recognition power. In the future, the proposed model will be implemented as a hardware product for recognizing Tamil characters from palm leaves. The technology can also be retrained for other languages, making it useful to anyone involved in archaeological or historical research that includes reading manuscripts.

Availability of data and materials

The datasets used and/or created during the current study are available from the corresponding author on reasonable request.

References

  1. Jailingeswari I, Gopinathan S. Tamil handwritten palm leaf manuscript dataset (THPLMD). Data Brief. 2024;53:110100. https://doi.org/10.1016/j.dib.2024.110100.


  2. Elsamanoudy G, Abdelaziz Mahmoud NS, Alexiou P. Handwoven interior accessories from palm leaves as sustainable elements. J Cult Herit Manag Sustain Develop. 2024. https://doi.org/10.1108/JCHMSD-05-2023-0054.


  3. Khadijah ULS, Winoto Y, Shuhidan SM, Anwar RK, Lusiana E. Community participation in preserving the history of heritage tourism sites. J Law Sustain Develop. 2024;12(1):e2504. https://doi.org/10.55908/sdgs.v12i1.2504.


  4. Salman F. Holy quranic manuscripts: examining historical variants and transmission methods. J Islamic Studies. 2024;7(1):1163–77. https://doi.org/10.31943/afkarjournal.v7i1.793.


  5. Kesiman MWA, Valy D, Burie JC, Paulus E, Sunarya IMG, Hadi S, Ogier JM. Southeast Asian palm leaf manuscript images: a review of handwritten text line segmentation methods and new challenges. J Electron Imaging. 2017;26(1): 011011. https://doi.org/10.1117/1.JEI.26.1.011011.


  6. Lian X, Yu C, Han W, Li B, Zhang M, Wang Y, Li L. Revealing the Mechanism of Ink Flaking from Surfaces of Palm Leaves (Corypha umbraculifera). Langmuir. 2024;40(12):6375–83. https://doi.org/10.1021/acs.langmuir.3c03946.


  7. Wang Y, Wen M, Zhou X, Gao F, Tian S, Jue D, Zhang Z. Automatic damage identification of Sanskrit palm leaf manuscripts with SegFormer. Herit Sci. 2024;12(1):8. https://doi.org/10.1186/s40494-023-01125-w.

  8. Jindal A, Ghosh R. An optimized CNN system to recognize handwritten characters in ancient documents in Grantha script. Int J Inf Technol. 2023;15(4):1975–83. https://doi.org/10.1007/s41870-023-01247-1.

  9. Nair BB, Rani NS. HMPLMD: Handwritten Malayalam palm leaf manuscript dataset. Data Brief. 2023;47:108960. https://doi.org/10.1016/j.dib.2023.108960.

  10. Devi SG, Vairavasundaram S, Teekaraman Y, Kuppusamy R, Radhakrishnan A. A deep learning approach for recognizing the cursive Tamil characters in palm leaf manuscripts. Comput Intell Neurosci. 2022. https://doi.org/10.1155/2022/3432330.

  11. Thuon N, Du J, Zhang J. Improving isolated glyph classification task for palm leaf manuscripts. In: International Conference on Frontiers in Handwriting Recognition. Cham: Springer. 2022;13693: pp. 65–79.

  12. Basha SJ, Veeraiah D, Pavani G, Afreen ST, Rajesh P, Sasank MS. A novel approach for optical character recognition (OCR) of handwritten Telugu alphabets using convolutional neural networks. IEEE. 2021; pp. 1494–1500. https://doi.org/10.1109/ICESC51422.2021.9532658.

  13. Sánchez-DelaCruz E, Loeza-Mejía CI. Importance and challenges of handwriting recognition with the implementation of machine learning techniques: a survey. Appl Intell. 2024;54:6444–65. https://doi.org/10.1007/s10489-024-05487-x.

  14. Jindal A, Ghosh R. A semi-self-supervised learning model to recognize handwritten characters in ancient documents in Indian scripts. Neural Comput Appl. 2024;36:11791–808. https://doi.org/10.1007/s00521-023-09372-5.

  15. Abbas Ali Alkhazraji A, Khudair B, Mahdi Naser Alzubaidi A. Ancient textual restoration using deep neural networks. In: BIO Web of Conferences. 2023;97: pp. 64–69. https://doi.org/10.1109/AICCIT57614.2023.10218159.

  16. Haldorai A, Babitha Lincy R, Suriya M, Balakrishnan. An End-to-End Offline Handwritten Tamil Text Identification Using Modified RAdam Optimizer with Effective Post-processing Techniques. 2024; pp. 317–340. https://doi.org/10.1007/978-3-031-53972-5_16

  17. Surinta O, Chamchong R. Image Segmentation of Historical Handwriting from Palm Leaf Manuscripts. In: Shi Z, Mercier-Laurent E, Leake D (eds) Intelligent Information Processing IV. IIP 2008. IFIP – The International Federation for Information Processing, 2008; 288: 182–189. https://doi.org/10.1007/978-0-387-87685-6_23

  18. Chamchong R, Chun C, Fung C. Character segmentation from ancient palm leaf manuscripts in Thailand. 2011. p. 16–17. https://doi.org/10.1145/2037342.2037366.

  19. Mohamed Sathik M, Spurgen Ratheash R. Text line segmentation in Tamil language palm leaf manuscripts – a novel approach. 2021;54(4):297–304.

  20. Sabeenian RS, Paramasivam ME, Anand R, Dinesh PM. Palm-leaf manuscript character recognition and classification using convolutional neural networks. In: Peng SL, Dey N, Bundele M, editors. Computing and Network Sustainability Lecture Notes in Networks and Systems. Springer: Singapore; 2019. p. 397–404.

  21. Kumar SS, Santhosh B, Guruakash S, Savaridass MP. AI Based Tamil Palm Leaf Character Recognition. Third International Conference on Smart Technologies, Communication and Robotics. 2023;1:1–7.

  22. Athisayamani S, Singh AR, Athithan T. Recognition of ancient Tamil palm leaf vowel characters in historical documents using B-spline curve recognition. Procedia Computer Science. 2020;171:2302–9. https://doi.org/10.1016/j.procs.2020.04.249.

  23. Narahari SP, Vijaya LTR, Rama K, Koteswara RNV. Modeling of palm leaf character recognition system using transform-based techniques. Pattern Recogn Lett. 2016;84:29–34. https://doi.org/10.1016/j.patrec.2016.07.020.

  24. Sudarsan D, Sankar D. Development of an effective character segmentation and efficient feature extraction technique for Malayalam character recognition from palm leaf manuscripts. Sādhanā. 2023;48(3):156. https://doi.org/10.1007/s12046-023-02181-5.

  25. Lakshmi TRV, Sastry PN, Krishnan R, Rao NVK, Rajinikanth TV. Analysis of Telugu palm leaf character recognition using 3D feature. In: International Conference on Computational Intelligence and Networks. 2015; pp. 36–41. https://doi.org/10.1109/CINE.2015.17.

  26. Sastry PN, Krishnan R. Isolated Telugu Palm leaf character recognition using Radon Transform—A novel approach. In: 2012 World Congress on Information and Communication Technologies. IEEE. 2012; pp. 795–802.

  27. Vijaya TR, Panyam N, Kanth R. A novel 3D approach to recognize Telugu palm leaf text. Eng Sci Technol Int J. 2016;20:1. https://doi.org/10.1016/j.jestch.2016.06.006.

  28. Vijaya Lakshmi TR, Sastry PN, Rajinikanth TV. Feature selection to recognize text from palm leaf manuscripts. 2018;12:223–9. https://doi.org/10.1007/s11760-017-1149-9.

  29. Vijaya Lakshmi TR, Sastry PN, Rajinikanth TV. Feature optimization to recognize Telugu handwritten characters by implementing DE and PSO techniques. 2017;516(2):397–405. https://doi.org/10.1007/978-981-10-3156-4_41.

  30. Vijaya Lakshmi TR, Sastry PN, Rajinikanth TV. Telugu character recognition for degraded palm leaf documents using optimal feature selection techniques - a 3D approach. 2017; 10(5): 223–230. https://doi.org/10.1504/IJSISE.2017.087764

  31. Jyothi RL, Abdul Rahiman M. A multilevel CNN architecture for character recognition from palm leaf images. In: Intelligent Computing and Communication: Proceedings of 3rd ICICC 2019, Springer: Singapore. 2020; 3: pp. 185–193.

  32. Sivan R, Palaniswamy S, Pati PB. Malayalam Character Recognition from Palm Leaves Using Deep-Learning, OITS International Conference on Information Technology (OCIT), Bhubaneswar, India, 2022; pp. 134–139, https://doi.org/10.1109/OCIT56763.2022.00035.

  33. Sivan R, Palaniswamy S, Pati PB. Comparative Study of Deep Learning models to Recognize Palm Leaf Malayalam Characters. In: 6th International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS). 2022; pp. 1–6. https://doi.org/10.1109/CSITSS57437.2022.10026392.

  34. Puarungroj W, Kulna P, Soontarawirat T, Boonsirisumpun N. Recognition of Thai Noi characters in palm leaf manuscripts using convolutional neural network. In: Asia-Pacific Conference on Library & Information Education and Practice (A-LIEP). 2019; pp. 408–415.

  35. Puarungroj W, Boonsirisumpun N, Kulna P, Soontarawirat T, Puarungroj N. Using deep learning to recognize handwritten Thai Noi characters in ancient palm leaf manuscripts. In: Digital Libraries at Times of Massive Societal Transition: 22nd International Conference on Asia-Pacific Digital Libraries, ICADL. 2020;12504(22): pp. 232–239. https://doi.org/10.1007/978-3-030-64452-9_20.

  36. Antony PJ, Savitha CK. Segmentation and recognition of characters on Tulu palm leaf manuscripts. Int J Comput Vision Robotics. 2019;9(5):438–57.

  37. Shikha C, Sonu M, Vivek S. Ancient text character recognition using deep learning. Int J Eng Res Technol. 2020;13(9):2177. https://doi.org/10.37624/IJERT/13.9.2020.2177-2184.

Acknowledgements

The first author is thankful to Anna University Chennai for supporting the research.

Funding

Not applicable.

Author information

Contributions

S.U.M. conceived the presented idea, performed data collection, designed the innovative aspects of the study, and finalized the modules and techniques to be used. Additionally, S.U.M. wrote the manuscript with inputs from P.U.M. G.R.S.A. assisted in designing the innovative aspects of the study and played a key role in conceiving and developing the code. P.U.M. verified the analytical methods, and supervised the findings of the work and approved the final manuscript.

Corresponding author

Correspondence to S. Uma Maheswari.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Maheswari, S.U., Maheswari, P.U. & Aakaash, G.R.S. An intelligent character segmentation system coupled with deep learning based recognition for the digitization of ancient Tamil palm leaf manuscripts. Herit Sci 12, 342 (2024). https://doi.org/10.1186/s40494-024-01438-4

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s40494-024-01438-4

Keywords