KolamNetV2: efficient attention-based deep learning network for tamil heritage art-kolam classification

Sasithradevi, A.; Sabarinathan; Shoba, S.; Roomi, S. Mohamed Mansoor; Prakash, P.

doi:10.1186/s40494-024-01167-8

Research
Open access
Published: 19 February 2024

Heritage Science volume 12, Article number: 60 (2024)

Abstract

In India, kolam, commonly referred to as rangoli, is a traditional style of art. It involves using rice flour, chalk, or coloured powders to create elaborate patterns and motifs on the ground. Kolam is a common daily ritual in many regions of India, especially in South India, where it is seen as a significant cultural tradition and a means to greet visitors. Unfortunately, as a result of people’s hectic lives nowadays, the habit of drawing kolam on a regular basis is dwindling. The art of making kolam patterns is in danger of disappearing because many individuals no longer have the time or space to practise it regularly. Therefore, it is imperative that this ancient art be conserved and digitally documented in order to educate the next generation about kolam and its classifications. Deep learning has become a powerful technique because of its ability to learn from raw image data without a manual feature engineering process. In this article, we attempt to classify the types of kolam images using a proposed deep architecture called KolamNetV2. KolamNetV2 comprises EfficientNet and attention layers, aiming for high accuracy with minimal training data and parameters. We evaluated KolamNetV2 on its ability to learn the various types in our challenging kolam dataset. The experimental findings show that the proposed network achieves a precision of 0.7954, a recall of 0.7846, an F1 score of 0.7854, and an accuracy of 81%. We compared our results with state-of-the-art deep learning methodologies to demonstrate the capability of the proposed network.

Introduction

India is a multicultural country with a deep cultural history, and its cultural richness stems from ancient traditions. One such distinctive ancient practice is creating decorative floor murals, which go by many names across Indian states: “joti” in Oriya, “satiya” in Gujarati, “alpna” in Bengali, “kolam” in Karnataka, “sanskara bharathi” in Maharashtra, “kolam” in Tamil, and “pookolam” in Kerala. Even though kolam goes by several names, its main purpose is to bring wealth and joy. The ethnic diversity of kolam is shown in Fig. 1. Kolam is a sign of welcome for gods and nobility. According to the epics, kolam draws in gods, and beautiful artwork is painted on the ground to welcome them into homes. India is the birthplace of kolam, an auspicious and exquisite art that dates back to 5000 BC [1, 2]. The Indian ancestors firmly believed that earth, water, fire, air, and space are the primary components involved in every part of human existence. They used to mimic these components’ geometric structures as a means of shielding themselves from harmful vibrations. Kolam, which is believed to disperse uplifting cosmic energy through pictorial patterns, is traditionally drawn in front of the house. The Ramayana explains the meaning of kolam: to keep Sita safe, Lakshmana surrounds the cottage with a ring of celestial powers. The Ramayana’s Sundara Kanda describes the splendour of the Vibishana Kingdom and makes reference to the exquisite kolam art. In the Mahabharata, Krishna instructed Subhatra to draw kolam on a spotless floor with symbols such as fruits, flowers, and diamonds in order to obtain God’s blessing. Kolam is drawn by Subhatra on Sankha Chakra Gopadma. In order to sense Krishna’s presence, Gopikas would also create images of him on a spotless floor. Tamil Nadu considers the month of Margazhi to be among the holiest months for worshipping Lord Vishnu. 
In order to appease God and get good vibes throughout the month of Margazhi, women would adorn their entrance doors with exquisite kolam designs in the early morning [2, 3]. The purpose of kolam is to honour and give appreciation to the soil goddess who carries our bodies and possessions. The three primary locations where kolam is drawn are the temple entrance, the kitchen shrine, and the entryway. Kolam is often regarded as a holy work of art, especially when drawn by women in temples; people avoid walking on it [4]. Kolam is a kind of floor art that is believed to purge bad luck from a dwelling and offer strong divine energy. According to several investigations [5,6,7,8,9,10], kolam has a traditional value and style. Before the advent of modern civilization, kolam art was a daily routine practised by women. The ingredients used to draw kolam mostly include rice flour, turmeric, and lentils. The procedure for creating a typical kolam drawing includes sweeping, wetting the floor, coating it with cow dung, dotting the surface, and creating the design with rice flour. In Indian tradition, rice is considered a symbol of wealth. It is also thought to provide food for the animals that come to homes. However, artificially tinted flour is commonly used in contemporary India. Drawing the exquisite works of art known as kolam is considered a kind of meditation. It quiets the minds of both the observer and the painter. The kolam painter adheres to the whole custom of purification, which offers a sense of mental and spiritual purification. In order to replicate Ottawa aasana’s pose, the kolam painter must stand with feet apart and the upper body slightly bowed towards the ground. To produce the enchanting kolam designs, the painter adopts this stance: holding a bowl of flour in one hand, they slide the flour onto the cleansed floor. 
In addition to indirectly including many yoga postures like chair, squat, and thunderbolt positions, the kolam practice stretches the body. Combining these poses with fresh air enhances blood flow, mobility, activity level, and flexibility of the body. The daily routine of drawing kolam takes around twenty minutes. Drawing kolam on a daily basis is reported to improve attention, concentration, and memory. Additionally, coming up with fresh kolams stimulates creative thinking. Hence, kolam offers good vibrations to the body, mind, and soul because of its strong traditional values. The curving line around the dots in the kolam pattern represents the infinite journey that mankind takes throughout life [7]. It demonstrates how ancient Indians coordinated yoga with everyday routines and occupations.

Mathematical understanding of kolam

Kolam has deep mathematical underpinnings via spatial dots, pattern imagination, and algebraic comprehension, in addition to its cultural relevance [11]. Kolam designers use geometric measurements of the floor to lay out their designs, and they often use their mathematical knowledge to expand a small geometric motif into a larger one. This extension process corresponds to the idea of iteration in computer programming. Dots arranged in a matrix, and the straight and curved lines that link them, are the basic components of kolam. The array may be hexagonal, triangular, octagonal, or another shape. The array of dots is rhythmically repeated, which helps to create geometric designs. Conventional wisdom claims that the symmetrical arrangement of dots depicts a star constellation in the sky, whereas scientists interpret it as a lattice model [7]. Kolam is shown to be a limited collection of combinational possibilities of lines that join. According to [12], three required steps must be followed to finish a kolam pattern (see Fig. 2):

Step 1: All dots in the array constellation should be connected through line or curved segments.
Step 2: Line segments must not overlap over a finite length.
Step 3: All connected segments must be closed.

An infinite number of kolam can be drawn for N = 1 dots. Different mathematical ideas such as graphs, array patterns, matrices and images, tiling patterns, monoidal algebra, topological models of knots, and self-avoiding mirror-curve routes have been identified by researchers in kolam patterns or motifs. The computational and mathematical properties of kolam were described in depth by the authors in [11] in terms of patterns, symmetry, mirror curves, fractals, cyclic order, and iteration. Additionally, an education model of rangoli designs and its cultural practices were investigated [13]. According to Naranan [14], the Fibonacci sequence arises when the kolam matrix is generated. The study found that a finite number of Fibonacci-based kolam motifs may be produced in grids that are square, rectangular, or diamond in shape. Kolam is classified as fractal geometry [15, 16] because of its endless line segments across a limited area, self-similarity, complexity, and large number of features.

Kolam is a holy theme that is a part of India’s cultural legacy, rich in tradition, science, and mathematical qualities. Because of cultural change and the pressures of working life, the everyday practice of kolam is in danger of disappearing. Although a small number of Instagram users and content producers still share kolam, regular practice is steadily decreasing. The kolam, a visual ethnographic framework of the country, is becoming an extremely important piece of cultural property that has to be digitally recorded and conserved. Photographs are becoming a crucial part of the digital tools that future generations will use for storage and distribution. Understanding historical art, kolam pictures, and their many digital forms is essential to passing on to our descendants the cultural customs upheld by the ancient Indians. As a preliminary attempt to investigate the Tamil traditional art form of kolam through vision and machine learning techniques, KolamNet was introduced in [17]. In order to increase KolamNet’s ability to discriminate between 13 distinct kinds of kolam, we improved KolamNet as KolamNetV2. The remainder of the article is arranged as follows. Related work on the classification of cultural heritage images is reviewed in Sect. “Related works”. The kolam dataset collection procedure is explained in Sect. “Kolam dataset creation”. Sect. “Architecture of KolamNetV2 for kolam images classification” presents the proposed approach for classifying a given kolam image. Sect. “Experimental findings and discussion” assesses the proposed approach using performance indicators including accuracy, precision, recall, and F1 score. The work is concluded in Sect. “Conclusion”.

Related works

There are two categories of cultural heritage: tangible and intangible. Paintings, sculptures, monuments, and archaeology are examples of tangible heritage. Folk and traditional dances and art exhibitions are examples of cultural practices associated with intangible heritage. Kolam, the practice of creating floor art, is a material part of cultural history. Cultural heritage (CH) includes things like monuments, artefacts, groups of buildings and places, and museums. Numerous qualities are associated with these objects, such as historical, symbolic, anthropological, artistic, scientific, and social importance. Digitising heritage is an essential part of maintaining and restoring cultural assets. The first essential stage in the digitization process is classifying digital photos. Since the colours, textures, and shapes of many CH asset photos are similar, classification is challenging.

Convolutional neural network based methods have been developed to categorise heritage images [18, 19]. Historical murals have high artistic merit and cover a wide range of important subjects. The accurate classification of these paintings is a difficult task, even for experts in the field [20]. A multi-channel separable network model (MCSN), which uses the GoogLeNet network model as its basis, uses a compact convolution kernel to extract features from the shallow-layer background. To improve the classification rate and remove unwanted characteristics in the picture, the training of classification models was focused on pertinent architectural material within the image [21]. Query-based picture label retrieval and image classification using transfer learning algorithms are followed in [21], and a method for creating natural language descriptions of heritage photographs is designed in [22]. A machine learning technique for classifying Thai architecture images is discussed in [23]. A retrieval model is also developed which appends new heritage images to defined categories efficiently. The Multi-Modality Cross Attention (MMCA) Network for picture and text matching, which integrates sentence words and image regions into a single deep model, is introduced in [24]. The MMCA architecture incorporates a cross attention mechanism that exploits both the inter-modality interaction between image regions and sentence words and the intra-modality relationship within each modality. A Content Based Image Retrieval (CBIR) system that retrieves relevant photos using the Deep Search and Rescue (SAR) Algorithm is presented in [25]. The suggested Deep Neural Network-SAR (DNN-SAR) involves several procedural stages, including preprocessing, multiple feature extraction, feature fusion, grouping, and classification. 
Image segmentation relies heavily on partitional clustering, and K-means is a well-known method [26], but it is highly sensitive to noise and prone to converging to local optima depending on the initial cluster centers. Moreover, the K-means algorithm’s computation time is increased by the repeated calculation of distances between cluster centers and pixels.

A thorough review of the literature on classifying historical images reveals that deep learning techniques are now the mainstay of solutions. Building a system capable of learning about and predicting the object of interest is the main objective of deep learning techniques. A CNN [18] differs from the original artificial neural network in that it adds hidden layers to the artificial neuron network. The CNN architecture has many layers, including subsampling and convolutional layers. Recently, most researchers have shown interest in applying transfer learning models such as GoogLeNet, AlexNet, and VGGNet to open vision-based challenges. Since digital archiving of kolam pictures is our area of interest, a practical and lightweight deep neural network is necessary. The plan is to build a mobile application for recording kolam pictures by integrating the designed model into a mobile platform. To begin our study on kolam images, we gathered images belonging to five different categories: swastik, footprint, geometric, animal, and plant themes. Motivated by the strength and lightweight design of the EfficientNet model [27], we created the first version, KolamNet [17]. Afterwards, we extended the dataset to thirteen categories covering most of the frequently occurring kolam types. The significance of kolam documentation and categorization stems from its application to gender aesthetics, community interaction, ethnographical predominance, and mathematical treasure. It is also possible to create an automated vision-based process that verifies the quality of rangoli or kolam artwork. To the best of our knowledge, no study has looked at vision-based understanding for recording and preserving the traditional art form known as kolam. We propose KolamNetV2 as a hybrid of EfficientNetB4, dense, and parallel channel attention blocks. 
We infer that EfficientNet uses compound scaling and stacking of neural layers to learn the information-rich characteristics of kolam pictures. The network benefits from DenseNet-style feature flow, and attention blocks help identify the important characteristics of kolam pictures.

The novel contributions present in this article are:

We address kolam as a fine art that needs digital heritage documentation.
We establish a new kolam dataset of thirteen common categories.
We design an efficient deep neural network model, KolamNetV2, for kolam image classification on the kolam dataset.
We conduct a comparative analysis of the intrinsic features of every pretrained model in the classification of kolam pictures.

Kolam dataset creation

A variety of well-known cultural heritage archives, including Wikiart, the Society of Architectural Historians in association with Artstor in Chicago (SAHARA), and the Cultural Object Name Authority (CONA), offer organized collections of resources for artefacts, architectural sculptures, paintings, drawings, furniture, textiles, and visual records related to the arts. While Wikiart offers 250,000 art photos collected from over 100 countries worldwide, SAHARA features only 100,000 photographs pertaining to architecture and landscapes. Additionally, collections of historical paintings, drawings, sculptures, and sketches from all around the globe have been made available by museums like the Metropolitan Museum of Art (Met) and EgsArt. In particular, artworks from 5000 years ago may be found in the Met collection [28]. The Indian government has taken a commendable step with Digital Hampi [29], using digital elements to preserve and restore Hampi’s ancient values and legacy. It facilitates information sharing and visual searches regarding sculptures and inscriptions. Although these digital archives document historical materials, none of them offers a comprehensive picture collection for kolam art.

The lack of available data motivated the creation of a fresh dataset for several types of kolam pictures. According to the findings of the study referenced as [30], it has been observed that kolam exhibits many categorizations depending on the motifs used, including but not limited to butterfly, cow, floral, fish, footprint, geometric, kamal, kurma, loop, naga, parrot, peacock, and Tulsi. Figure 3 depicts a collection of diverse kolam designs. Butterfly kolam denotes fertility and is believed to be good fortune. The cow has a revered status within the Hindu religion and is considered a symbol of agricultural sustenance. The Hindu community venerates the cow as a deity due to its role in fulfilling the nutritional requirements of young children via the provision of dairy products. Furthermore, panchakarma, which involves the excrement of cow, is a prominent component in the majority of rites. The floral kolam is symbolic of a promising future and plenty. The creeper motif symbolizes the ongoing expansion of the familial unit. Floral themes have a significant position in the ceremonial reception of the bride and groom. Many floral designs have a resemblance to the Lotus or Kamal, which is recognized as the national flower of India. The lotus plant originates from muddy environments, deriving its nourishment from the submerged dirt. As it grows, it emerges from the water and attains higher heights, ultimately blossoming in vibrant hues. The provision of confidence and hope for mankind to overcome challenging conditions is observed. The practice of meditation on the lotus symbol is used by temple priests as a means to attain Vedic strength and vitality. The theme of fish has significant importance in Kolam, since it is seen as a divine blessing bestowed by the deity of the sea. The significance of Lord Vishnu in Hindu scripture is worth mentioning. 
The piscine manifestation of Lord Vishnu is recognized as one of the avataric forms assumed by the deity, with the primary purpose of safeguarding people from tempestuous weather conditions. Furthermore, it is worth noting that the fish symbol has significance since it is intricately inscribed on the hand of Lord Buddha. The footprint is regarded as a sacred symbol and is prominently used in a wide range of religious rites. The footprint is often regarded as a sign of Goddess Mahalakshmi in the northern region of India, representing the deity’s association with riches, plenty, and success. In the southern region of India, it is customary to create a representation of a footprint on the sacred day of Janmashtami. This footprint symbolises the footsteps of Lord Krishna as he enters households during this auspicious celebration. The geometric motif encompasses a collection of straight and curved lines that are arranged in a pattern, such as the swastika, star, square, diamond, rectangle, and other similar shapes. The majority of yantra designs use this particular theme for the sake of meditation and ritualistic practices. The use of geometric motifs has the potential to enhance human psychic abilities. During auspicious events, it is customary to depict this symbol in temples and residences. The symbolism of Kurma represents the tortoise as one of the oldest incarnations of Lord Vishnu. The tortoise has a significant position within the cosmological beliefs of both Hinduism and Chinese culture. The three strata of the tortoise symbolise the terrestrial realm, the atmospheric domain, and the celestial sphere. The loop motif is a significant category of kolam designs that has mathematical attributes and incorporates knots. A daily ritual involves the act of drawing it at the door of one’s residence. This symbol is recognised for its function in safeguarding individuals from malevolent adversaries. 
The loop pattern is seen in both the archaeological site of Mohenjodaro and several celestial structures found in Europe. The Naga sign is said to have resemblance to the DNA structure found in both humans and Kananaskis. In addition to the customary task of painting at home, priests create an expansive naga mandala during homa rituals. The parrot, a visually appealing avian species, is said to be associated with the deity Goddess Andal. This symbolises the concepts of love, serenity, and desire. The Peacock is renowned for its aesthetic appeal and has the distinction of being designated as the National bird of India. In Hindu mythology, the peacock is regarded as the chosen vehicle of both Goddess Saraswathi and Lord Murugan. It is believed to possess the ability to safeguard people from the malevolent effects of the evil eye. Tulsi, often known as basil, is a botanical species recognised for its significant therapeutic properties and deep rooted cultural significance. The object in question is seen as a representation that exalts the deity Mahalakshmi. Table 1 presents the quantitative data pertaining to the quantity of photos collected for each kind of kolam.

Table 1 Details of kolam dataset

The images were gathered from online search engines using keywords corresponding to the various forms of kolam. We also obtained assistance from neighbouring individuals, religious establishments, and sites of competitive events. Collecting a sufficient number of photos for categories such as kurma, cow, fish, naga, and footprint posed significant challenges. Our collection strategy also included kolam pictures from printed publications. The dataset presents many challenges owing to its diverse backgrounds, the prevalence of the same standard hues across most types, the inclusion of both coloured and uncoloured versions of the same type of kolam, and variations in viewpoint. Given these obstacles, a highly effective deep learning network is needed to classify the photos in the kolam dataset.

Architecture of KolamNetV2 for kolam images classification

Our proposed pipeline consists of two major phases, namely the offline phase and the query phase. During the offline phase, KolamNetV2 learns the different types of kolam images, and the learned knowledge is stored for classification. In the query phase, the user submits a kolam image of unknown category, and the pipeline classifies it using the learned knowledge base.

The proposed KolamNetV2 deep network architecture comprises EfficientNet, a dense channel attention module (DCAM) [30] and a parallel spatial channel attention module (PSCM) [31]. EfficientNet was proposed to circumvent the usual scale-up bottlenecks of convolutional neural network topologies. DCAM and PSCM are integrated into the EfficientNet (EN) architecture in this paper to learn the high-level spatial properties of kolam pictures. Mathematically, the dots and the non-overlapping, finite-length lines and curves in kolam can be represented as distinct patterns and structures. In our KolamNetV2 model, convolutional layers learn filters that detect these features. Dots are represented as localized points, lines as linear structures, and curves as combinations of specific convolution patterns, enabling the understanding of spatial relationships and arrangement within the image. By incorporating these representations, KolamNetV2 captures the intricate details of kolam patterns during training and classification. The overall deep architecture of KolamNetV2 is shown in Fig. 4.

EfficientNet architecture

EfficientNet (EN) is a convolutional neural network design that balances model accuracy against computational constraints. It achieves this by scaling up the depth, width, and resolution of the network together using preset scaling coefficients, a technique known as compound scaling. These coefficients are found by a grid search that balances memory and hardware requirements across the scaled CNN components. Because the scaling components of EN are autonomous, the design is more robust than conventional scaling strategies. The compound scaling method also offers a better balance between the network’s channels and layer count. When processing large input images, EN uses wider receptive fields and increased network depth to capture fine-grained features, compensating for the additional layers that conventional networks need. The neural architecture of EN is designed to optimize the model’s accuracy and latency. As a result, it can be deployed in mobile apps to categorize and validate kolams quickly while maintaining high accuracy. With θ denoting the compound scaling coefficient, the uniform scaling of depth (d), width (w), and resolution (r) is represented as,

$$d = \alpha ^{\theta } ,\;w = \beta ^{\theta } ,\;r = \gamma ^{\theta }$$

(1)

where α·β²·γ² ≈ 2. Grid search is used to find these constants. The EfficientNet family of models, with versions ranging from ENB0 to ENB7, is scaled using the compound scaling method, starting from the ENB0 architecture as its basis. We compared the effectiveness of various EfficientNet versions in this study and found that EfficientNetB4 was best suited to the problem. The EfficientNetB4 design is assembled from several basic units, whose structure is shown in Fig. 5. For this architecture, the compound scaling method assigns a depth scaling coefficient of 1.8, a width scaling coefficient of 1.4, and a resolution setting of 300.
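
As a rough illustration of Eq. (1), the following Python sketch computes the depth, width, and resolution multipliers for a given compound coefficient θ. The base constants α = 1.2, β = 1.1, and γ = 1.15 are the values reported for the baseline network in the original EfficientNet paper, not values stated in this article, and the function name `compound_scale` is hypothetical.

```python
# Compound scaling sketch: d = alpha**theta, w = beta**theta, r = gamma**theta.
# ALPHA, BETA, GAMMA are assumed from the original EfficientNet paper,
# not from this article.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(theta: float) -> dict:
    """Return depth/width/resolution multipliers for coefficient theta."""
    depth = ALPHA ** theta
    width = BETA ** theta
    resolution = GAMMA ** theta
    # Compute cost grows roughly with depth * width^2 * resolution^2.
    flops_factor = depth * width ** 2 * resolution ** 2
    return {
        "depth": depth,
        "width": width,
        "resolution": resolution,
        "flops_factor": flops_factor,
    }


for theta in range(4):
    scaled = compound_scale(theta)
    print(theta, {name: round(value, 3) for name, value in scaled.items()})
```

Because α·β²·γ² is close to 2 for these constants, each unit increase in θ roughly doubles the estimated compute cost.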

Dense channel attention module

DCAM helps the entire network learn the essential, meaningful, and distinct patterns in kolam image categories. Along the channel axis of DCAM, the output of each layer is computed from the concatenation of the feature maps of the preceding layers. The result of the i-th layer, y_i, can be written as

$$y_{i} = f\left( \left[ X_{0} ,\, X_{1} ,\, X_{2} ,\, \ldots ,\, X_{i - 1} \right] \right)$$

(2)

To retain useful information across the channels at each layer, the growth rate of the channels must be taken into account. To address this, we introduce channel attention in the dense block. We use two DCAMs at each level as the encoder, combined with the decoder output after the upsampling process through a skip connection. The result is passed serially to the next DCAM, followed by PSCM.
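
The dense connectivity of Eq. (2) can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: a feature map is a list of channels, `toy_layer` is a hypothetical stand-in for the learned function f, and `GROWTH` (the number of channels each layer adds) is an assumed value, not one taken from this article.

```python
# Dense connectivity sketch: layer i sees the concatenation of all earlier
# feature maps along the channel axis. A "feature map" here is a list of
# channels, and each channel is a flat list of floats.
from typing import List

Channel = List[float]
FeatureMap = List[Channel]

GROWTH = 2  # hypothetical number of new channels produced by each layer


def toy_layer(inputs: FeatureMap) -> FeatureMap:
    """Stand-in for f(.): emit GROWTH channels, each a scaled channel mean."""
    n_pixels = len(inputs[0])
    mean = [sum(ch[p] for ch in inputs) / len(inputs) for p in range(n_pixels)]
    return [[(k + 1) * v for v in mean] for k in range(GROWTH)]


def dense_block(x0: FeatureMap, num_layers: int) -> List[FeatureMap]:
    """Return [X_0, X_1, ..., X_num_layers] using dense concatenation."""
    outputs = [x0]
    for _ in range(num_layers):
        # Concatenate every earlier output along the channel axis.
        concatenated = [ch for fmap in outputs for ch in fmap]
        outputs.append(toy_layer(concatenated))
    return outputs


def input_channels_per_layer(x0: FeatureMap, num_layers: int) -> List[int]:
    """Channel count seen by each layer: len(X_0) + i * GROWTH."""
    return [len(x0) + i * GROWTH for i in range(num_layers)]


x0 = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 channels, 2 pixels each
outputs = dense_block(x0, num_layers=3)
print(input_channels_per_layer(x0, 3))
```

The input width grows linearly with depth, which is the growth in channels that the channel attention in DCAM is meant to manage.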

Parallel spatial channel attention module

The channel attention module (CAM) captures the inter-channel relationships of feature maps. Feature maps can be thought of as feature detectors in the context of kolam image identification, and CAM indicates “what” is present in the image. Using average and max pooling together for feature aggregation strengthens the network compared with using either pooling method alone. CAM accomplishes this by extracting the structural properties from the feature map. With this method, two separate feature descriptors, ρ^avg and ρ^max, are generated and fed into a multi-layer perceptron network (MLPN) to produce a channel attention map.

Element wise addition is used to produce the final channel attention feature. This channel attention process can be expressed mathematically as:

$$F_{c} \left( \rho \right) = \sigma \left( {\omega_{1} \left( {\omega_{0} \left( {\rho^{avg} } \right)} \right) + \omega_{1} \left( {\omega_{0} \left( {\rho^{\max } } \right)} \right)} \right)$$

(3)

where σ is the sigmoid function and ω₀ and ω₁ are the weights of the first and second layers of the MLP. The first layer applies a linear transformation followed by a ReLU activation, which introduces a non-linearity on the intermediate values.
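
A minimal sketch of Eq. (3) in plain Python may help make the data flow concrete. It assumes a sigmoid for σ and a two-layer shared MLP; the helper names and the fixed weights `w0` and `w1` are hypothetical and chosen only so the arithmetic is easy to follow, not trained values from KolamNetV2.

```python
# Channel attention sketch following Eq. (3):
# F_c = sigmoid(W1(ReLU(W0(avg))) + W1(ReLU(W0(max)))).
# The weights below are hypothetical fixed numbers, not trained values.
import math
from typing import List

Matrix = List[List[float]]


def matvec(w: Matrix, v: List[float]) -> List[float]:
    """Multiply matrix w (rows x cols) by vector v (cols)."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def mlp(v: List[float], w0: Matrix, w1: Matrix) -> List[float]:
    hidden = [max(0.0, h) for h in matvec(w0, v)]  # linear layer + ReLU
    return matvec(w1, hidden)                      # second linear layer


def channel_attention(fmap: List[Matrix], w0: Matrix, w1: Matrix) -> List[float]:
    """fmap has shape C x H x W; returns one attention weight per channel."""
    flat = [[p for row in ch for p in row] for ch in fmap]
    avg = [sum(ch) / len(ch) for ch in flat]  # average-pooled descriptor
    mx = [max(ch) for ch in flat]             # max-pooled descriptor
    summed = [a + m for a, m in zip(mlp(avg, w0, w1), mlp(mx, w0, w1))]
    return [1.0 / (1.0 + math.exp(-s)) for s in summed]


# Two channels of size 2 x 2; the MLP squeezes 2 -> 1 -> 2.
fmap = [[[0.0, 1.0], [2.0, 3.0]],
        [[4.0, 4.0], [4.0, 4.0]]]
w0 = [[0.5, -0.25]]   # shape 1 x 2
w1 = [[1.0], [-1.0]]  # shape 2 x 1
weights = channel_attention(fmap, w0, w1)
# Rescale each channel by its attention weight.
refined = [[[weights[c] * p for p in row] for row in ch]
           for c, ch in enumerate(fmap)]
```

The same MLP weights are applied to both pooled descriptors, which is why `mlp` is called twice with identical `w0` and `w1`.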

The spatial attention module (SAM) creates a feature that captures the inter-spatial relationships of the network features. The two attention modules, CAM and SAM, work together to capture different traits during the entire process. By computing attention over “where” informative content is located, SAM focuses on the important components of the kolam. Compared to CAM, SAM has a distinct aggregation strategy. Average and max pooling are carried out independently along the channel axis. The resulting features are then concatenated and passed through a standard convolution operation at the final stage. The spatial attention mechanism is defined mathematically as

$$F_{s} \left( \rho \right) = \sigma \left( {\text{conv}}\left( \left[ {\rho_{{{\text{avg}}}}^{s} ;\,\rho_{\max }^{s} } \right] \right) \right)$$

(4)

where ρ^s_avg and ρ^s_max denote the feature maps obtained by average and max pooling along the channel axis, respectively.
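
The steps of Eq. (4) can be sketched in plain Python as follows. The 3 × 3 kernel size, the zero padding, and the kernel weights are assumptions made only to keep the example short; the article does not state the convolution settings, and the function names are hypothetical.

```python
# Spatial attention sketch following Eq. (4): pool over channels, stack the
# two maps, convolve, apply a sigmoid. Kernel size, padding, and weights are
# assumptions for illustration only.
import math
from typing import List, Tuple

Matrix = List[List[float]]


def channel_pool(fmap: List[Matrix]) -> Tuple[Matrix, Matrix]:
    """Return (avg_map, max_map), each H x W, pooled along the channel axis."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg = [[sum(fmap[k][i][j] for k in range(c)) / c for j in range(w)]
           for i in range(h)]
    mx = [[max(fmap[k][i][j] for k in range(c)) for j in range(w)]
          for i in range(h)]
    return avg, mx


def conv2d_same(maps: List[Matrix], kernels: List[Matrix]) -> Matrix:
    """Sum of per-map cross-correlations with zero padding (same output size)."""
    h, w = len(maps[0]), len(maps[0][0])
    k = len(kernels[0])
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for m, ker in zip(maps, kernels):
        for i in range(h):
            for j in range(w):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:
                            out[i][j] += ker[di][dj] * m[y][x]
    return out


def spatial_attention(fmap: List[Matrix], kernels: List[Matrix]) -> Matrix:
    avg, mx = channel_pool(fmap)
    logits = conv2d_same([avg, mx], kernels)  # two-channel input, one output
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in logits]


# A kernel that passes through only the centre pixel of each pooled map.
center = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
fmap = [[[1.0, 0.0], [0.0, 0.0]],
        [[3.0, 0.0], [0.0, 0.0]]]
attention = spatial_attention(fmap, [center, center])
```

With this toy input, the only location with non-zero activations receives an attention value close to 1, while the empty locations stay at the neutral value of the sigmoid.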

In this work, the sequential arrangement of CAM and SAM is replaced by the parallel spatial channel attention module, implemented as in [13]. Figure 6 shows the detailed architecture of KolamNetV2. The dimensions of the different blocks are listed in Table 2.

Table 2 Details of KolamNetV2

Experimental findings and discussion

This section describes the experiments carried out to analyse the performance of KolamNetV2 in classifying the given kolam images into one of the 13 categories. We discuss the performance metrics, experimental protocol, and findings in detail.

Performance metric

The following are the performance metrics used in the evaluation of KolamNetV2:

Precision: The fraction of images assigned to a Kolam class by KolamNetV2 that actually belong to that class. It is calculated as:
$${\text{Precision}} = \frac{{{\text{True Positive}}}}{{{\text{True Positive}} + {\text{False Positive}}}}$$
Recall: The proportion of true positives out of the sum of true positives and false negatives, that is, the fraction of images of a class that are correctly identified:
$${\text{Recall}} = \frac{{\text{True Positive}}}{{{\text{True Positive}} + {\text{False Negative}}}}$$
F1 Score: Summarizes the overall performance of the model using precision and recall. It is defined as the harmonic mean of precision and recall:
$${\text{F1}}\,\,{\text{Score}} = \frac{{{2} \times {\text{Precision}} \times {\text{Recall}}}}{{{\text{Precision}} + {\text{Recall}}}}$$
Accuracy: The fraction of images for which the class label predicted by KolamNetV2 matches the actual class label:
$${\text{Accuracy}} = \frac{{{\text{True Positive}} + {\text{True Negative}}}}{{{\text{True Positive}} + {\text{False Positive}} + {\text{True Negative}} + {\text{False Negative}}}}$$
Confusion Matrix: Useful for analyzing the strengths and weaknesses of a classification model. It compares the predicted label with the actual class label for each sample in the dataset.
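The metrics above can all be derived from a confusion matrix. The sketch below shows one way to do so for a multi-class problem. The one-vs-rest counting, the use of unweighted (macro) averaging, the function names and the 3-class example matrix are assumptions made for illustration; the numbers are not results from the Kolam dataset.

```python
def per_class_metrics(cm):
    """cm[i][j]: number of samples of true class i predicted as class j.
    Returns a list of (precision, recall, f1) per class and the overall accuracy."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, actually another class
        fn = sum(cm[k]) - tp                        # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append((precision, recall, f1))
    # Multi-class accuracy: correctly classified samples over all samples.
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return results, accuracy

def macro_average(results):
    # Unweighted mean of each metric over the classes.
    n = len(results)
    return tuple(sum(r[i] for r in results) / n for i in range(3))

# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted).
example_cm = [[8, 1, 1],
              [2, 6, 2],
              [0, 1, 9]]
results, accuracy = per_class_metrics(example_cm)
macro = macro_average(results)
print(results, accuracy, macro)
```

Note that the macro-averaged F1 score is the mean of the per-class F1 scores, which in general differs from the harmonic mean of the averaged precision and recall.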

Experimental protocol

The images were augmented using a rotation range of 90 degrees, a width shift range of 0.1, a height shift range of 0.1, and horizontal and vertical flips. The images were resized to 224 × 224, and the Adam optimizer was used to update the weights during training. The learning rate was initially set to 0.001 and was reduced to 10% of its value whenever the validation loss did not improve for five epochs. The categorical cross-entropy loss function was used during training, with a batch size of 2. Training was configured for 250 epochs, but an early stopping technique halted it as soon as the network started to overfit, which also avoided unnecessary training. The networks were trained on an NVIDIA GTX 1080 GPU. For the purpose of evaluation, we employed the same baselines for all deep models.
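The learning-rate rule described above can be sketched in a few lines of plain Python. The sketch assumes that "did not improve" means the validation loss failed to fall below its best value so far, and that each reduction multiplies the current rate by 0.1; the function name `reduce_on_plateau` and the loss values are hypothetical and do not come from the paper's training logs.

```python
def reduce_on_plateau(val_losses, initial_lr=0.001, factor=0.1, patience=5):
    """Return the learning rate in effect at each epoch.

    The rate is multiplied by `factor` once the validation loss has failed to
    improve on its best value for `patience` consecutive epochs."""
    lr = initial_lr
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)          # rate used during this epoch
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor         # drop to 10% of the current rate
                wait = 0
    return schedule

# Hypothetical validation losses: improvement stops after the second epoch.
losses = [1.0, 0.9, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95]
schedule = reduce_on_plateau(losses)
print(schedule)
```

In this example the rate stays at 0.001 for the first seven epochs and falls to roughly 0.0001 for the eighth, after five epochs without improvement.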

Performance evaluation of EfficientNet and its offspring on the kolam dataset

We began our experiments on the kolam dataset with EfficientNetB0. Training was terminated once EfficientNetB0's validation loss could not be improved further. Table 3 lists the performance metrics obtained using EfficientNetB0. The highest precision, 0.96, is obtained for the loop motif, while the Tulsi motif category reaches 0.91. The third class, creeper, has the highest recall, at 0.97. The loop kolam motif also attains the highest F1 score, reflecting its high precision and recall. The overall accuracy was 74%.

Table 3 Performance evaluation on efficientNetB0


To improve the accuracy, we tried the other variants of EfficientNet, namely EfficientNetB1, EfficientNetB2, EfficientNetB3 and EfficientNetB4. EfficientNetB1 shows a good precision and recall of 0.95 and 0.98 for the footprint and creeper motifs, respectively. Surprisingly, the Tulsi motif, which achieved high precision with EfficientNetB0, shows a slightly reduced precision of 0.89 when evaluated with EfficientNetB1. The performance of EfficientNetB1 is listed in Table 4. EfficientNetB1 attains a slightly higher accuracy of 78%. The performance of EfficientNetB2 is listed in Table 5. Notably, the Tulsi motif attained a high precision of 0.97, and the creeper kolam achieved a recall of 1.

Table 4 Performance evaluation on efficientNetB1


Table 5 Performance evaluation on efficientNetB2


Next, we analyzed the performance of EfficientNetB3 and found that the footprint motif outperformed all other categories in terms of precision, while the cow motif had the lowest precision score of 0.53. Table 6 shows that the creeper motif has the highest recall score of 0.97, while the loop motif has the best score of 0.83. EfficientNetB3's accuracy of 75% is lower than that of EfficientNetB1 (78%) and EfficientNetB4 (77%).

Table 6 Performance evaluation on efficientNetB3


We then examined EfficientNetB4. According to the results, the loop and kamal motifs have the best precision and recall, respectively (see Table 7). EfficientNetB4's accuracy of 77% is slightly lower than the 78% of EfficientNetB1. Motivated by EfficientNetB4's capacity to handle invariance difficulties with optimal hyperparameters, we chose it as the network to improve, and constructed and analyzed KolamNet based on it.

Table 7 Performance evaluation on EfficientNetB4


Furthermore, we compared the training and validation losses to examine how the various EfficientNet variants learn. The training and validation loss curves of the EfficientNets are displayed in Fig. 7. EfficientNetB0 reaches its minimum validation loss at the 10th epoch. We selected EfficientNetB4 for additional analysis since it demonstrates consistent performance over all 255 epochs.

Performance evaluation of proposed kolamnet and kolamNetV2

We evaluated the proposed KolamNet and KolamNetV2 architectures using the same protocol. Table 8 summarizes KolamNet's performance: the highest precision, 0.94, is obtained for the Tulsi motif, and the highest recall, 0.98, for the creeper motif. The lowest precision, 0.61, and the lowest recall, 0.5, are obtained for the cow and naga motifs, respectively. With KolamNet, the creeper motif yields the highest F1 score, 0.95. Overall, KolamNet attained an accuracy of 77%.

Table 8 Performance evaluation on KolamNet


By inserting dense attention between the EfficientNet and parallel spatial channel attention layers of KolamNet's architecture, we were able to improve its accuracy and create KolamNetV2. Table 9 lists KolamNetV2's performance metrics. It obtains an average F1 score of 0.7646, recall of 0.7784, and precision of 0.7769. Table 10 provides the comparison between KolamNet and KolamNetV2. The average precision, recall, and F1 score are calculated for the state-of-the-art techniques and for KolamNetV2, and KolamNetV2 outperforms the state-of-the-art techniques. To follow the progress over the epochs, we also compared the learning curves, which are shown in Fig. 8. Figure 9 shows the performance of all the deep networks that were analyzed for the Kolam documentation process.

Table 9 Performance evaluation on KolamNetV2


Table 10 Performance comparison of KolamNetV2 with state of the art techniques


Conclusion

KolamNetV2, a deep network design built around dense channel attention and the parallel spatial channel attention module, is presented in this article. Its performance shows how well it can classify the thirteen types of kolam patterns included in the Kolam dataset. Our experiments show that KolamNetV2 outperforms both KolamNet and the conventional EfficientNet models. Notably, integrating the convolution block module into the EfficientNet design improved network performance without adding to the cost of hardware implementation. KolamNetV2's lightweight design makes it well suited to recording and safeguarding the rich history of kolam images, which are an essential component of India's digital legacy. In addition, classifying and disseminating Kolam images globally could have a beneficial impact and create new avenues for future generations to learn about India's classic artistic methods. As this deep architecture develops further, it could play an important role in preserving the art of making kolams, which are particularly significant in many Indian festivals and state-specific festivities. Furthermore, the technique may be used to separate the desired Kolam image from cluttered backgrounds, which presents another interesting application.

Availability of data materials

Dataset can be provided upon reasonable request.

References

https://www.tamilnadutourism.com/culture/kolam.html. Accessed: 30 Nov, 2023.
Kannabiran, G., Reddy, A.V.: Exploring kolam as an ecofeminist computational art practice. In: Proceedings of the 14th Conference on creativity and cognition, pp. 336–349. 2022.
Venkat, I., Robinson, T., Subramanian, K., De Wilde, P.: Generation of kolam-designs based on contextual array p systems. In: Diagrammatic Representation and Inference: 10th International Conference, Diagrams 2018, Edinburgh, UK, June 18–22, 2018, Proceedings 10, pp. 79–86. Springer. 2018.
Sridharan S. Women in hindu temple art: Their auspicious presence and unmarked absence. Religion and the Arts. 2023;27(1–2):157–78.
Narayanan, V.: Matters that matter: Material religion in contemporary hinduism. In: Routledge Handbook of Contemporary India, pp. 329–346. Routledge, ??? (2015)
Murugan, I., Perumal, V., Kamarudin, K.M.: Challenges in the practice of traditional kolam among indian women in the klang valley, malaysia. International Journal on Sustainable Tropical Design Research & Practice 14(1): 2021
Sarin A. The kolam drawing: a point lattice system. Des Issues. 2022;38(3):34–54.
Kucharsky NH, Waring S, Atmaca TM, Beheim S. Limited scope for group coordination in stylistic variations of kolam art. Front Psychol. 2021;12:742577.
Tran N-H, Waring T, Atmaca S, Beheim BA. Entropy trade-offs in artistic design: a case study of tamil kolam. Evol Human Sci. 2021;3:23.
Srinivasan, R.: Scalable hridaya kolam and aishwarya kolam. Journal of Mathematics and the Arts, 1–16. 2023.
Metilda MM, Lalitha D. Generative capacity of kolam patterns using tile pasting rules physics conference series. Bristol: IOP Publishing; 2021.
Krithivasan, K.: A view of india through kolam patterns and their grammatical representation. The Mind of an Engineer, 375–384. 2016.
Surapaneni KM. Enriching anatomy education with the integration of rangoli: nurturing cultural practices in medical education. Med Sci Educ. 2023;33(5):1293–1293.
Naranan, S., Thiruvanmiyur, C. Kolam designs based on fibonacci numbers. 2007.
Ranjazmay Azari M, Bemanian M, Mahdavinejad M, Knippers J. Application-based principles of islamic geometric patterns; state-of-the-art, and future trends in computer science/technologies: a review. Heritage Sci. 2023;11(1):22.
Malishevsky, A.: Applications of fractal analysis in science, technology, and art: A case study on geography of ukraine. In: 2020 IEEE 2nd International Conference on System Analysis & Intelligent Computing (SAIC), pp. 1–6. 2020.
Anbalagan, S., Shoba, Nathan, S., Roomi, M.M.: Kolamnet: An attention based model for kolam classification. In: Proceedings of the Thirteenth Indian Conference on Computer Vision, Graphics and Image Processing, pp. 1–6. 2022.
Liu E. Research on image recognition of intangible cultural heritage based on cnn and wireless network. EURASIP J Wirel Commun Netw. 2020;2020:1–12.
Belhi A, Bouras A, Al-Ali AK, Foufou S. A machine learning framework for enhancing digital experiences in cultural heritage. J Enterp Inf Manag. 2023;36(3):734–46.
Cao J, Jia Y, Chen H, Yan M, Chen Z. Ancient mural classification methods based on a multichannel separable network. Heritage Sci. 2021;9(1):1–17.
Obeso, A.M., Vázquez, M.S.G., Acosta, A.A.R., Benois-Pineau, J.: Connoisseur: classification of styles of mexican architectural heritage with deep learning and visual attention prediction. In: Proceedings of the 15th International Workshop on Content-based Multimedia Indexing, pp. 1–7. 2017.
Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–3137. 2015.
Prasomphan, S.: Toward fine-grained image retrieval with adaptive deep learning for cultural heritage image. Computer Syst Sci Eng. 44(2) (2023)
Wei, X., Zhang, T., Li, Y., Zhang, Y., Wu, F.: Multi-modality cross attention network for image and sentence matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10941–10950. 2020.
Keisham N, Neelima A. Efficient content-based image retrieval using deep search and rescue algorithm. Soft Comput. 2022;26(4):1597–616.
Das, A., Dhal, K.G., Ray, S., Gálvez, J.: Histogram-based fast and robust image clustering using stochastic fractal search and morphological reconstruction. Neural Computing and Applications, 1–24 (2022)
Tan, M., Le, Q.: Efficientnet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114 (2019). PMLR
http://www.metmuseum.org/press/news/2017. Accessed: Aug, 2023
http://www.digitalhampi.in/. Accessed: Sep, 2023
Tadvalkar, N. A language of symbols: Rangoli art of india. Traditional Knowledge and Traditional Cultural Expressions of South Asia. Edited by Sanjay Garg. Colombo: SAARC Cultural Centre. 173–86. 2015
Nathan, S., Kansal, P.: Skeletonnet: Shape pixel to skeleton pixel. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 0–0. 2019.


Funding

Not applicable.

Author information

Sabarinathan, S. Shoba, S. Mohamed Mansoor Roomi and P. Prakash have contributed equally to this work.

Authors and Affiliations

Centre for Advanced Data Science, Vellore Institute of Technology, Chennai, TamilNadu, India
A. Sasithradevi & S. Shoba
Couger Inc, Shibuya-ku, Tokyo, Japan
Sabarinathan
Department of Electronics and Communication Engineering, Thiagarajar College of Engineering, Madurai, TamilNadu, India
S. Mohamed Mansoor Roomi
Department of Electronics Engineering, MIT, Anna University, Chennai, Tamil Nadu, India
P. Prakash


Contributions

SA: manuscript preparation and ideation. S: experimentation. SS: tabulation of the results. SMMR: figures and explanation. PP: data collection and proofreading.

Corresponding author

Correspondence to A. Sasithradevi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Sasithradevi, A., Sabarinathan, Shoba, S. et al. KolamNetV2: efficient attention-based deep learning network for tamil heritage art-kolam classification. Herit Sci 12, 60 (2024). https://doi.org/10.1186/s40494-024-01167-8


Received: 17 November 2023
Accepted: 02 February 2024
Published: 19 February 2024
DOI: https://doi.org/10.1186/s40494-024-01167-8
