Fast adaptive multimodal feature registration (FAMFR): an effective high-resolution point clouds registration workflow for cultural heritage interiors

Accurate registration of 3D scans is crucial in creating precise and detailed 3D models for various applications in cultural heritage. The dataset used in this study comprised numerous point clouds collected from different rooms in the Museum of King Jan III’s Palace in Warsaw using a structured light scanner. Point clouds from three relatively small rooms at Wilanów Palace (The King’s Chinese Cabinet, The King’s Wardrobe, and The Queen’s Antecabinet) exhibit intricate geometric and decorative surfaces with diverse colour and reflective properties. As a result, creating a high-resolution full 3D model requires a complex and time-consuming registration process. This process often consists of several steps: data preparation, registering point clouds, final relaxation, and evaluation of the resulting model. Registering two point clouds is the most fundamental part of this process; therefore, an effective registration workflow capable of precisely registering two point clouds representing various cultural heritage interiors is proposed in this paper. The Fast Adaptive Multimodal Feature Registration (FAMFR) workflow is based on two different handcrafted features, utilising the colour and shape of the object to accurately register point clouds that either have extensive surface geometry details or are geometrically deficient but rich in colour decorations. Furthermore, this work emphasises the challenges associated with high-resolution point cloud registration, providing an overview of various registration techniques ranging from classic feature-based approaches to new ones based on deep learning. A comparison shows that the algorithm explicitly created for this data achieved results at least 35% better than traditional feature-based or deep learning methods.


Introduction
The cultural heritage (CH) preservation field is currently experiencing a growing demand for high-resolution and high-quality 3D data in the form of point clouds. Precision 3D scanning is essential for accurately documenting the present state of an object's preservation [1]. One can infer the CH object's condition by analysing the acquired scan data. By conducting repeated scans over time, it is possible to track and document changes in the object's state of preservation [2,3]. Additionally, CH objects are exposed to different factors that cause their deterioration and degradation over time due to human activities and environmental factors. To protect and maintain cultural heritage objects and sites, it is essential to conduct architectural documentation, which involves creating 3D point clouds, 3D models, orthoimages, and vector drawings [4]. Such documentation enables a comprehensive and detailed understanding of the object or site and is crucial for preservation and restoration efforts. By generating accurate and precise digital representations of CH objects, documentation provides a valuable resource for research, education, and public awareness [5][6][7]. While many methods are available for digitising CH objects, such as 3D scanning or close-range photogrammetry, high-quality 3D point clouds are often necessary to record the intricate details of these objects accurately [8][9][10].
However, registering multiple 3D point clouds to create a complete digital representation of cultural heritage objects can be challenging, requiring a highly skilled operator and specialised tools. There are multiple registration methods available, each suited to different types of point cloud data acquired from various sources, such as 3D scanners [11], close-range photogrammetry [12], LiDAR (Light Detection and Ranging [13]), and structured light (SL [14]). These different point cloud data sources may have varying characteristics, such as point density, noise, and accuracy. In addition to the scanning technique, the different cultural heritage objects can also exhibit variations in surface parameters such as geometric details and colour. To handle the diversity in point cloud characteristics arising from different scanning techniques and surface properties, specific registration methods must be employed to ensure the accurate alignment of the 3D data. Therefore, it is crucial to evaluate the available registration methods [15] and select the one that best suits the specific needs of the particular CH documentation project. Salient point-based methods have mainly focused on searching for key points and calculating descriptors to match different point clouds of the same objects [16]. Neural network-based methods [17] have emerged as a promising alternative, but their effectiveness has primarily been demonstrated in industry, where objects are more uniform, homogenous, and repetitive [18,19]. Despite their effectiveness in industrial settings, these methods are not always suitable for registering CH objects, because each object is unique, which manifests as differences in geometry, surface, or colour. Additionally, the learning process for these methods can be lengthy, and training sets for cultural heritage objects are often not readily available, making their use impractical in many cases. Moreover, due to the uniqueness of their surfaces, it is challenging to prepare a proper training set which could be adapted to supervised learning.
To conclude, the previously mentioned point cloud registration methods can be divided into two main categories: pairwise and multiview registration [20]. The selection of the appropriate method depends on the number of point clouds to be processed. Utilising multiview registration methods requires determining approximate point cloud orientation parameters and knowing the order of the processed point clouds. For this reason, the quality of determining the relative orientation between the point clouds affects the accuracy of the final bundle adjustment.
Data in this study consist of a substantial number of point clouds, captured by a structured light scanner, from various rooms in the Museum of King Jan III's Palace in Warsaw. Those measurements produced high-resolution architectural documentation in the form of point clouds with a resolution of 100 points per square millimetre. This high resolution was necessary to analyse micro-scale degradation and shape changes accurately. Additionally, these requirements and resolutions were driven by the needs of the museum's conservators. The structured light (SL) method was chosen because it can accurately map the object's shape and colour. The surfaces represented by the point clouds used in this study are geometrically rich and decorated through surface colour and different reflective properties. As a result, multiple registration methods with varying parameters would be necessary, requiring extensive and time-consuming work by skilled operators to align all point clouds into a single 3D model, totalling approximately 14 billion measurement points. Given these considerations, there is an apparent necessity to develop an efficient and fast workflow for registering point clouds in this case.
This study aimed to propose a novel pairwise workflow, named Fast Adaptive Multimodal Feature Registration (FAMFR), for highly accurate point cloud registration of CH interiors. The proposed FAMFR workflow is based on two steps: (1) initial point cloud registration relying on local geometry and colour at each point of the point cloud and (2) final pairwise registration. To detect tie points (initial registration), two approaches were developed: one based on the histograms of RGB intensity gradients and the other based on the relation between normal vectors in a local neighbourhood, similar to the point pair feature (PPF [16]). The final pairwise registration is completed using a modified version of the commonly used ICP (Iterative Closest Point [21,22]) method, in which corresponding points are selected using a texture/colour similarity metric. The obtained pairwise registration results are used, in a further step, by the final bundle adjustment of the multiple point clouds.
The research follows a structured approach with a literature review of Related works, followed by Materials which describe the datasets used in the study. The Methods section then presents the algorithms, schemes, and overall workflow used to register point clouds of CH objects. The Results and Discussion sections present and comment on a comparative analysis of various state-of-the-art methods and registration outcomes. The paper concludes with a summary of the findings, critical analysis, and future directions for similar research.

Related works
The increasing availability of various sensors and methods for CH documentation, namely: ultra-light Unmanned Aerial Vehicles (UAVs) [23], devices such as laser scanners or triangulation scanners [24], mobile phones with built-in LiDAR sensors [25], and close-range photogrammetry software [26] facilitated the generation of point clouds with much wider dissemination than in previous decades. This has led to more documentation projects in the cultural heritage domain as the ease of obtaining point clouds has improved. Non-professionals can contribute to CH documentation, preservation, and restoration efforts by capturing and sharing high-quality point clouds of heritage sites. This has enabled a more comprehensive range of people to engage in the process of digitally documenting and preserving CH. As a result, it has become easier to identify, study, and restore important cultural heritage sites, leading to a greater understanding and appreciation of our shared history and culture.
3D point clouds have become an essential data source for digitisation in the CH field. They are widely used for tasks such as Historic Building Information Modelling (HBIM), Structural Health Monitoring (SHM), damage detection, documentation, and virtual restoration [27]. Point clouds can accurately represent the geometry and colour of the object's surface, making them a valuable tool for documenting and preserving CH objects. Point clouds are employed for a wide range of tasks, including segmentation [28], classification [29], 3D documentation [30], and modelling applications [31]. For most cultural heritage objects' 3D documentation, an additional task, such as registration, is required to produce a complete representation. This is especially true for objects that require multiple point clouds to depict a complete structure, terrain, or CH complex surface. Obtaining data from a single 3D scanner position is impractical for large CH objects and sites, and multiple point clouds need to be registered into one 3D representation. This is done by aligning multiple point clouds into a common coordinate system, that is, by finding and applying the best 3D transformation between them; see Fig. 1.
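As a minimal illustration of what applying a 3D transformation to a point cloud means, the sketch below moves a list of points with a rigid transformation (a rotation matrix and a translation vector). The function name `apply_rigid_transform` and the example values are illustrative assumptions and are not taken from the FAMFR implementation.

```python
import math


def apply_rigid_transform(points, rotation, translation):
    """Return R * p + t for every 3D point p given as an (x, y, z) tuple."""
    transformed = []
    for p in points:
        transformed.append(tuple(
            sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
            for r in range(3)
        ))
    return transformed


# Example: rotate by 90 degrees about the z axis, then shift by 1 along x.
angle = math.pi / 2
R = [[math.cos(angle), -math.sin(angle), 0.0],
     [math.sin(angle), math.cos(angle), 0.0],
     [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]
moved = apply_rigid_transform([(1.0, 0.0, 0.0), (0.0, 1.0, 2.0)], R, t)
```

Registration is the inverse problem: the rotation and translation are unknown and have to be estimated from the two overlapping point clouds.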

Fig. 1 Visualisation of the point clouds registration process
Consequently, numerous registration methods have been developed to address this issue. Some examples include registering TLS (Terrestrial Laser Scanning) point clouds using two ICP variations [32], constant radius features for large scale and detailed CH object registration [33], automatic registration of overlapping point clouds using external information acquired from corresponding images [34] or employing feature-based methods [35]. These workflows are designed to register and combine multiple point clouds into a comprehensive and accurate representation of the object or site of interest. Usually, the registration process is divided into two stages. The first involves an initial matching step, where point clouds are roughly aligned. The second stage utilises the iterative closest point algorithm for fine matching, which can accurately tune the initial 3D transformation between the point clouds (Fig. 2). There are many variants of the ICP algorithm, and they are widely used in point cloud alignment workflows. The ICP algorithm depends on a sufficiently accurate initial matching.
One of the main challenges in point cloud registration for CH objects is the vast differences in shape, texture, colour, rich decorations, and varying state of preservation. These objects were created in various historical periods and stored under different conditions. Additionally, occlusions and measurement noise are unavoidable in point clouds and can affect the final model. Varying lighting conditions and partially reflective and transparent surfaces can lead to some colour and geometry reconstruction imperfections, adding errors during point cloud registration workflows [36].
Point cloud registration techniques can be categorised into three main groups: feature-based (hand-crafted), iterative, and deep learning. Each category can be further classified based on whether they use geometric information, colour information, or both (see Fig. 3).
Choosing the correct registration method is crucial to handle differences between point clouds because geometry-based methods may only be suitable for scans with extensive surface geometry details. In contrast, clouds that lack geometric details necessitate the analysis of texture/colour information for precise registration. Therefore, the object's colour and shape must be considered. Overlapping regions in data can also pose challenges, especially when the overlapping region of processed point clouds is too small for the used method to handle effectively. Finally, a photogrammetric network can achieve good registration results based on the marked control points [37]. However, that method is not suitable for CH objects because, in most cases, it is impossible to distribute those points on the object's surface. The solution to this disadvantage may be using a feature-based registration approach based on automatically detected key points, also known as local features.
The registration workflow based on local features is a multi-step process that involves key point detection [38], descriptor calculation [39], correspondence calculation based on the descriptor matching approach and geometrical verification that allows obtaining reliable tie points [40]. The feature-based methods extract 3D key points from the object's surface, that is, reliable and stable points suited to effective description and matching. There are many algorithms for detecting these features, and among the most commonly used we can include SIFT (Scale-invariant feature transform [41,42]), SURF (Speeded up robust features [43,44]), ISS (Intrinsic Shape Signatures [45]), and Harris3D [46]. Based on local or global neighbourhoods, the point descriptors assign characteristic values (changes in grey degree gradients, colour or shape) to the detected key point. The algorithms used to calculate descriptors vary in execution time and accuracy, and their effectiveness depends on the character of input 3D data. Some rely on the geometry of the object's surface, some on colour, and some on both features simultaneously [47]. The popular descriptors, namely: Fast Point Feature Histograms (FPFH) [48], Spin Images [49], and Signature of Histograms of OrienTations (SHOT) [50], are used for describing the local features and further used for finding corresponding points in the matching step. Using the key points and descriptors, the point clouds can be initially aligned with a 6-degree-of-freedom transformation found in a RANSAC-based [51] correspondence search. After that step, the ICP algorithm allows for achieving fine registration results.
Another group of methods used in point cloud registration are those based on deep learning, and with their development, accuracy and efficiency in this area have improved [52]. Furthermore, there is a rising trend in sharing publicly available datasets designed for machine learning applications in the CH field [53,54]. One of the significant advancements in deep learning point cloud processing is PointNet [55] because it provides a unified architecture that can directly take point clouds as input. The basic architecture of PointNet is straightforward, where each point is processed independently and identically in the initial stages. The point is represented by its three coordinates (x, y, z), and other layers can be added by computing normal vectors or additional features. PointNetLK [56] adapts the Lucas and Kanade (LK) algorithm to work with PointNet and unrolls it into a single recurrent deep neural network. This allows for global feature alignment and demonstrates strong generalisation across shape categories while maintaining computational efficiency, but it is not robust to noise. In [57], the authors propose DeepGMR, a registration method that combines a Gaussian Mixture Model (GMM) with neural networks to extract pose-invariant correspondences between raw point clouds and GMM parameters. This method estimates the correspondence between all points and all components in the latent GMM. It performs well across challenging scenarios, such as noise and unseen geometry. The DCP (Deep Closest Point [58]) is an algorithm that utilises a DGCNN (Dynamic Graph Convolutional Neural Network [59]) network to learn correspondences and a differentiable Singular Value Decomposition (SVD) method for registration. It encodes point clouds into a high-dimensional space using PointNet or DGCNN and uses an attention-based module to capture contextual information. The method employs a differentiable SVD layer to estimate the alignment. The DCP has shown promising results on shapes not encountered during training. However, this method assumes an exact one-to-one correspondence between the two point cloud distributions, which may not always hold in real-world scenarios where point clouds may be subject to outliers and other uncertainties, and its performance is hindered by noise. Many state-of-the-art deep learning registration methods rely solely on geometry information, neglecting texture information. However, some exceptions exist where these methods rely on intermediate media such as RGBD images, projection images, or depth maps [60][61][62]. Since deep learning methods typically process only relative positions of points, they lack colour information, which limits their applications. Texture information enables humans to distinguish different parts of a scene. In the context of CH, objects of interest often feature intricate details and rich ornamentation that may have different colours and textures. Therefore, incorporating colour information in point cloud registration can produce more reliable and accurate results. In addition, deep learning methods for CH point cloud registration face certain limitations, such as the requirement for substantial training data and the possibility of overfitting. Furthermore, the current point descriptors based on deep learning are often considered black boxes, lacking a clear understanding of how the original points are processed to generate the final descriptor.

Cultural heritage site
The Wilanów Palace is the only Baroque royal residence in Poland. Construction of this summer residence began at the end of the 17th century, and the building has been repeatedly expanded and modernised since. The palace's two-storey building hides many relatively small rooms but is characterised by a rich and varied interior design. This makes visiting the residence attractive, and another surprise awaits the visitor behind every corner of the palace corridor. However, this situation also creates severe challenges for the museum. Creating a program of precise three-dimensional documentation of selected interiors is one attempt to deal with these problems. The complex interior layout of the palace complicates providing visitors with access to all rooms. Due to conservation restrictions and security considerations, some rooms can only be accessed by tourists by looking inside through open doors, and sometimes even this form of access is impossible. Creating and providing high-quality digital documentation ensures these magnificent interiors can function in the public domain. Another problem arises from the fact that the Palace, built as a summer residence and characterised by thin walls, now functions as an all-year museum in the harsh conditions of the Polish climate. The inability to lay adequate thermal insulation makes it a significant challenge to ensure appropriate environmental conditions in the palace's interiors at different times of the year. Monitoring the state of preservation of the wall paintings, wooden polychrome ornaments, and other elements of the interior design is an easier task when one has precise spatial documentation that gives the possibility of verifying even minor changes.
The King's Chinese Cabinet and the King's Wardrobe are two rooms in the southern part of the palace used by the King. The third of the rooms that are the subject of this article is the Queen's Antecabinet, located on the other, northern side of the palace's central axis, in the part used by the Queen. All three rooms have similar dimensions of 4 [m] by 4 [m] and a height of 5 [m] (Fig. 4).
The current decoration of the King's Chinese Cabinet (see Fig. 5) is the work of the workshop of Martin Schnell, who was court lacquerer and painter to King August II the Strong. The wall decoration, created around 1730, is in the form of polychrome wooden panels painted with lacquer and then covered with tiny pieces of silvered copper, which gives the decoration its characteristic glare. The ceiling is covered with a wall painting that relates in theme and colour to the wall decoration but is characterised by much less glare.
From the King's Chinese Cabinet, it is possible to enter the King's Wardrobe (Fig. 6), which has ceiling decoration dating back to the time of King John III and wall decoration made after 1730. Here, too, the walls are covered with wooden panels, into which paintings were inserted; created by a group of Saxon artists, they emphasise in their subject matter the connection between the interior of the palace and the surrounding nature of the gardens. The decoration of this room is dominated by light colours combined with a large number of gilded surfaces.

Fig. 4 The floor plan of the palace interiors. Red squares mark the rooms used in this study [63]
Fig. 5 The King's Chinese Cabinet [64]
The decoration of the Queen's Antecabinet (see Fig. 7) is dated to 1732, when the Wilanów Palace was used by King August II the Strong. The illusionistic plafond painting in this room is the work of Jules Poison. The walls are decorated with scenes alluding to Greek mythology, as described by Ovid in his work "Metamorphoses". Thus, we are dealing with wooden wall coverings and inserted panels with oil painting.

Acquisition system
Single 3D point clouds were captured by a custom measurement system designed for the interior acquisition campaign [67], see Fig. 8. The system was designed to emphasise partial acquisition automation [68,69], thus achieving high-quality measurement data regarding resolution, accuracy, and colour. The high resolution was crucial for analysing cracks, scratches, and other imperfections in specific object parts.
The structured light 3D scanner has a specially designed LED projector and two detectors for shape and colour acquisition.The digital projector has a native resolution of 1280 × 800 pixels and is used for the projection of the fringes onto the object's surface.
The spectral properties of the custom LED light sources have been reviewed and approved by the Conservation Department of the Museum. To ensure wider coverage of the surface being measured, two Point Grey colour cameras with a resolution of 9 megapixels each are mounted on the left and right sides of the projector. The 3D scanner was mounted on an industrial robot arm to support partial automation of measurements. The colour was captured using six images with different directions of light illumination to remove the specular component. Two illuminators were mounted on the measurement head and four on the robot base. The robot arm was mounted on a vertical lift, capable of reaching up to 5.5 [m]. The system was based on a stabilised platform with trolley wheels for easy movement.

Data
The number of point clouds captured during a single room 3D digitisation is massive (around five thousand for each room). Each point cloud contains approximately 7.5 million points (see Fig. 9). The geometrical uncertainty of the point clouds is lower than 0.05 [mm], with an average point-to-point distance of 0.1 [mm]. Every point in the point cloud is represented by its 3D coordinates, normal vector, and calibrated colour values (R, G, B).
The dataset used in this paper is a subset of the captured point clouds, and it has been divided into four subcategories (Fig. 10). The first one comprises point clouds with highly detailed geometry that accurately represents the surface shape (Fig. 10c). The second one consists of planar point clouds characterised by various colours, primarily representing paintings and artworks (Fig. 10d). The third subcategory combines the previous two characteristics, featuring decorative paintings on curved ceilings (Fig. 10a), and the final, most challenging subcategory comprises a room fragment containing numerous gilded and shiny decorations (Fig. 10b). However, it should be emphasised that high levels of measurement noise distinguish point clouds belonging

Method
Figure 12 presents an overview of the FAMFR workflow.
To obtain an accurate final model, several algorithms were proposed and developed. Combining them enabled the accurate registration of the point clouds.

Preprocessing
To speed up the registration process without decreasing accuracy and to reduce storage requirements, point clouds were uniformly sampled by a factor $N_{sim}$. This factor is equal to the ratio of the number of points before sampling to the number of points after sampling. After sampling, an average point-to-point distance $D_{avg}$ is calculated. It will be used as a parameter for subsequent algorithms. Next, two metrics were calculated for each point cloud, stable and transformed ($P_s$ and $P_t$): the vector of point pair features $V_s$ and the gradient of RGB intensities $I_g$. The first is calculated as follows: for each point $p$, a neighbourhood sphere with a radius equal to $R_n \cdot D_{avg}$ is created in the cloud around the point of interest. Then, three angular values are calculated: $V_s(\alpha, \beta, \gamma)$. The vector is described by the distance between the neighbouring points $p_n$ around $p$ and the relative angles of the normal vectors associated with those points, $n$ and $n_n$, according to Fig. 13 and formula 1.
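The neighbourhood search and the angular values can be sketched as follows. Because formula 1 is not reproduced in this text, the definitions of the three angles are an assumption borrowed from the usual point pair feature convention: $\alpha$ is taken between $n$ and the connecting vector $d = p_n - p$, $\beta$ between $n_n$ and $d$, and $\gamma$ between the two normals. The function names are hypothetical, and how the per-pair angles are aggregated into $V_s$ is not shown.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))


def pair_angles(p, n, p_n, n_n):
    """Assumed (alpha, beta, gamma) for the pair (p, p_n) with normals (n, n_n)."""
    d = tuple(b - a for a, b in zip(p, p_n))
    return angle_between(n, d), angle_between(n_n, d), angle_between(n, n_n)


def neighbours_in_sphere(cloud, p, r_n, d_avg):
    """Points of the cloud, other than p, within the radius r_n * d_avg of p."""
    radius = r_n * d_avg
    return [q for q in cloud if q != p and math.dist(p, q) <= radius]


# Example with made-up values: one pair of points and a tiny cloud.
alpha, beta, gamma = pair_angles((0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                                 (1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
near = neighbours_in_sphere([(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
                            (0.0, 0.0, 0.0), r_n=2, d_avg=0.7)
```

The brute-force neighbour search is only for readability; for clouds with hundreds of thousands of points a spatial index such as a k-d tree would be the natural choice.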
The intensity gradient $I_g$ at a given point $p$ is a vector orthogonal to the normal vector $n$ with the direction of the maximum gradient in the local intensity. The magnitude of the vector indicates the strength of this gradient. The intensity of a point is expressed as its luminosity $L$, computed from the colour values (formula 2):

$$L = 0.299\,R + 0.587\,G + 0.114\,B \qquad (2)$$

The coefficients used in the formula (0.299, 0.587, 0.114) are based on the luminosity function, a mathematical model of the human eye's sensitivity to different wavelengths of light [70]. An average coordinate $p_{avg}$ is calculated from all the points $p_n$ inside the sphere. The next step is determining the average luminosity $L_{avg}$ of all neighbouring points $p_n$. Then, the luminosity difference $L_d$ is calculated between the average value $L_{avg}$ and $L_n$ for each point $p_n$ (see formula 3).
Finally, each of the neighbouring points $p_n$ is modified according to the $p_{avg}$ coordinates using formula 4.
The matrix $A$ and vector $b$ are created according to the sum of neighbouring points coordinates; see formula 5.
Once constructed, the matrix undergoes QR decomposition using the Householder method [71]. As a result of decomposition, the vector $x$ is obtained. The intensity gradient $I_g$ is formed from three values $I_g(L_x, L_y, L_z)$ according to formula 6,
where Identity denotes the identity matrix and $n$ the normal vector. A visualisation of those two metrics describing the colour and shape of the point cloud is presented in Fig. 14 in the form of vector magnitudes.
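The gradient computation can be sketched as a small least-squares fit. Since formulas 3 to 6 are not reproduced in this text, several details are assumptions: the luminosity difference is taken as $L_d = L_n - L_{avg}$, the system $A x = b$ is built as the normal equations of a linear luminosity model over the centred neighbours, and the final step is taken to be the tangent-plane projection $(\mathrm{Identity} - n n^T)\,x$. Cramer's rule stands in for the Householder QR decomposition named in the text, purely to keep the example dependency-free.

```python
def luminosity(rgb):
    """Weighted sum of R, G, B with the coefficients quoted in the text."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def intensity_gradient(neighbours, colours, normal):
    """Assumed reconstruction of I_g from neighbour positions and colours."""
    count = len(neighbours)
    p_avg = [sum(p[i] for p in neighbours) / count for i in range(3)]
    lum = [luminosity(c) for c in colours]
    l_avg = sum(lum) / count
    # Normal equations A x = b of the fit  L_n - L_avg ~ x . (p_n - p_avg).
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for p, l_n in zip(neighbours, lum):
        q = [p[i] - p_avg[i] for i in range(3)]
        l_d = l_n - l_avg
        for i in range(3):
            b[i] += l_d * q[i]
            for j in range(3):
                A[i][j] += q[i] * q[j]
    det_a = det3(A)  # zero for a perfectly planar neighbourhood
    x = []
    for k in range(3):
        # Cramer's rule: replace column k of A with b.
        A_k = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        x.append(det3(A_k) / det_a)
    # Remove the component along the normal: (Identity - n n^T) x.
    n_dot_x = sum(normal[i] * x[i] for i in range(3))
    return [x[i] - normal[i] * n_dot_x for i in range(3)]


# Example: grey-valued neighbours whose luminosity follows 10 + 2x + 3z.
pts = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
cols = [(10 + 2 * x + 3 * z,) * 3 for x, y, z in pts]
grad = intensity_gradient(pts, cols, normal=(0.0, 0.0, 1.0))
```

In the example the fitted vector is approximately (2, 0, 3), and the projection removes the component along the normal, leaving approximately (2, 0, 0). The example deliberately uses a non-planar neighbourhood; on a flat patch the 3×3 system is singular and a rank-aware solver is needed.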

Key points evaluation
The two vectors mentioned in the previous subsection allow identifying key points for the rough registration of point clouds. At first, from the $P_s$ and $P_t$ point clouds, two individual subsets, respectively $P_{ss}$ and $P_{ts}$, of evenly distributed points with equal distance $K_d$ between them are selected. Next, key points are filtered based on the previously calculated $|V_s|$ and $|V_g|$ values. Subsets of points $P_{ss}$ and $P_{ts}$ are filtered to retain only those whose value is greater than the threshold $T_h$ of the maximum vector magnitude in the entire point cloud ($\max(|V_s|)$ or $\max(|V_g|)$). As a result, the key points are classified into two distinct categories based on their potential for the registration process. The first category involves points used for registration utilising the point pair feature vector. The second category consists of points intended for registration using intensity gradients. The whole process is presented in Fig. 15.
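The filtering step can be sketched as below, under the assumption that $T_h$ is a fraction between 0 and 1 applied to the maximum magnitude found in the whole cloud. The function name and all sample values are hypothetical.

```python
def filter_key_points(candidate_indices, cloud_magnitudes, t_h):
    """Keep candidates whose magnitude exceeds t_h times the cloud-wide maximum."""
    limit = t_h * max(cloud_magnitudes)
    return [i for i in candidate_indices if cloud_magnitudes[i] > limit]


# Made-up magnitudes for a five-point cloud and an evenly spaced subset of it.
shape_mag = [0.1, 0.9, 0.4, 1.0, 0.05]
gradient_mag = [0.8, 0.1, 0.2, 0.05, 1.0]
subset = [0, 1, 2, 4]

shape_keys = filter_key_points(subset, shape_mag, t_h=0.5)
gradient_keys = filter_key_points(subset, gradient_mag, t_h=0.5)
```

Running the same filter once per metric yields the two categories of key points: those suited to shape-based matching and those suited to gradient-based matching.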

Feature histogram
A histogram of angular values ($\theta$) in a given neighbourhood is determined for each selected point from the $P_{ss}$ and $P_{ts}$ subsets independently. The feature is generated from the neighbourhood by calculating the angles between the feature vector ($V_s$ or $V_g$) at each point $p_s$ and the vector formed by the characteristic point $p_s$ and its neighbouring points $p_{sn}$. This neighbourhood is formed as a sphere with user-defined radius parameters, set separately for shape, $H_s$, and gradient, $H_g$. These radii are expressed as multiples of the $D_{avg}$ value. The procedure is illustrated in Fig. 16 and formula 7.
All calculated angles are assigned to the corresponding bin in the histogram, which is pre-divided into a fixed number of cells $H_b$. If the calculated angle for a specific neighbouring point $p_{sn}$ is within the given cell boundaries, then that cell is increased by the value of the gradient divided by the distance between the neighbouring point $p_{sn}$ and the key point $p_s$ ($d_s = |p_{sn} - p_s|$). Finally, after all the angles are allocated to their respective cells, the histogram is normalised using the number of neighbouring points. Figure 17 presents key point examples with histograms.
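A minimal sketch of the histogram construction follows. It assumes that the angles span 0 to $\pi$, that the cells have equal width, and that the value added to a cell is the feature magnitude stored at the neighbouring point; none of these details is confirmed by the text, and the function name is hypothetical.

```python
import math


def feature_histogram(p_s, feature_vec, neighbours, magnitudes, h_b=18):
    """Distance-weighted histogram of angles between feature_vec and p_sn - p_s."""
    hist = [0.0] * h_b
    bin_width = math.pi / h_b
    f_norm = math.sqrt(sum(c * c for c in feature_vec))
    for p_sn, mag in zip(neighbours, magnitudes):
        d = [b - a for a, b in zip(p_s, p_sn)]
        d_s = math.sqrt(sum(c * c for c in d))
        cos_theta = sum(f * c for f, c in zip(feature_vec, d)) / (f_norm * d_s)
        theta = math.acos(max(-1.0, min(1.0, cos_theta)))
        # The last cell also collects theta == pi.
        index = min(int(theta / bin_width), h_b - 1)
        hist[index] += mag / d_s
    # Normalise by the number of neighbouring points.
    return [value / len(neighbours) for value in hist]


# Example: one neighbour along the feature vector, one directly opposite.
hist = feature_histogram((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                         [(2.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], [4.0, 3.0])
```

Dividing by the distance $d_s$ gives neighbours close to the key point more influence than those near the edge of the sphere.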

Correspondence evaluation
The obtained histograms are used for identifying corresponding key point pairs between two point clouds via the similarity coefficient $S$. The similarity coefficient is defined by formula 8, where $f_{p_s}(i)$ and $f_{p_t}(i)$ are the values of the histogram bins from the stable $p_s$ and transformed $p_t$ key points.
A lower similarity coefficient score indicates a better match between the points. To obtain an initial set of matched point pairs, a fixed number of pairs with the lowest similarity values is selected. A user-defined parameter $N_p$ determines the number of selected point pairs.
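The selection of the $N_p$ best pairs can be sketched as follows. Formula 8 is not reproduced in this text, so the sum of absolute bin differences is used here purely as a stand-in for the similarity coefficient $S$; any measure for which lower means more similar would slot into the same place. Function names and histogram values are hypothetical.

```python
def similarity(hist_s, hist_t):
    """Stand-in similarity coefficient: lower means more alike (assumed form)."""
    return sum(abs(a - b) for a, b in zip(hist_s, hist_t))


def best_pairs(hists_s, hists_t, n_p):
    """Indices (i, j) of the n_p key point pairs with the lowest scores."""
    scored = [(similarity(h_s, h_t), i, j)
              for i, h_s in enumerate(hists_s)
              for j, h_t in enumerate(hists_t)]
    scored.sort()
    return [(i, j) for _, i, j in scored[:n_p]]


# Two stable and two transformed key points with two-bin histograms.
stable = [[1.0, 0.0], [0.0, 1.0]]
moving = [[0.0, 1.0], [0.9, 0.1]]
pairs = best_pairs(stable, moving, n_p=2)
```

The exhaustive comparison is quadratic in the number of key points, which is acceptable only because the key point subsets are small compared with the full clouds.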
A rigid transformation between point clouds requires three different, non-collinear point pairs. From the $N_p$ set, all possible triplets are formed. To remove correspondence outliers, the spatial consistency of the triplets is analysed. Each of those triplets goes through a filtering process based on the user-defined triangle similarity parameter $T_s$. It is a geometric concept describing the relationship between two triangles of similar shapes and sizes. For each formed triplet, a 3D transformation is computed and applied to the $P_t$ point cloud, and the matching quality is evaluated. The registration quality depends on two error metrics: correspondences recall and the similarity between gradient or shape vectors. The first value eliminates incorrect transformations that lead to a small overlapping region after registering two point clouds using said transformation. A specific number of final triplets are selected based on the user-defined parameter $N_T$ and the lowest possible combined similarity error $C_s$.
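The triplet test and the per-triplet transformation can be sketched as below. Two assumptions are made: $T_s$ is interpreted as the largest allowed relative difference between matching triangle sides, and the transformation is obtained by attaching an orthonormal frame to each triangle, which is exact only when the two triangles are congruent. The text does not state how the per-triplet transformation is actually computed, so this is one possible construction with hypothetical function names.

```python
import math
from itertools import combinations


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def unit(a):
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a]


def triangles_similar(tri_s, tri_t, t_s):
    """True when every pair of matching sides differs by at most the fraction t_s."""
    for a, b in combinations(range(3), 2):
        side_s = math.dist(tri_s[a], tri_s[b])
        side_t = math.dist(tri_t[a], tri_t[b])
        if abs(side_s - side_t) > t_s * max(side_s, side_t):
            return False
    return True


def frame(tri):
    """Orthonormal axes (as rows) attached to a non-collinear triangle."""
    e1 = unit(sub(tri[1], tri[0]))
    e3 = unit(cross(e1, sub(tri[2], tri[0])))
    e2 = cross(e3, e1)
    return [e1, e2, e3]


def transform_from_triplet(tri_s, tri_t):
    """Rotation R and translation t mapping tri_t onto tri_s."""
    f_s, f_t = frame(tri_s), frame(tri_t)
    # R sends each axis of the transformed frame to the matching stable axis.
    R = [[sum(f_s[k][r] * f_t[k][c] for k in range(3)) for c in range(3)]
         for r in range(3)]
    t = [tri_s[0][r] - sum(R[r][c] * tri_t[0][c] for c in range(3))
         for r in range(3)]
    return R, t


# tri_s is tri_t rotated by 90 degrees about z and shifted by (1, 2, 3).
tri_t = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tri_s = [(1.0, 2.0, 3.0), (1.0, 3.0, 3.0), (0.0, 2.0, 3.0)]
accepted = triangles_similar(tri_s, tri_t, t_s=0.05)
R, t = transform_from_triplet(tri_s, tri_t)
```

Rejecting triplets by side length before computing any transformation is cheap, so it removes most inconsistent correspondences before the more expensive quality evaluation on the full cloud.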
In the next step, the local feature vectors (the shape or gradient vector of each point pair) in the final set of triplets are compared to determine optimal correspondences. These correspondences are then utilised to estimate a rigid transformation using the Umeyama algorithm [72].
The final step involves using the ICP algorithm for fine registration. In the shape-based registration approach, the closest point based on distance is selected in each iteration of the algorithm, while in the gradient-based approach, the similarity coefficient between gradient vectors is used.
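A compact point-to-point ICP loop, corresponding to the distance-based (shape) variant, is sketched below. It uses brute-force nearest neighbours, which is only practical for small clouds. The function names, iteration limit and tolerance are assumptions, and the gradient-based matching rule is not covered.

```python
import numpy as np

def best_rigid(src, dst):
    """Least-squares rotation and translation mapping src onto dst."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((dst - mu_d).T @ (src - mu_s))
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ S @ Vt
    return R, mu_d - R @ mu_s

def icp_point_to_point(moving, fixed, max_iter=50, tol=1e-10):
    """Refine the alignment of `moving` to `fixed`; returns total R, t."""
    moving = np.asarray(moving, dtype=float)
    fixed = np.asarray(fixed, dtype=float)
    R, t = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(max_iter):
        current = moving @ R.T + t
        # Squared distances between every moved point and every fixed point.
        d2 = ((current[:, None, :] - fixed[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        err = np.sqrt(d2[np.arange(len(current)), nearest]).mean()
        if abs(prev_err - err) < tol:
            break  # mean residual no longer changes
        prev_err = err
        # Re-estimate the total transform from the current matches.
        R, t = best_rigid(moving, fixed[nearest])
    return R, t

grid = np.array([[x, y, z] for x in range(4) for y in range(4) for z in range(4)],
                dtype=float)
a = np.deg2rad(2.0)
R_small = np.array([[np.cos(a), -np.sin(a), 0.0],
                    [np.sin(a),  np.cos(a), 0.0],
                    [0.0, 0.0, 1.0]])
t_small = np.array([0.05, -0.03, 0.02])
R_icp, t_icp = icp_point_to_point(grid, grid @ R_small.T + t_small)
```

The example starts from a small misalignment because ICP only converges reliably when a good initial registration is available, which is the role of the coarse stage described above.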

Experiment and results
The FAMFR workflow was tested on a computer with 64 GB RAM, an Intel Core i7-8850H 2.60 GHz CPU, and an NVIDIA GeForce Quadro P1000.

Parametrisation of control parameters
The proposed method incorporates a small set of control parameters (see Table 1), which are critical in determining the overall performance of the FAMFR workflow. These parameters are intuitive and should be configured according to the specific characteristics of the data. A recommended approach is to select them experimentally on a representative subset of point clouds from the CH object. Subsequently, the selected parameters should be evaluated on a larger sample of data or the entire dataset to verify how well the user-defined parameters generalise. As described above, the parameters can be expressed relative to the average distance between points, which makes the method more user-friendly and easier to apply to other datasets in the future.
The first parameter, H_b, represents the number of histogram bins. It has the least impact on the whole process, and its value was determined experimentally. The number of histogram cells used in this study equals 18.
The following parameter, the sampling factor (N_sim), strongly relates to the data the user wants to integrate. This parameter should be given a higher value for dense point clouds to avoid excessive computational time. For smaller point clouds, too high a sampling factor can lead to the loss of crucial details and characteristic features, thereby hindering the registration process. The value of this parameter used in this study was set to 25. This yields point clouds with approximately 300,000 points on average and an average distance between points of D_avg = 0.7 mm.
Parameter R_n defines the radius of the neighbouring sphere. This sphere is used to estimate two feature vectors for each point. The R_n value influences the precision of the calculated features. Decreasing the value enhances the detection of fine details, affecting the accuracy of the final alignment of point clouds. A greater value allows for more effective determination of cloud fragments for key point detection. See Fig. 18 for the results obtained with different parameter values. The parameter value was set to R_n = 7 (7 · D_avg ≈ 5 mm) during the experiment.
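The relation between a parameter expressed as a multiple of D_avg and its metric value can be checked with a few lines. The way D_avg is estimated below (mean nearest-neighbour distance) is an assumption for illustration; the text does not state how the average distance between points was computed, and both function names are hypothetical.

```python
import numpy as np

def mean_nn_distance(points):
    """Average distance from each point to its nearest neighbour
    (brute force, suitable only for small samples)."""
    pts = np.asarray(points, dtype=float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # ignore the distance of a point to itself
    return float(np.sqrt(d2.min(axis=1)).mean())

def to_millimetres(multiplier, d_avg_mm):
    """Convert a parameter given in units of D_avg to millimetres."""
    return multiplier * d_avg_mm

# Five points on a line spaced 0.7 mm apart.
sample = np.column_stack([np.arange(5) * 0.7, np.zeros(5), np.zeros(5)])
d_avg = mean_nn_distance(sample)
r_n_mm = to_millimetres(7, d_avg)  # 7 x 0.7 mm = 4.9 mm, i.e. roughly 5 mm
```

Expressing radii in units of D_avg means the same parameter values can be reused for clouds sampled at a different density, since the metric radius scales with the point spacing.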
Accurately identifying correspondences is the key aspect of the process, as it determines the initial registration of point clouds. Therefore, the main focus should be on selecting the parameters responsible for this stage. The main parameter is the point-to-point distance K_d used to select the subset of points. Setting this value too low substantially increases the time required for determining correspondences: the number of potential key points increases, and with it the number of calculations needed to evaluate the histograms and compare them. For the point clouds used in this study and K_d = 2, the initial evaluation of correspondences from histograms takes around 266 s, although the number of correct correspondences in that case is very high and equals 455 (see Fig. 19). On the other hand, excessively high values of this parameter may result in selecting points that lack descriptive features and are not sufficiently unique.
Another parameter closely related to key point estimation is T_h. It helps eliminate outliers and points that do not contribute enough shape or colour information to the registration process. A low value of T_h may result in selecting key points whose feature values are insufficient and not unique, which could introduce false positives in the point cloud registration. A high T_h value, in turn, may eliminate potentially correct correspondences, decreasing the accuracy of the registration workflow. The T_h value was set experimentally to 0.2 in this study.
The radius parameters H_s/H_g are crucial in establishing correct correspondences between point clouds. Their values significantly influence the descriptiveness of the selected key points. In addition, higher radius values lead to longer processing times. Nonetheless, setting the values too high or too low can increase the number of outliers and hinder the registration process. Thus, a trial-and-error approach on a small subset of data was chosen to select the radius parameters. This approach entails trying different values of the radius parameters and evaluating which values provide the highest number of correspondences. In this study, both parameters were evaluated separately using 25 point cloud pairs; see Fig. 21 for the shape histogram and Fig. 22 for the gradient histogram.
The effectiveness of finding the correct correspondences improves with an increase in the radius parameter. However, this improvement comes at the cost of increased computational time. The trend holds up to a certain point, beyond which the number of inliers stabilises in the case of H_s and starts to decrease in the case of H_g.
The N_p and N_T parameters play a similar role. Limiting the number of point pair correspondences (N_p) or considering far fewer triplets (N_T) can significantly reduce the computation time. In some cases, simply selecting the best match based on the similarity of the histograms may not result in an accurate registration. Therefore, evaluating several candidate correspondences is crucial to minimise false positives.

Evaluation criteria
Reference (ground truth) values were needed to compare the registration methods fairly. First, each point cloud pair was registered manually, and then fine registration was performed using the ICP algorithm. In this way, reference 3D transformations were obtained. Further, 1024 control points were manually and uniformly selected on the reference point cloud. For each control point, the corresponding point on the transformed point cloud was identified and marked if such a point existed (Fig. 23).
The selection of control points was carefully considered to ensure uniform distribution across the point cloud.
The ground truth correspondences were used to calculate reference values of the similarity coefficient between shape and gradient features. The effectiveness of an algorithm is evaluated as the percentage of correct point pairs found in the transformed point cloud, namely the correspondence recall, Recall_C. A match is considered correct if the ground truth control point in the transformed cloud is within a certain distance of the corresponding point in the stable cloud. The more correct point pairs are found, the more effective the algorithm is considered.
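The recall measure can be sketched as the fraction of control point pairs that end up within a distance threshold after the estimated transformation is applied. The function name and the threshold value in the example are assumptions; the threshold used in the study is not stated in this section.

```python
import numpy as np

def correspondence_recall(ctrl_transformed, ctrl_stable, R, t, max_dist):
    """Fraction of control point pairs closer than max_dist after applying
    the estimated rigid transformation (R, t) to the transformed cloud."""
    moved = np.asarray(ctrl_transformed, dtype=float) @ np.asarray(R, dtype=float).T
    moved = moved + np.asarray(t, dtype=float)
    dists = np.linalg.norm(moved - np.asarray(ctrl_stable, dtype=float), axis=1)
    return float((dists <= max_dist).mean())

stable_ctrl = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
transformed_ctrl = stable_ctrl - 1.0   # true offset is (1, 1, 1)
transformed_ctrl[3] += 0.5             # one pair is badly aligned
recall = correspondence_recall(transformed_ctrl, stable_ctrl,
                               np.eye(3), np.ones(3), max_dist=0.01)
```

Here three of the four control points fall within the threshold, giving a recall of 0.75.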
Further, the two similarity coefficients are estimated between the control points' shape and gradient feature values in the transformed cloud. The average feature values are approximated from neighbouring points below a certain distance in the stable cloud. The root mean square error is estimated from the distance between the transformed control points and the average coordinates formed from neighbouring stable points.
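The root mean square error described above can be sketched as follows. The function name, the neighbourhood radius, and the decision to skip control points that have no stable neighbours are assumptions made for illustration.

```python
import numpy as np

def control_point_rmse(ctrl_moved, stable_cloud, radius):
    """RMSE between each transformed control point and the mean position
    of the stable points lying within `radius` of it."""
    ctrl = np.asarray(ctrl_moved, dtype=float)
    cloud = np.asarray(stable_cloud, dtype=float)
    errors = []
    for p in ctrl:
        near = cloud[np.linalg.norm(cloud - p, axis=1) <= radius]
        if len(near) == 0:
            continue  # assumption: points without neighbours are ignored
        errors.append(np.linalg.norm(p - near.mean(axis=0)))
    if not errors:
        return float("nan")
    return float(np.sqrt(np.mean(np.square(errors))))

stable_cloud = np.array([[0, 0, 0], [2, 0, 0], [10, 0, 0]], dtype=float)
ctrl_moved = np.array([[1, 0, 1], [10, 0, 0]], dtype=float)
rmse = control_point_rmse(ctrl_moved, stable_cloud, radius=2.0)
```

The first control point lies 1 unit from the mean of its two neighbours and the second coincides with its only neighbour, so the result is the square root of 0.5.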

Experiment
To verify the effectiveness of the FAMFR workflow for point cloud registration, it was compared with state-of-the-art feature-based and deep-learning-based (DL) methods. The performance of these methods was evaluated on a dataset consisting of 100 pairs of point clouds, with 25 pairs from each of the categories described in the Materials section. The accuracy of each method was evaluated based on the ground truth control points and the similarity coefficient between feature vectors (S_Vg for the gradient feature vector and S_Vs for the shape feature vector, according to formula (8)), as described in the Evaluation criteria subsection.
For the feature-based methods, the Point Cloud Library (PCL) [73] implementation was used. The evaluation was based on three different features: FPFH [48], PFHRGB [74], and RIFT [75]. The FPFH algorithm was chosen because of its known and proven effectiveness in the registration process. The PFHRGB and RIFT algorithms were employed because they can include colour information in the registration process, and the point clouds used in the study contain intricate decorative features and colour variations. For each feature, the 3D transformation was estimated independently using a traditional registration workflow. Key points were selected using the SIFT3D algorithm, followed by feature evaluation. Next, the RANSAC algorithm was used for correspondence estimation, outlier rejection and initial transformation estimation. The process was finished with fine registration using the ICP algorithm.
Five DL methods were evaluated. For DCP, PointNetLK and DeepGMR, the implementations and models from the open-source library Learning3D [76] were used. The models were pre-trained on the ModelNet data set [77]. For GeoTransformer [78,79] and Predator [80,81], the repositories with the official implementations of the papers were used.

Results
The results of the registration comparison are presented in a structured manner, beginning with a breakdown of the performance of all algorithms across four distinct data types. The first scenario involves a set of point clouds with complex geometry (Table 3). The second consists of planar clouds with rich colour decorations (Table 4). The third represents decorative paintings placed on curved ceilings (Table 5), and the fourth represents point clouds with numerous golden ornaments (Table 6). Subsequently, Table 7 presents the average values of the similarity coefficients (S_Vg and S_Vs) of all the methods against the ground truth, together with the computation time required by each algorithm to register a single point cloud pair. Additionally, the Recall_C column in Table 7 is extended with values in brackets that compare each method with FAMFR.

Discussion
Several challenging situations were encountered during the experiment. One is the known problem of registering point clouds with low overlapping regions. As shown in Fig. 24, the common part in this specific case was very small. Additionally, this fragment was not very characteristic in terms of geometry and colour, making it challenging to register such point clouds correctly. Despite these difficulties, the proposed method could accurately register these two point clouds.
Another challenging registration scenario involves clouds with a characteristic, partly reflective surface. Such a surface creates distinctive regions with noise-like features, which may easily lead to many outlier correspondences (see Fig. 25). This study included 4 cases of this issue in the testing data sample. The proposed method was able to register 3 out of the 4 correctly. Scenario 4 was the most challenging data used in this study (Table 6). The presence of numerous gilded and shiny decorations in the point cloud data of this subcategory makes the registration process challenging due to the high levels of measurement noise. This issue can take many forms, such as poor surface or colour reconstruction and the accumulation of large amounts of measurement noise creating nonexistent surfaces or geometry (see Fig. 26).
As shown in Table 7, FAMFR outperformed all other algorithms tested in this study on this particular dataset. The first group, the feature-based methods, often fails. Although feature-based matching has the advantage of requiring only a 3D model of the object, the calculation and matching processes are computationally demanding.
All feature-based methods were evaluated using several user-defined parameter settings for a fair comparison. The SIFT3D algorithm was configured to detect three different numbers of key points, namely approximately 800, 3500, and 6000. Similarly, descriptor parameters were varied to include different radius values, specifically 5, 10, and 20 for FPFH and RIFT. However, due to the extensive time required to compute the PFHRGB feature (30 min per single cloud), a constant radius value of 5 was employed for this descriptor. This approach allowed for a more rigorous and robust evaluation of the methods and helped to identify the optimal parameter settings for a fair comparison.
To avoid excessive complexity in the presentation of results, we have only reported the average estimation times and registration errors for the parameter settings that yielded the best outcomes.
As expected, the best performance among the feature-based methods was achieved using the PFHRGB feature, which incorporates geometry and colour information. The FPFH algorithm, which only considers shape information, achieved lower scores. This is due to the inability of the algorithm to extract enough distinctive features to identify the transformation of the clouds in the case of poorly shaped objects with planar or constant-curvature surfaces and regular angles. The RIFT algorithm, which describes a given point based on its spatial neighbourhood of 3D points and the corresponding intensity gradient vector field, fails to register many of the point cloud pairs. This is likely due to texture and colour interference errors caused by flares and specular reflection. Moreover, all the mentioned feature descriptors are also affected by errors from the key point calculation process using the SIFT3D algorithm. One cannot be certain that there will be corresponding points between two point clouds. There may be many outliers, which can lead to erroneous transformation estimation. It should be noted that all of these feature-based methods have certain drawbacks. They are mathematically complicated, computationally heavy, and sensitive to parameter tuning, which requires considerable expertise to identify the optimal parameter values for a specific dataset. In most cases, they are handcrafted for specific datasets.
DCP failed to register every point cloud pair in this study due to its reliance on exact point-to-point correspondences, which are not always available in real-world scenarios. Additionally, noise in the data hindered its performance because it relies on a complex model design, and the extracted local features are especially sensitive to noise.
PointNetLK, trained in feature detection for specific object categories, failed to recognise valuable features in the point clouds used in this study. During registration, it could easily fall into local minima. DeepGMR, on the other hand, estimated correspondences between all points and all components in the latent Gaussian mixture model (GMM), making the registration result invariant to the magnitude of transformation or the density of the input geometries. However, the method assumes a perfect match between the two point cloud distributions, an assumption that is invalid for the point clouds used in this study, where outliers and other uncertainties are present.
GeoTransformer and Predator required rescaling of the data because of significant memory usage and computational cost. The official implementations of these methods did not work as intended and threw errors during registration. GeoTransformer, based on pose-invariant features, achieved much better results than the previously described deep learning methods. The method employs learned geometric features to facilitate robust superpoint matching and encodes pair-wise distances and triplet-wise angles, which makes it more reliable in low-overlap cases. GeoTransformer relies on uniformly downsampled superpoints to extract correspondences hierarchically. However, the hyperparameter controlling the sensitivity to distance and angular variations must be selected precisely for different datasets.
Predator, a neural architecture for pairwise 3D point cloud registration, learns to detect the overlap region between two unregistered scans and to focus on that region when sampling feature points. This method is designed for pairwise registration of low-overlap point clouds and relies on sufficient superpoints. Although Predator has limitations in scenarios where the point density is very uneven, its ability to prioritise points relevant for matching has been shown to enhance performance. Predator achieved the best overall results in the experiment compared to the other deep learning methods.
One common limitation of current deep networks is that they can only handle object-level point clouds. Testing on whole objects or sites requires designing an efficient scene-level point cloud encoding network or rescaling and downsampling point clouds.
Regarding algorithm execution time, the proposed method shows improved results compared to the tested feature-based methods, although it falls behind deep learning methods in speed. However, this is the case when we do not consider the time needed to train the network and whether the selected training data is enough to get accurate results. The duration of training in a typical deep learning method can vary greatly depending on many factors, such as the complexity of the network, the amount of data used for training, the type of task being performed, and the available computational resources. Training a deep learning model is an iterative process, and it may take multiple rounds of training, evaluation, and hyperparameter tuning to achieve optimal results. Training can take only a few minutes in some cases, while in others, it can take several days or weeks. State-of-the-art models used in computer vision and natural language processing tasks can require training times of several days or even weeks on powerful hardware. Therefore, providing a specific training time that applies to all scenarios is difficult. For deep learning methods, the focus is often on the performance metrics of the model, such as accuracy or F1-score, rather than the training time. The assumption is that the training time is not a significant factor in evaluating the quality of the model since it is a one-time cost and does not affect the performance of the deployed model.
As stated before, point clouds can differ across various cultural heritage objects. If the object is unique, and you need to register another significantly different object, you may need to create a new training dataset. Preparing a training set with attention to the details of rich and varied interior design is challenging for the specific CH interiors used in this study. The ability of deep learning methods to perform well on new data, which has not been processed during training, is often limited. Many of these methods perform poorly when dealing with such cases. This is because the learned model may not generalise well to new and different objects and may not accurately register the new object without additional training data.
Existing implementations of deep learning registration can be challenging to use. They are often designed for researchers or advanced users familiar with the intricacies of deep learning methods. These methods often involve complex neural network architectures and require large amounts of data for training. In addition, deep learning registration methods are relatively new compared to traditional 2D image-based methods, and there is still ongoing research in the field. As a result, implementing these methods can be more challenging than traditional methods, which have been around for much longer and have more established workflows. Furthermore, deep learning methods often require significant computational resources, which can add to the complexity of using these methods.
Despite these challenges, there are efforts to develop more user-friendly implementations of deep learning registration methods. However, it is essential to remember that deep learning methods are not always the best solution for every registration problem. Traditional, feature-based methods may still be more suitable in certain cases.

Conclusion
The data used in this study comprises various subcategories of CH object point clouds with complex and unique characteristics. This highlights the need for a robust and efficient point cloud registration algorithm that handles different data types with varying degrees of complexity. The FAMFR workflow addresses these needs by leveraging two distinct features: V_g, which incorporates intensity gradient information, and V_s, which describes the geometric relationship between adjacent points and their normal vectors. The utilisation of both features enables the successful rough alignment of point clouds. The experimental results validate the efficiency of the FAMFR workflow in all examined scenarios, achieving an improvement of approximately 80% over traditional, feature-based methods and approximately 35% over deep learning-based methods (see values in brackets in Table 7).
Despite the promising performance of FAMFR in 3D point cloud registration, certain limitations and weaknesses were identified. The most demanding challenges were observed in scenario 4, with reflective surfaces, as demonstrated in the discussion section and illustrated in Fig. 26. While the proposed methodology partially mitigated these issues, FAMFR yielded lower results here than in the other scenarios. Another limitation was observed on pairs of scans with low overlapping regions (Fig. 24). Finding correct correspondences is challenging, especially when regions have limited colour or geometric information. A low number of uniformly sampled potential key points with insufficient colour or shape information may result in incorrect point cloud registration.
Point clouds with partially reflective and noise-like textures also create many false positive correspondences (Fig. 25). When not correctly detected and filtered out, they may be accepted as valid correspondences and lead to an incorrectly determined initial 3D transformation. These limitations should be considered when applying FAMFR to other datasets and scenarios. Further research is needed to address these challenges and to improve the robustness and versatility of the proposed methodology.
Despite those challenges, FAMFR proved to be an effective high-resolution point cloud registration workflow for CH interiors. It allows for quick and effective registration of point clouds, significantly facilitating the creation of large 3D models of CH objects. Additionally, the small number of input parameters makes it easy to use and ready to add to existing registration workflows. With proper data preparation, fine registration using the ICP algorithm and final relaxation, one can create a large, detailed, high-resolution representation of a real CH object (see Fig. 27).
In addition to testing other data types, future work can also involve evaluating the proposed method on larger datasets. This can help assess the scalability and robustness of the FAMFR workflow. Another potential avenue for future research is to optimise the code for faster execution times. This can involve exploring execution parallelisation, optimising memory usage, and using more efficient data structures. Furthermore, the features used in this study can be used to train a deep-learning model for correspondence estimation. This can potentially improve the accuracy and robustness of the registration process by leveraging the ability of deep learning models to learn complex representations from the features proposed in this study. The trained model can also be used for transfer learning on new datasets with similar characteristics, saving time and effort in feature engineering.

Fig. 2 Two-stage registration process: a input pair cloud, b initial registration, c fine registration and d final model

Fig. 3 Registration method classification

Fig. 9 Point cloud examples in comparison to the entire King's Chinese Cabinet room

Fig. 10 Four different point cloud examples: a ceiling point clouds, b shiny/gilded point clouds, c point clouds with rich shape, d point clouds without shape variations

Fig. 15 Key points evaluation process on the geometrical features (right) and gradient features (left): a input cloud, b evenly distributed subset of points, c all potential key points split into two categories, utilising geometrical or gradient features, d key points filtered using threshold T_h, e final key points used for registration

Fig. 17 Four different key points with histogram examples

The value of K_d was set to 7 (7 · D_avg ≈ 5 mm) throughout the experiment.

Fig. 18 Data visualisation for different R n parameter values

Fig. 19 Relation between parameter, time and correspondences found. The time axis is on a logarithmic scale

H_s was set to 30 (30 · D_avg ≈ 20 mm) and H_g to 15 (15 · D_avg ≈ 10 mm) in this experiment.

Fig. 20 Correspondences result for different values of the K_d parameter

Table 1
Method control parameters
Parameter    Description
T_h          Threshold used in the key point filtration process
H_s/H_g      Sphere parameters used to estimate the histogram of features, separately for shape and gradient
N_p          Number of selected point pairs used to form triplets
N_T          Number of formed triplets used in transformation estimation

Table 2
Method control parameter values

Fig. 23 Control points marked on the overlapping region between stable and transformed point clouds

Table 3
Registration results for scenario 1

Table 4
Registration results for scenario 2

Table 5
Registration results for scenario 3

Table 6
Registration results for scenario 4

Table 7
Average registration results. Comparison with FAMFR in column Recall_C (values in brackets)