Fusion of LiDAR point cloud and image data based on scale consistency
The core problem in the digital protection of architectural heritage is building an accurate geometric model of the original structure. The main data acquisition methods are terrestrial 3D laser scanning and close-range photogrammetry. Each has its own advantages and disadvantages, and neither alone supports detailed 3D reconstruction. The first task is therefore to integrate LiDAR data effectively with non-metric digital image data. In this project, computer vision principles are introduced, and the SfM algorithm is used to convert images into a 3D point cloud. Through image feature detection and matching, the SfM algorithm recovers the 3D position of each camera. A precise sparse point cloud is then obtained by global optimization with bundle adjustment. Finally, dense point clouds are obtained by jointly optimizing the internal and external camera parameters and densifying the point cloud. The image-derived point cloud is then registered to the terrestrial LiDAR point cloud by feature matching so that the two share a consistent scale, which achieves the effective fusion of LiDAR and digital images. The overall workflow is shown in Fig. 2.
Image matching
The SfM algorithm is an offline algorithm for 3D reconstruction from collections of unordered images. Before the core structure-from-motion algorithm is run, some preparatory work is needed to select suitable images.
First, the scale-invariant feature transform (SIFT) detector extracts features from the images, and the features are matched between images. To speed up matching, a k-dimensional tree (k-d tree) is built over the feature descriptors, and an approximate nearest neighbor (ANN) search is used to find matching feature points for each image pair (I, J). The matched points are added to the candidate match set for subsequent processing. Because the candidate set may still contain mismatches, the Random Sample Consensus (RANSAC) algorithm is used to estimate the fundamental matrix robustly. The essential matrix is then used to filter the matching points, yielding more reliable matches. Second, the number of feature matches is counted for every image pair, and the pair with the most matches is selected as the initial image pair. The essential matrix of the initial pair is then estimated, and the relative pose is recovered by matrix decomposition. Third, 3D points are constructed by triangulation. Finally, bundle adjustment is performed to optimize the relative pose of the initial image pair and the reconstructed 3D points.
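As a rough illustration of the matching and initial-pair selection described above, the following Python sketch matches descriptors by brute-force nearest-neighbor search and then picks the image pair with the most matches. It is a simplification under stated assumptions: the brute-force search stands in for the k-d tree ANN search, the ratio test and its threshold `ratio` are assumptions not taken from the text, and all function names are hypothetical.

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    Brute force is used for clarity; a k-d tree with approximate
    nearest-neighbour search would find (nearly) the same pairs faster.
    The ratio test is an assumed filter, not taken from the text.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i to every descriptor of the other image.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (d1, j1), (d2, _) = dists[0], dists[1]
        # Keep the match only if the best candidate clearly beats the second.
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches


def pick_initial_pair(match_counts):
    """Return the image pair (I, J) with the largest number of matches."""
    return max(match_counts, key=match_counts.get)
```

In a full pipeline the candidate matches would still pass through RANSAC before the match counts are compared; that step is omitted here.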
3D point cloud reconstruction

Next, a new image is added from the remaining photos. Its 2D-3D correspondences are found by matching it against the second image, and its projection matrix P is solved from them. The pose of the new image is obtained by decomposing P, and new 3D points are reconstructed by triangulation with the second image. Finally, the initial image pair and the newly added image are optimized by bundle adjustment.

This step is repeated until all photos have been added to the reconstruction, which yields the sparse point cloud. A dense point cloud is then obtained with a dense matching algorithm.
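The construction of 3D points from two views, used in the steps above, can be sketched with the midpoint method: each matched feature defines a viewing ray from its camera center, and the 3D point is taken as the midpoint of the shortest segment between the two rays. This is a minimal stand-in chosen for brevity, not necessarily the intersection method used in this work; the function name and the ray parameterization are assumptions.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between two viewing rays.

    Ray k starts at camera centre ck and points along dk
    (the directions need not be unit vectors).
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are (nearly) parallel")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    x1 = [ci + s * di for ci, di in zip(c1, d1)]
    x2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(x1, x2)]
```

When the two rays intersect exactly, the midpoint coincides with the intersection point; with noisy matches it gives a compromise that bundle adjustment can refine afterwards.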

Because each point cloud alone is incomplete, we select control points from the image-derived dense point cloud and from the LiDAR point cloud and complete coarse registration and fine registration through feature matching. To compensate for the uneven density of the point clouds, coarse and fine registration are carried out automatically to obtain the fused point cloud data, which achieves the fusion of the image and laser point clouds. The specific method is shown in Fig. 3.
Global point cloud registration algorithm with multiple feature constraints
The iterative global registration can be divided into three stages: data pre-processing, solving for the initial parameters, and global adjustment. The technical route is shown in Fig. 4. The first step in processing the point cloud data is denoising, for which a bilateral filtering algorithm is used. Bilateral filtering is widely used for point cloud denoising because it is simple, non-iterative, and local, and because it preserves edges well. The bilateral filter is defined as follows:
$$\widehat{{p}_{i}}={p}_{i}+\lambda {n}_{i}.$$
(1)
where \(p_{i}\) is a point in the point cloud to be processed, \(n_{i}\) is the normal vector at that point, and λ is the bilateral filter factor, calculated as follows:
$$\lambda =\frac{\sum_{{p}_{j}\in {N}_{k}\left({p}_{i}\right)}{W}_{c}\left(\Vert {p}_{j}-{p}_{i}\Vert \right){W}_{s}\left(\Vert \langle {n}_{j},{n}_{i}\rangle \Vert -1\right)\langle {n}_{i},{p}_{j}-{p}_{i}\rangle }{\sum_{{p}_{j}\in {N}_{k}\left({p}_{i}\right)}{W}_{c}\left(\Vert {p}_{j}-{p}_{i}\Vert \right){W}_{s}\left(\Vert \langle {n}_{j},{n}_{i}\rangle \Vert -1\right)}.$$
(2)
Here, \(W_{c}\) and \(W_{s}\) are the spatial-domain and frequency-domain weight functions of the bilateral filter, and \(\left\langle {n_{i} ,p_{j} - p_{i} } \right\rangle\) is the inner product of \(n_{i}\) and \(p_{j} - p_{i}\).
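Eqs. (1) and (2) can be sketched for a single point as follows. The text does not specify the form of \(W_{c}\) and \(W_{s}\), so Gaussian kernels with hypothetical widths `sigma_c` and `sigma_s` are assumed here, and the k-neighborhood is passed in as a precomputed list rather than searched for.

```python
import math


def gaussian(x, sigma):
    """Gaussian kernel, assumed here for both W_c and W_s."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bilateral_factor(p_i, n_i, neighbours, sigma_c=1.0, sigma_s=0.2):
    """Bilateral filter factor lambda of Eq. (2).

    neighbours is a list of (p_j, n_j) pairs from the k-neighbourhood of p_i.
    """
    num = den = 0.0
    for p_j, n_j in neighbours:
        diff = [a - b for a, b in zip(p_j, p_i)]
        dist = math.sqrt(dot(diff, diff))
        normal_term = abs(dot(n_j, n_i)) - 1.0
        w = gaussian(dist, sigma_c) * gaussian(normal_term, sigma_s)
        num += w * dot(n_i, diff)  # offset of p_j along the normal n_i
        den += w
    return num / den if den > 0.0 else 0.0


def denoise_point(p_i, n_i, neighbours, sigma_c=1.0, sigma_s=0.2):
    """Move p_i along its normal by lambda, as in Eq. (1)."""
    lam = bilateral_factor(p_i, n_i, neighbours, sigma_c, sigma_s)
    return [p + lam * n for p, n in zip(p_i, n_i)]
```

For a point lying slightly above a flat neighborhood, the factor is negative and the point is pulled back onto the plane, while neighbors whose normals differ strongly receive a small weight, which is what preserves edges.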
Because registering many stations in sequence accumulates error, the accuracy of station-by-station registration degrades as stations are added. The registration quality of the point cloud data determines the accuracy of all subsequent processing. For 3D scanning data of large and complex scenes, multi-feature-based global registration is therefore generally adopted. First, usable features are extracted from the point cloud for registration. A local station-by-station coarse registration provides the initial parameter values: starting from the base station, neighboring stations are searched outward for homonymous (corresponding) feature points, the cloud of each station is registered to the base station through the Rodrigues matrix, and the base station is gradually expanded outward. The rotation matrix of each station and the coordinates of the homonymous points are taken as the initial values for the overall adjustment. The feature constraints are then written as error equations for the observations, and the overall adjustment is carried out: the bundle adjustment model solves for the spatial transformation parameters and the adjusted coordinates of the unknown points. The error of each constraint is checked, and when the error is below the specified threshold, the registration result is output. If the error is too large, the weight of each constraint is recalculated through the weight function. The observation weights are revised in each iteration until the accuracy requirement is met, at which point the iteration stops and the registered point cloud is output.
Local coarse registration uses the feature constraints between the base station and the station being registered to perform station-by-station registration with the Rodrigues matrix. The Rodrigues matrix approach builds the coordinate transformation model from three independent elements of an antisymmetric matrix instead of Euler angles. The parameters are solved separately: first the scale parameter, then the rotation parameters, and finally the translation parameters. The antisymmetric matrix S, composed of the three independent parameters, defines the Rodrigues matrix as follows:
$$R={\left(I-S\right)}^{-1}\left(I+S\right).$$
(3)
where I is the identity matrix, and S is the antisymmetric matrix composed of the parameters a, b, and c.
$$S=\left[\begin{array}{ccc}0& -c& -b\\ c& 0& -a\\ b& a& 0\end{array}\right].$$
(4)
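The construction of the rotation matrix from the three parameters can be checked numerically with the short sketch below. It assumes the Rodrigues matrix is \(R = (I - S)^{-1}(I + S)\) and that S carries negative signs above the diagonal; the sign placement is an assumption, and the helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inverse3(M):
    """Inverse of a 3x3 matrix via the adjugate (transposed cofactors)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def rodrigues_matrix(a, b, c):
    """Rotation matrix R = (I - S)^-1 (I + S) from parameters a, b, c."""
    # Assumed sign convention for the antisymmetric matrix S.
    S = [[0.0, -c, -b],
         [c, 0.0, -a],
         [b, a, 0.0]]
    eye = [[1.0 if r == q else 0.0 for q in range(3)] for r in range(3)]
    i_minus_s = [[eye[r][q] - S[r][q] for q in range(3)] for r in range(3)]
    i_plus_s = [[eye[r][q] + S[r][q] for q in range(3)] for r in range(3)]
    return mat_mul(inverse3(i_minus_s), i_plus_s)
```

Because the determinant of \(I - S\) equals \(1 + a^{2} + b^{2} + c^{2}\), the inverse always exists, and the result is orthogonal for any real a, b, and c. This is one practical reason for preferring the three parameters over Euler angles.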
Feature constraints can be points, lines, or surfaces. In the experiment, we use the centers of paper targets as features and set up the point error equations for the solution. By the principle of coordinate transformation, three pairs of non-collinear homonymous points in space are sufficient to solve for the spatial transformation parameters. The homonymous feature points of the two stations, \(X_{0} = \left( {x_{0} ,y_{0} ,z_{0} } \right)\) and \(X = \left( {x,y,z} \right)\), satisfy the following relationship:
$${X}_{0}-\left(\lambda RX+\Delta X\right)=0.$$
(5)
where \(\lambda\) is the scale parameter and \(\Delta X\) is the offset. The scale is constant in the point cloud transformation, i.e., \(\lambda = 1\).
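Once the rotation is known, the point relationship in Eq. (5) gives the offset and the per-point errors directly. The sketch below assumes the relationship reads \(X_{0} - (\lambda RX + \Delta X) = 0\) and estimates the offset as the mean of \(X_{0} - \lambda RX\) over all homonymous pairs, which is the least-squares estimate when R and \(\lambda\) are held fixed. The function names are hypothetical.

```python
def transform(R, x, offset, scale=1.0):
    """Apply X0 = scale * R * X + offset to one 3D point."""
    return [scale * sum(R[i][k] * x[k] for k in range(3)) + offset[i]
            for i in range(3)]


def estimate_offset(R, pairs, scale=1.0):
    """Mean of X0 - scale * R * X over all homonymous pairs (X0, X)."""
    zero = [0.0, 0.0, 0.0]
    sums = [0.0, 0.0, 0.0]
    for x0, x in pairs:
        rx = transform(R, x, zero, scale)
        for i in range(3):
            sums[i] += x0[i] - rx[i]
    return [s / len(pairs) for s in sums]


def residuals(R, offset, pairs, scale=1.0):
    """Point error X0 - (scale * R * X + offset) for every pair."""
    out = []
    for x0, x in pairs:
        pred = transform(R, x, offset, scale)
        out.append([a - b for a, b in zip(x0, pred)])
    return out
```

The residuals returned by the last function are the quantities that the subsequent error check and reweighting operate on.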
A gross observation error in a feature constraint at a registration station degrades the accuracy of the overall adjustment. A selecting-weight iteration method is therefore used to attenuate or eliminate the effect of gross errors. After the observation errors are checked, the observations that exceed the threshold are reweighted with the weight function of the posterior-variance-based selecting-weight iteration method, as in Eq. (6).
$$P_{i,j}^{v+1}=\left\{\begin{array}{ll}p_{i}^{v+1}=\dfrac{\hat{\sigma}_{0}^{2}}{\hat{\sigma}_{i}^{2}}, & T_{i,j}<F_{a,1,r_{i}}\\ \dfrac{\hat{\sigma}_{0}^{2}}{\hat{\sigma}_{i,j}^{2}}=\dfrac{\hat{\sigma}_{0}^{2}r_{i,j}}{V_{i,j}^{2}}, & T_{i,j}\ge F_{a,1,r_{i}}\end{array}\right..$$
(6)
where the test statistic is \(T_{i,j} = \frac{\hat{\sigma}_{i,j}^{2}}{\hat{\sigma}_{i}^{2}}\). The critical value \(F_{a,1,r_{i}}\) is generally taken to be 4.13, corresponding to a significance level α = 0.1% and a test power β = 80%.
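The weight function of Eq. (6) can be written as a small function. The sketch assumes that the second branch applies when the test statistic reaches or exceeds the critical value, and that the posterior variance of a single observation is \(V_{i,j}^{2}/r_{i,j}\), as the second branch of the equation implies. The argument names are hypothetical.

```python
def reweight(v, r, sigma0_sq, sigma_i_sq, f_crit=4.13):
    """Weight of one observation for the next iteration, after Eq. (6).

    v          residual V_ij of the observation
    r          redundancy number r_ij of the observation (must be > 0)
    sigma0_sq  estimated variance of unit weight
    sigma_i_sq estimated variance of observation group i
    f_crit     critical value of the test (4.13 in the text)
    """
    sigma_ij_sq = v * v / r        # posterior variance of this observation
    t = sigma_ij_sq / sigma_i_sq   # test statistic T_ij
    if t < f_crit:
        # Observation passes the test: keep the group weight.
        return sigma0_sq / sigma_i_sq
    # Suspected gross error: the weight shrinks as the residual grows.
    return sigma0_sq * r / (v * v)
```

For example, with a group standard deviation of 0.01, a residual of 0.01 keeps the full group weight, whereas a residual of 0.1 is down-weighted by a factor of about 200.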
Multiscale point cloud fusion
To obtain better abstract feature representations for fusing the image point cloud and the LiDAR point cloud, we use three matching metrics to judge the quality of feature selection. The Euclidean distance between feature vectors is used as the first feature distance, as shown in Eq. (7).
$${\mathrm{\varsigma }}_{1}\left({\mathrm{p}}_{i},{\mathrm{q}}_{j}\right)=\Vert {v}_{i}^{P}-{v}_{j}^{Q}\Vert .$$
(7)
The cosine similarity between feature vectors is used as the second feature distance, as shown in Eq. (8).
$${\mathrm{\varsigma }}_{2}\left({\mathrm{p}}_{i},{\mathrm{q}}_{j}\right)=\frac{{v}_{i}^{P}\cdot {v}_{j}^{Q}}{{\Vert {v}_{i}^{P}\Vert }_{2}{\Vert {v}_{j}^{Q}\Vert }_{2}}.$$
(8)
Finally, we use the ratio of the Gaussian curvatures of the k-nearest neighborhoods as the third feature distance, as shown in Eq. (9).
$${\mathrm{\varsigma }}_{3}\left({\mathrm{p}}_{i}{,\mathrm{q}}_{j}\right)=\frac{{\mathrm{g}}_{i}^{P}}{{\mathrm{g}}_{j}^{Q}}.$$
(9)
In the above equations, \({\mathrm{v}}_{i}^{P}\) and \({\mathrm{v}}_{j}^{Q}\) are the feature vectors of \({p}_{i}\) and \({q}_{j}\), respectively; \({p}_{i}\) and \({q}_{j}\) are feature points of P and Q; and \({\mathrm{g}}_{i}^{P}\) and \({\mathrm{g}}_{j}^{Q}\) are the k-neighborhood Gaussian curvatures of \({p}_{i}\) and \({q}_{j}\), respectively. These feature parameters define three matching conditions: the Euclidean distance between feature vectors, \({\mathrm{\varsigma }}_{1}\), is minimal; the cosine similarity between feature vectors, \({\mathrm{\varsigma }}_{2}\), is maximal; and the ratio of neighborhood Gaussian curvatures, \({\mathrm{\varsigma }}_{3}\), is approximately 1. The feature point pairs (\({p}_{i}\), \({q}_{j}\)) that pass these conditions are taken as preliminary correspondences between P and Q, and they form the set \({\mathrm{K}}_{1}\) of feature matching point pairs.
To improve the accuracy and computational efficiency of registration and to eliminate mismatched point pairs that have similar features, a fine matching step with a Euclidean distance constraint between point pairs is carried out. For any two matched pairs (\({p}_{i}\), \({q}_{i}\)) and (\({p}_{j}\), \({q}_{j}\)) in the set \({\mathrm{K}}_{1}\), the distance constraint in Eq. (10) is tested.
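The three feature distances and the resulting candidate selection can be sketched as follows. The acceptance rule is one possible reading of the matching conditions: a pair is kept when the same \(q_{j}\) both minimizes the Euclidean distance and maximizes the cosine similarity for a given \(p_{i}\), and the curvature ratio lies within a hypothetical tolerance `curv_tol` of 1. Feature vectors are assumed to be non-zero and curvatures of Q non-zero.

```python
import math


def feature_distances(v_p, v_q, g_p, g_q):
    """Return the three feature distances of Eqs. (7)-(9) for one pair."""
    euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(v_p, v_q)))
    norm_p = math.sqrt(sum(a * a for a in v_p))
    norm_q = math.sqrt(sum(b * b for b in v_q))
    cosine = sum(a * b for a, b in zip(v_p, v_q)) / (norm_p * norm_q)
    ratio = g_p / g_q
    return euclid, cosine, ratio


def match_features(feats_p, feats_q, curv_tol=0.1):
    """Build a candidate set K1 of index pairs (i, j).

    feats_p and feats_q are lists of (feature_vector, gaussian_curvature).
    """
    k1 = []
    for i, (v_p, g_p) in enumerate(feats_p):
        scores = [feature_distances(v_p, v_q, g_p, g_q)
                  for v_q, g_q in feats_q]
        if not scores:
            continue
        j_min = min(range(len(scores)), key=lambda j: scores[j][0])
        j_max = max(range(len(scores)), key=lambda j: scores[j][1])
        # Accept only if both criteria agree and the curvatures are similar.
        if j_min == j_max and abs(scores[j_min][2] - 1.0) < curv_tol:
            k1.append((i, j_min))
    return k1
```

Requiring the first two criteria to agree discards ambiguous candidates early, at the cost of losing some valid pairs; the later distance and triangle checks operate on whatever survives.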
$$\frac{\left|{\Vert {\mathrm{p}}_{i}-{\mathrm{p}}_{j}\Vert }_{2}-{\Vert {\mathrm{q}}_{i}-{\mathrm{q}}_{j}\Vert }_{2}\right|}{{\Vert {\mathrm{p}}_{i}-{\mathrm{p}}_{j}\Vert }_{2}+{\Vert {\mathrm{q}}_{i}-{\mathrm{q}}_{j}\Vert }_{2}}<\varepsilon .$$
(10)
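The distance constraint of Eq. (10) compares the separation of two matched points in P with the separation of their partners in Q. A minimal sketch is given below, reading the numerator as an absolute difference of the two distances. The voting rule in `filter_pairs`, with its hypothetical `min_support` fraction, is an assumption about how the pairwise test might be turned into a filter and is not taken from the text.

```python
import math


def distance_consistent(p_i, p_j, q_i, q_j, eps=0.05):
    """Relative distance test of Eq. (10) for two matched pairs."""
    d_p = math.dist(p_i, p_j)
    d_q = math.dist(q_i, q_j)
    if d_p + d_q == 0.0:
        return True  # both pairs coincide, so there is nothing to compare
    return abs(d_p - d_q) / (d_p + d_q) < eps


def filter_pairs(pairs, eps=0.05, min_support=0.5):
    """Keep the pairs that are consistent with enough of the other pairs.

    pairs is a list of (p, q) correspondences with 3D coordinates.
    """
    kept = []
    for a, (p_a, q_a) in enumerate(pairs):
        others = [pq for b, pq in enumerate(pairs) if b != a]
        if not others:
            continue
        votes = sum(distance_consistent(p_a, p_b, q_a, q_b, eps)
                    for p_b, q_b in others)
        if votes / len(others) >= min_support:
            kept.append((p_a, q_a))
    return kept
```

Because the test compares distances directly, it presumes that the two clouds already share approximately the same scale.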
Feature matching
Because the selected image point cloud and LiDAR point cloud features are not scale-invariant, we use geometric combinations of matched points to further screen out mismatches and ensure a high correct-matching rate for the point cloud features.
First, three point pairs, denoted (\({h}_{1}\), \({j}_{1}\)), (\({h}_{2}\), \({j}_{2}\)), and (\({h}_{3}\), \({j}_{3}\)), are randomly selected from the matching pairs of point cloud features. In the two point clouds P and Q, triangles \({T}_{P}\) and \({T}_{Q}\) are formed by \(\left\{{h}_{1}, {h}_{2},{h}_{3}\right\}\) and \(\left\{{j}_{1}, {j}_{2},{j}_{3}\right\}\), respectively, and their three sides are \(\left\{{l}_{1}^{h},{l}_{2}^{h},{l}_{3}^{h}\right\}\) and \(\left\{{l}_{1}^{j},{l}_{2}^{j},{l}_{3}^{j}\right\}\), respectively. The ratio of corresponding side lengths is calculated as shown in Eq. (11).
$${\beta }_{i}=\frac{\Vert {l}_{i}^{h}\Vert }{\Vert {l}_{i}^{j}\Vert }.$$
(11)
If the side length relation satisfies Eq. (12), the point pairs are added to the matching pair set K, where ξ < 1 is the selected threshold.
$$\xi <\frac{{\beta }_{i}^{2}}{{\beta }_{j}{\beta }_{k}}<\frac{1}{\xi },\quad \forall \left\{i,j,k\right\}=\left\{1,2,3\right\}.$$
(12)
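The triangle test of Eqs. (11) and (12) can be sketched as below. The sketch reads the middle term of Eq. (12) as \(\beta_{i}^{2}/(\beta_{j}\beta_{k})\) with i, j, k all distinct, which is an assumption about the intended formula; triangles are assumed to be non-degenerate, and the function names are hypothetical.

```python
import math


def side_lengths(a, b, c):
    """Lengths of the three sides of the triangle with vertices a, b, c."""
    return [math.dist(a, b), math.dist(b, c), math.dist(c, a)]


def triangle_consistent(tri_p, tri_q, xi=0.9):
    """Check whether two matched triangles are similar up to a common scale.

    tri_p holds the three points (h1, h2, h3) from P and tri_q the matched
    points (j1, j2, j3) from Q, in the same order; xi < 1 is the threshold.
    """
    # Side-length ratios beta_i of Eq. (11).
    beta = [lp / lq
            for lp, lq in zip(side_lengths(*tri_p), side_lengths(*tri_q))]
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        ratio = beta[i] ** 2 / (beta[j] * beta[k])
        if not (xi < ratio < 1.0 / xi):
            return False
    return True
```

When the three pairs are correct, all side ratios equal the unknown scale between the two clouds and each quotient is close to 1, so the test needs no prior knowledge of that scale.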
Global objective function optimization
In the solution process, the components of the point cloud transformation are solved through a global objective function. To suppress the influence of noise-induced point cloud mismatches on the result, a robust objective function is used, as shown in Eq. (13).
$$\upphi \left(\mathrm{s},\mathrm{R},\mathrm{t}\right)=\sum_{i=1}^{n}\uprho \left(sR{p}_{i}+t-{q}_{i}\right),\quad \uprho \left(x\right)=\frac{\mu {x}^{2}}{\mu +{x}^{2}}.$$
(13)
Here \(\uprho \left(x\right)\) is a scaled Geman-McClure function, which is more robust to noise than the mean squared error. The parameter \(\mu\) controls how strongly \(\uprho \left(x\right)\) responds to the independent variable \(x\): for large residuals the function saturates at \(\mu\), so mismatched pairs contribute almost nothing to the gradient and are effectively rejected as outliers.
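The behavior of the objective can be illustrated by evaluating it directly. The sketch below assumes that the argument of the Geman-McClure function is the Euclidean norm of the residual \(sRp_{i} + t - q_{i}\); it only evaluates the objective and does not perform the minimization over s, R, and t. The function names are hypothetical.

```python
import math


def geman_mcclure(x, mu):
    """Scaled Geman-McClure penalty rho(x) = mu * x^2 / (mu + x^2)."""
    return mu * x * x / (mu + x * x)


def objective(s, R, t, pairs, mu):
    """Robust registration objective summed over matched pairs (p, q)."""
    total = 0.0
    for p, q in pairs:
        moved = [s * sum(R[i][k] * p[k] for k in range(3)) + t[i]
                 for i in range(3)]
        residual = math.dist(moved, q)  # assumed: norm of s*R*p + t - q
        total += geman_mcclure(residual, mu)
    return total
```

With \(\mu = 1\), a correctly aligned pair contributes nearly zero, while a gross mismatch contributes at most 1 regardless of how far off it is, so a few outliers cannot dominate the sum.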