1Hefei National Research Center for Physical Sciences at the Microscale and School of Physical Sciences, University of Science and Technology of China, Hefei 230026, China
2Shanghai Research Center for Quantum Science and CAS Center for Excellence in Quantum Information and Quantum Physics, University of Science and Technology of China, Shanghai 201315, China
3School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
4Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
Xin-Wei Kong, Wen-Long Ye, Wenwen Li, Zheng-Ping Li, Feihu Xu, "High-resolution single-photon LiDAR without range ambiguity using hybrid-mode imaging [Invited]," Chin. Opt. Lett. 22, 060005 (2024)
Abstract
We proposed a hybrid imaging scheme to estimate a high-resolution absolute depth map from low photon counts. It leverages measurements of photon arrival times from a single-photon LiDAR and an intensity image from a conventional high-resolution camera. Using a tailored fusion algorithm, we jointly processed the raw measurements from both sensors and output a high-resolution absolute depth map. We scaled up the resolution by a factor of 10, achieving 1300 × 2611 pixels, and extended the unambiguous range by a factor of ∼4.7. These results demonstrate the capability of long-range, high-resolution 3D imaging without range ambiguity.
Single-photon light detection and ranging (LiDAR) offers high sensitivity and high temporal precision and has been widely applied in fields such as topographic mapping[1-3], remote sensing[4], target identification[5,6], and underwater imaging[7]. To meet these application demands, long-range and high-resolution single-photon three-dimensional (3D) imaging has emerged as a significant trend in the development of single-photon LiDAR techniques[8,9]. However, it remains challenging to directly achieve rapid and accurate 3D imaging over a wide field-of-view (FoV) and a large depth-of-view (DoV).
Array-based single-photon LiDAR can be used to achieve high-resolution 3D imaging[10]. However, it requires a high-power laser to flood-illuminate the scene, and currently available detector arrays have limited size or poor time-tagging performance[11]. Therefore, widely used single-photon LiDAR systems are typically based on raster scanning[12,13], but high-density scanning inevitably leads to long imaging times. To mitigate this issue, data fusion techniques have been proposed that merge visible or infrared high-resolution images with single-photon LiDAR data to improve imaging resolution[14-16].
Generally, single-photon LiDAR employs the time-correlated single-photon counting (TCSPC) technique. However, when the target is far away, a photon time of flight (ToF) that exceeds the laser emission period is folded back into a single period, resulting in range ambiguity[17] and making large-DoV imaging difficult. Several approaches have been proposed to mitigate the range ambiguity. A pseudo-random pattern matching scheme[18-21] can identify the exact flight time through correlation between the transmitted and received patterns. Meanwhile, the multi-repetition-rate scheme has been demonstrated to increase the maximum unambiguous distance beyond 100 kilometers[22] and to achieve large-DoV imaging[23]. Nonetheless, a comprehensive solution that achieves wide FoV and large DoV simultaneously is still lacking.
Here, we proposed and demonstrated a fusion method that simultaneously tackles the range-ambiguity and low-resolution bottlenecks of single-photon LiDAR. In hardware, we integrated a multi-repetition-rate single-photon LiDAR with a high-resolution intensity camera. On the software side, we developed a tailored fusion algorithm for recovering absolute distance and enhancing image resolution in the low-photon-count regime. We experimentally validated the ability to reconstruct high-resolution absolute depth images: we scaled up the image resolution by a factor of 10, achieving 1300 × 2611 pixels, and extended the unambiguous range by ∼4.7 times. Consequently, our method holistically achieves long-range, high-resolution 3D imaging of expansive scenes with high depth accuracy over a wide FoV and a large DoV.
2. Approach
In single-photon imaging, the system illuminates the target's $i$th pixel with a periodic laser pulse train and measures the backscattered photons. By recording the time interval between the arrival of the echo signal and the most recent pulse emission, the depth and reflectivity of the target's $i$th pixel can be estimated. However, when the target is far away, a photon ToF that exceeds the laser emission period $T_r$ is folded back into a single period, resulting in a Poisson-process rate function as follows:
$$\lambda_i(t) = \eta \alpha_i s\left(t - \tilde{t}_i\right) + b, \qquad \tilde{t}_i = \frac{2 d_i}{c} \bmod T_r, \tag{1}$$
where $\eta$ is the detector's photon-detection efficiency, $\alpha_i$ and $d_i$ are the reflectivity and distance of the $i$th pixel, $s(\cdot)$ is the pulse waveform, $b$ represents the average rate of background-light plus dark-count detections, and $c$ is the speed of light. The parameter $\tilde{t}_i$ represents the photon ToF being folded.
After $N$ pulsed-illumination trials, the likelihood function for the set of time intervals $\{t_k\}$ is
$$\mathcal{L}\left(d_i; \{t_k\}\right) = \prod_{k=1}^{K_i} \lambda_i\left(t_k\right), \tag{2}$$
where $t_k \in [0, T_r)$, and $K_i$ is the total number of photons detected at the $i$th pixel. Generally, the target distance can be estimated by applying maximum likelihood estimation (MLE):
$$\hat{d}_i = \arg\max_{d_i} \sum_{k=1}^{K_i} \log \lambda_i\left(t_k\right). \tag{3}$$
Because the rate function, and hence the likelihood, is a periodic function of $d_i$ with period $c T_r / 2$, Eq. (3) has multiple optimal solutions, which prevents a straightforward calculation of the actual distance to the target and causes range aliasing.
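To make the ambiguity concrete, the following minimal sketch (our illustration, not the authors' code; the Gaussian pulse shape and all parameter values are assumptions) evaluates the log-likelihood of Eq. (3) for one simulated pixel and shows that candidate distances separated by $cT_r/2$ are indistinguishable:

```python
# Illustrative sketch of per-pixel MLE with a folded ToF (Eqs. (1)-(3)).
# All parameters are hypothetical, chosen only to expose the ambiguity.
import numpy as np

C = 3e8        # speed of light (m/s)
T_REP = 1e-6   # repetition period (s); unambiguous range c*T_REP/2 = 150 m
SIGMA = 1e-9   # assumed RMS pulse width (s)
B = 1e3        # background + dark-count rate (counts/s)

def rate(t, d, eta=0.3, alpha=1.0):
    """Poisson rate lambda_i(t) of Eq. (1) for candidate distance d."""
    t_fold = (2.0 * d / C) % T_REP                 # folded signal ToF
    pulse = np.exp(-0.5 * ((t - t_fold) / SIGMA) ** 2) / (SIGMA * np.sqrt(2 * np.pi))
    return eta * alpha * pulse + B

def log_likelihood(d, detections):
    """Objective of Eq. (3): sum of log-rates over detected timestamps."""
    return np.sum(np.log(rate(detections, d)))

# Simulate 50 signal photons from a target at 400 m (ToF folds into [0, T_REP)).
rng = np.random.default_rng(0)
detections = rng.normal((2 * 400.0 / C) % T_REP, SIGMA, size=50)

for d in (120.0, 250.0, 400.0, 550.0):
    print(f"d = {d:5.0f} m -> logL = {log_likelihood(d, detections):.1f}")
# 250, 400, and 550 m (separated by 150 m) yield identical log-likelihoods,
# while 120 m does not: the per-pixel MLE cannot resolve the true range.
```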
To overcome this range ambiguity, we use a data acquisition scheme in which adjacent pixels are probed with different laser pulse repetition periods, together with a data fusion method that exploits the images captured by the camera. The data acquisition scheme has been extensively detailed in a previous paper[23]. Here, we focus on the use of high-resolution images for absolute distance reconstruction and for upsampling of the single-photon LiDAR data. The schematic of the algorithm is illustrated in Fig. 1, and the algorithm can be divided into two steps.
Figure 1.Schematic diagram of the algorithm. (a) Single-photon LiDAR data acquired by laser source with multiple repetition rates. (b) Image captured by camera. (c) Intensity image of (b). (d) Absolute distance image. (e) Horizontal, vertical, and diagonal gradient images from the camera image. (f) High-resolution depth image without range ambiguity.
2.1. Resolving range ambiguity guided by the intensity image
Upon acquiring the measurements via the multi-repetition-rate scheme, integrating the data from adjacent pixels within the neighborhood $\mathcal{N}_i$ through cluster algorithms[20] enables the determination of the absolute distance:
$$\hat{D}_i = \arg\max_{D} \sum_{j \in \mathcal{N}_i} w_{i,j}\, \mathcal{L}_j(D), \tag{4}$$
where the weighting factor $w_{i,j}$ is used to avoid errors in distance calculation at the edges of objects. Similar to the previous paper[23], we leverage spatial and reflectivity information to evaluate the weighting factor for neighboring pixels. However, because the reflectivity map of the single-photon LiDAR is susceptible to Poisson noise at low photon counts, we use conventional high-resolution camera images to evaluate the reflectivity information of the single-photon LiDAR pixels. Owing to the pixel-number discrepancy between the conventional camera and the single-photon LiDAR, the reflectivity value $I_i$ of a single-photon LiDAR pixel is the weighted average of several conventional camera pixels, i.e., a many-to-one pixel mapping:
$$I_i = \frac{\sum_m G\left(\left\|q_m - p_i\right\|\right) Y_m}{\sum_m G\left(\left\|q_m - p_i\right\|\right)}, \tag{5}$$
where $q_m$ and $Y_m$ correspond to the positions and intensities of the conventional camera pixels, respectively, and $G(\cdot)$ is a Gaussian weight. Therefore, the weighting factor is defined as $w_{i,j} = f\left(\left\|p_i - p_j\right\|\right)\, g\left(\left|I_i - I_j\right|\right)$. Here, $f$ and $g$ are the spatial and reflectivity kernels, respectively, both of Gaussian form.
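The following sketch shows one plausible implementation of the many-to-one reflectivity mapping of Eq. (5) and the bilateral weighting factor $w_{i,j}$; the kernel widths (sigma_map, sigma_s, sigma_r) are hypothetical choices, not values from the paper:

```python
# Sketch of Eq. (5) and the weighting factor w_ij; kernel widths are assumed.
import numpy as np

def lidar_reflectivity(cam_pos, cam_intensity, lidar_pos, sigma_map=1.0):
    """Many-to-one mapping: Gaussian-weighted average of camera pixels (Eq. (5)).

    cam_pos: (M, 2) camera-pixel positions; cam_intensity: (M,) intensities;
    lidar_pos: (2,) position of one LiDAR pixel in camera coordinates.
    """
    G = np.exp(-0.5 * np.sum((cam_pos - lidar_pos) ** 2, axis=1) / sigma_map**2)
    return np.sum(G * cam_intensity) / np.sum(G)

def weight(p_i, p_j, I_i, I_j, sigma_s=2.0, sigma_r=0.1):
    """w_ij = (Gaussian spatial kernel f) * (Gaussian reflectivity kernel g)."""
    f = np.exp(-0.5 * np.sum((np.asarray(p_i) - np.asarray(p_j)) ** 2) / sigma_s**2)
    g = np.exp(-0.5 * (I_i - I_j) ** 2 / sigma_r**2)
    return f * g
```

The reflectivity kernel g downweights neighbors whose camera-derived intensity differs strongly, which is what suppresses distance errors at object edges.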
Since the above process of solving Eq. (4) requires integrating the echo signals from surrounding pixels, it often results in an overly smoothed image, consequently reducing the imaging resolution and degrading image quality. A convex optimization algorithm is therefore employed to further enhance the accuracy of the image reconstruction. The folded photon ToF for the $i$th pixel can be determined as $\tilde{t}_i = (2 D_i / c) \bmod T_i$, where $T_i$ is the repetition period used at that pixel. Then, taking advantage of the spatial correlations in natural scenes, we select total variation (TV) as the penalization term. Thus, the absolute depth map is derived as follows:
$$\hat{\mathbf{D}} = \arg\min_{\mathbf{D}} \; -\sum_i \sum_{k=1}^{K_i} \log \lambda_i\left(t_k; D_i\right) + \tau \left\|\mathbf{D}\right\|_{\mathrm{TV}}. \tag{6}$$
The above equation constitutes a convex optimization problem and can be solved using convex optimization algorithms[24] to obtain the final estimated distance value of the target.
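As a rough illustration of this step, the sketch below minimizes a quadratic data-fidelity term around the clustered estimate plus a smoothed TV surrogate by plain gradient descent; this is a simplified stand-in for the solver of Ref. [24], not the authors' implementation:

```python
# Simplified TV-regularized refinement (smoothed TV + gradient descent).
import numpy as np

def tv_refine(D0, tau=0.5, eps=1e-3, lr=0.1, n_iter=300):
    """Refine an initial absolute depth map D0 with a TV penalty."""
    D = D0.astype(float).copy()
    for _ in range(n_iter):
        gx = np.diff(D, axis=1, append=D[:, -1:])   # forward horizontal gradient
        gy = np.diff(D, axis=0, append=D[-1:, :])   # forward vertical gradient
        mag = np.sqrt(gx**2 + gy**2 + eps**2)       # smoothed gradient magnitude
        px, py = gx / mag, gy / mag
        # Divergence of the normalized gradient field (periodic boundaries
        # via np.roll are a simplification); its negative is grad of TV term.
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        D -= lr * ((D - D0) - tau * div)
    return D
```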
2.2. Intensity-image guided upsampling
Furthermore, to improve the resolution of single-photon imaging, we can take advantage of the high resolution offered by conventional camera images to guide the upsampling of single-photon images. In our framework, $\mathbf{D}_h$ is designated as the high-resolution single-photon depth map we aim to obtain. Correspondingly, the already acquired absolute depth map $\mathbf{D}_l$ represents a downsampled mapping of $\mathbf{D}_h$, and this downsampling satisfies the following relation:
$$\mathbf{D}_l = S\left(\mathbf{D}_h\right) + \epsilon, \tag{7}$$
where $S(\cdot)$ is the downsampling function that performs pixel-weighted summation using Gaussian weights, and $\epsilon$ represents the noise. Assuming the noise follows a Gaussian distribution, its likelihood function can be expressed as follows:
$$p\left(\mathbf{D}_l \mid \mathbf{D}_h\right) \propto \exp\left(-\frac{\left\|\mathbf{D}_l - S\left(\mathbf{D}_h\right)\right\|_2^2}{2 \sigma^2}\right). \tag{8}$$
Thus, by applying MLE, we can obtain the high-resolution single-photon image:
$$\hat{\mathbf{D}}_h = \arg\min_{\mathbf{D}_h} \left\|\mathbf{D}_l - S\left(\mathbf{D}_h\right)\right\|_2^2 + \mathrm{TGV}\left(\mathbf{D}_h\right). \tag{9}$$
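A minimal sketch of one plausible form of the downsampling operator $S$ in Eqs. (7)-(9), assuming Gaussian blurring followed by decimation by an integer factor r (the paper does not specify the exact implementation):

```python
# Assumed form of the downsampling operator S: Gaussian-weighted summation
# followed by decimation. The blur width sigma is an illustrative choice.
import numpy as np
from scipy.ndimage import gaussian_filter

def downsample(D_h, r=10, sigma=None):
    """Map a high-resolution depth map D_h to its low-resolution counterpart."""
    sigma = r / 2.0 if sigma is None else sigma
    return gaussian_filter(D_h, sigma)[::r, ::r]
```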
Here, we employ a second-order total generalized variation (TGV) regularization as the penalty term to constrain the image, which is represented as
$$\mathrm{TGV}\left(\mathbf{D}_h\right) = \min_{\mathbf{v}} \; \alpha_1 \int \left|T^{1/2}\left(\nabla \mathbf{D}_h - \mathbf{v}\right)\right| \mathrm{d}x + \alpha_0 \int \left|\nabla \mathbf{v}\right| \mathrm{d}x, \tag{10}$$
where $T$ is the anisotropic diffusion tensor, $\mathbf{v}$ is an auxiliary variable, and the scalars $\alpha_0$ and $\alpha_1$ are non-negative weight coefficients. The TGV allows for sharper edge preservation while suppressing noise. Since the problem is convex but nonsmooth due to the TGV regularization term, a primal-dual optimization algorithm is used for solving it[14].
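The anisotropic diffusion tensor $T$ can be built from the camera image following Ferstl et al.[14], attenuating the penalty on depth gradients across image edges while preserving it along them; the sketch below uses their tensor form with illustrative values of the parameters beta and gamma:

```python
# Anisotropic diffusion tensor T = exp(-beta*|grad I|^gamma) * n n^T + t t^T,
# per Ferstl et al. [14]; beta and gamma values here are illustrative.
import numpy as np

def diffusion_tensor(I, beta=9.0, gamma=0.85):
    """Return the per-pixel 2x2 symmetric tensor entries (T11, T12, T22)."""
    gx = np.diff(I, axis=1, append=I[:, -1:])
    gy = np.diff(I, axis=0, append=I[-1:, :])
    mag = np.sqrt(gx**2 + gy**2) + 1e-8
    nx, ny = gx / mag, gy / mag        # unit normal across the image edge
    tx, ty = -ny, nx                   # unit tangent along the image edge
    w = np.exp(-beta * mag**gamma)     # attenuation across strong edges
    T11 = w * nx * nx + tx * tx
    T12 = w * nx * ny + tx * ty
    T22 = w * ny * ny + ty * ty
    return T11, T12, T22
```

Across an edge (direction n) the tensor scales gradients by w < 1, so depth discontinuities aligned with camera-image edges are penalized less, yielding the sharp edge preservation noted above.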
3. Simulations
We conducted simulation experiments using the Middlebury 2007 dataset[25] to validate the effectiveness of our proposed method in reconstructing high-resolution absolute distance images. The single-photon measurements were simulated at a reduced pixel resolution relative to the ground truth. Considering the depth span of only 6 m in the simulation scenario, we downscaled the imaging system's laser periods by a factor of 100, selecting periods of 10 ns, 14.3 ns, 15.9 ns, 16.1 ns, and 17.1 ns; the longest period (17.1 ns) corresponds to a single-period maximum unambiguous range of 2.565 m. As shown in Fig. 2, we reconstructed the depth map using our method and compared the results with two state-of-the-art methods.
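For reference, the single-period unambiguous ranges $cT/2$ of the five simulated periods can be verified with a few lines (our own arithmetic, included for clarity):

```python
# Single-period unambiguous ranges for the simulated repetition periods.
C = 3e8  # speed of light (m/s)
for T_ns in (10.0, 14.3, 15.9, 16.1, 17.1):
    print(f"T = {T_ns:4.1f} ns -> c*T/2 = {C * T_ns * 1e-9 / 2:.3f} m")
# The longest period, 17.1 ns, gives the 2.565 m figure quoted above.
```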
Figure 2. Simulation results. (a) Ground truth. (b) High-resolution camera image. (c) The simulation results obtained by different methods under various PPP and SBR levels. From top to bottom, the rows correspond to PPP ∼1 with SBR ∼0.1, PPP ∼10 with SBR ∼0.01, and PPP ∼10 with SBR ∼0.1, respectively. From left to right, the columns show the results reconstructed by Snyder et al., by Dai et al., and by the proposed method without and with upsampling, respectively.
Figure 2(c) demonstrates that conventional algorithms[26] struggle to accurately estimate the front-to-back position of a target because of range ambiguity. Dai et al.[23] achieved absolute distance recovery; however, their method leaves residual noise in the depth maps. Our proposed method reconstructs absolute distance by combining conventional camera images with single-photon LiDAR data, reducing the impact of Poisson noise and thereby achieving higher reconstruction accuracy. Compared with Dai et al.'s method, it shows a lower root mean square error (RMSE), demonstrating superior absolute distance reconstruction even with low photon counts and a low signal-to-background ratio (SBR). In addition, we used the conventional camera images for upsampling, which enriches target details and remarkably improves image resolution; compared with the results before upsampling, the upsampled reconstruction has a lower RMSE.
Comparing our method with Dai et al.'s method in terms of the RMSE under the same conditions, we find that reconstructions relying purely on LiDAR data tend to contain noisy pixels, especially in low-PPP and low-SBR scenarios. With the intensity-guided upsampling, our algorithm suppresses these artifacts. As shown in Fig. 3, our method outperforms Dai et al.'s method in terms of the RMSE: the RMSE of our results first decreases and then stabilizes as the SBR or PPP increases, and remains the lowest among the compared methods.
Figure 3.The RMSE in simulations with different PPP and SBR levels. (a) For PPP ∼1 with SBR ∼0.01, 0.05, and 0.1, the RMSE results are calculated by the methods of Dai et al., proposed with and without upsampling. (b) For SBR ∼0.1 with PPP ∼0.5, 1, 5, and 10, the RMSE results are calculated by the methods of Dai et al., proposed without and with upsampling.
4. Experiment
The schematic of our long-range, high-resolution single-photon imaging system is shown in Fig. 4. We use a digital full-frame camera whose objective lens has a focal length of 400 mm. A raster-scanning single-photon LiDAR using a laser source with multiple repetition rates provides the raw depth data; the scanning interval is set to 100 µrad. The single-photon LiDAR adopts a coaxial design, allowing for highly efficient detection over a wide span of distances. To eliminate local noise in this coaxial system, we temporally separate laser emission from detection and employ two acousto-optic modulators (AOMs) for noise suppression. The system uses a 1550 nm fiber pulsed laser whose period is adjustable through an external trigger and is typically set between 1 and 2 µs; the maximum emitted laser power is 250 mW. The receiver includes a home-made InGaAs/InP single-photon avalanche diode (SPAD) detector with a detection efficiency of 30% and a dark-count rate of 1.2 kcps (cps, counts per second), and a home-made field-programmable gate array (FPGA) board provides precise timing control. Moreover, we use the pixel signals output from the micro-electromechanical system (MEMS) mirror to discern different pixel information and implement a scanning method in which each pixel is illuminated at a specific repetition rate, with different rates employed for adjacent pixels.
Figure 4.The layout of the system. (a) Conventional high-resolution camera. (b) Single-photon LiDAR. (c) Data processing system.
As shown in Fig. 5(a), we imaged residential buildings located 0.4 to 1.6 kilometers away. The experiment was conducted under five laser pulse periods (1 µs, 1.43 µs, 1.59 µs, 1.61 µs, and 1.71 µs), with a per-pixel acquisition time of 330 µs. We collected a low-resolution single-photon image at a low average PPP. Guided by the intensity information from the camera, we obtained the absolute depth estimate shown in Fig. 5(d). Furthermore, using the extracted contour information of the same image, we generated a depth map with 10-fold higher resolution (1300 × 2611 pixels) while maintaining high depth accuracy, as illustrated in Fig. 5(e). Comparing Figs. 5(f) and 5(g), our method reveals finer building detail after upsampling, and the comparison between Figs. 5(h) and 5(i) shows its superiority in capturing detailed 3D surfaces in complex urban environments. These results demonstrate the robustness and accuracy of our method in practical applications.
Figure 5.The experimental results. (a) The target’s location on the map. (b) Photograph of our system. (c) High-resolution camera image of target. (d), (e) The results using our proposed method without and with upsampling. (f), (g) Closeup views of the building details in depth reconstructions [area highlighted by green rectangle in (c)]. (h), (i) 3D profiles of the eaves details in depth reconstructions [highlighted by blue rectangle in (c)].
5. Conclusion
We proposed and validated a fusion method for long-range 3D imaging that overcomes the challenges of range ambiguity and low resolution. The outdoor experiments extended the unambiguous range by ∼4.7 times and produced images with over 3 megapixels (1300 × 2611 pixels), a 10-fold increase in resolution. By providing accurate depth perception and fine spatial awareness, these results may offer enhanced methods for rapid, high-resolution, long-range 3D imaging of large-scale scenes, which is essential for target identification and environmental mapping in complex areas.
[14] D. Ferstl, C. Reinbacher, R. Ranftl, et al., "Image guided depth upsampling using anisotropic total generalized variation," in Proceedings of the IEEE International Conference on Computer Vision, 993 (2013).