Digital Image Processing
Detail-Preserving Multi-Exposure Image Fusion Based on Adaptive Weight
Ruihong Wen, Chunyu Liu, Shuai Liu, Meili Zhou, and Yuxin Zhang
Multi-exposure image fusion addresses the inability of image sensors to capture scenes with large dynamic ranges. Multiple images of the same scene with different exposure levels are fused to obtain a large-dynamic-range image that contains rich scene details. An adaptive-weight, detail-preserving multi-exposure image-fusion algorithm is proposed to address the typical issues of insufficient image-detail preservation and edge halos in fusion. Contrast and structural components from image-block decomposition are used to extract fused structural weights, and two-dimensional entropy is used to select brightness benchmarks for calculating exposure weights. Subsequently, saturation weights are used to better restore the brightness and color information of the scene in the fused image. Finally, double-pyramid fusion is used to fuse the source-image sequence at multiple scales, avoiding unnatural halos at the boundaries and yielding a large-dynamic-range fused image that preserves more details (a code sketch of this fusion step follows the entry). Seventy sets of multi-exposure images from three datasets are selected for experiments. The results show that the average fusion structural similarity and cross-entropy of the proposed algorithm are 0.983 and 2.341, respectively. Compared with classical and recent multi-exposure fusion algorithms, the proposed algorithm maintains the brightness distribution of the scene while retaining more image information, demonstrating its effectiveness. The proposed algorithm offers excellent fusion results and good visual effects.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837001 (2024)
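The sketch below illustrates generic weighted Gaussian/Laplacian ("double") pyramid fusion, the multi-scale blending step named in the abstract above. It is a minimal illustration, not the authors' implementation: the weight maps passed in, the level count, and all function names are hypothetical.

```python
import cv2
import numpy as np

def gaussian_pyramid(img, levels):
    pyr = [img.astype(np.float32)]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    gp = gaussian_pyramid(img, levels)
    lp = []
    for i in range(levels - 1):
        up = cv2.pyrUp(gp[i + 1], dstsize=(gp[i].shape[1], gp[i].shape[0]))
        lp.append(gp[i] - up)
    lp.append(gp[-1])  # smallest Gaussian level as the residual
    return lp

def pyramid_fuse(images, weights, levels=4):
    """images: list of (H, W, 3) exposures; weights: list of (H, W) maps."""
    w = np.stack(weights).astype(np.float32) + 1e-12
    w /= w.sum(axis=0, keepdims=True)          # normalize across the stack
    fused = None
    for img, wi in zip(images, w):
        lp = laplacian_pyramid(img, levels)     # detail pyramid of the image
        gp_w = gaussian_pyramid(wi, levels)     # smooth pyramid of its weight
        blended = [l * g[..., None] if l.ndim == 3 else l * g
                   for l, g in zip(lp, gp_w)]
        fused = blended if fused is None else [f + b for f, b in zip(fused, blended)]
    out = fused[-1]
    for lvl in reversed(fused[:-1]):            # collapse the fused pyramid
        out = cv2.pyrUp(out, dstsize=(lvl.shape[1], lvl.shape[0])) + lvl
    return np.clip(out, 0, 255).astype(np.uint8)
```

Blending weights at each pyramid level, rather than in the image domain, is what suppresses the halo artifacts at exposure boundaries that the abstract mentions.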
Deblurring Light Field Images Based on Local Maximum Gradient and Minimum Intensity Priors
Zongchen Zhao, Chunyu Liu, Minglin Xu, Yuxin Zhang, Shuai Liu, and Huiling Hu
Space three-dimensional (3D) reconstruction is important across various domains, including remote sensing, military, and aerospace, and light field imaging technology is widely used for it. Enhancing the quality of light field images is paramount for achieving more accurate 3D reconstructions. First, integrating light field imaging into space imaging systems and designing a model based on wave optics streamline the imaging process, thereby simulating the original light field image. Subsequently, digital refocusing algorithms enable the acquisition of light field images at different focal planes. However, errors induced by relative motion, inaccuracies in digital refocusing algorithms, and signal loss due to microlens arrays in the optical path lead to image blurring, and current image deblurring techniques cannot fulfill the stringent quality standards of light field imaging. Hence, this study introduces an algorithm to alleviate blurring in remote sensing light-field-refocused images. An energy function is constructed by leveraging the insight that image blur correlates with increased local minimum intensity values and decreased local maximum gradient values (a sketch of these two priors follows the entry). An enhanced semi-quadratic splitting method facilitates the estimation of latent images and blur kernels, thus achieving deblurring. Experimental results demonstrate the superiority of the proposed algorithm over existing image deblurring techniques for processing light-field-refocused images.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837002 (2024)
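The two priors named in the abstract above are easy to compute as dense maps; blur raises local minimum intensities and lowers local maximum gradients. The snippet below is only an illustration of those maps (patch size and names are assumed), not the paper's energy function or its half-quadratic solver.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def prior_maps(img, patch=15):
    """img: 2D float array in [0, 1]. Returns the two prior maps."""
    gy, gx = np.gradient(img)
    grad_mag = np.hypot(gx, gy)
    lmg = maximum_filter(grad_mag, size=patch)  # local maximum gradient
    lmi = minimum_filter(img, size=patch)       # local minimum intensity
    return lmg, lmi
```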
Hierarchical Matching Multi-Object Tracking Algorithm Based on Pseudo-Depth Information
Peng Hu, Shuguo Pan, Wang Gao, Ping Wang, and Peng Guo
A hierarchical matching multi-object tracking algorithm based on pseudo-depth information was proposed to address the performance limitations of traditional multi-object tracking methods that rely on intersection over union (IOU) for association under target occlusion, as well as the limitations of feature re-identification in dealing with visually similar objects. The proposed algorithm utilized a stereo geometric approach to acquire pseudo-depth information of objects in the image. Based on the magnitude of pseudo-depth, both the detection boxes and trajectories were divided into multiple distinct subsets; objects that were occluded but differed significantly in pseudo-depth were classified into different pseudo-depth levels, thereby avoiding matching conflicts (a partitioning sketch follows the entry). Subsequently, a pseudo-depth cost matrix was computed using the pseudo-depth information, and IOU pseudo-depth (IOU-D) matching was performed within each pseudo-depth level to associate occluded targets at the same level. Experimental results show that the proposed algorithm achieved higher order tracking accuracy (HOTA) scores of 65.1% and 58.5% on the MOT17 and DanceTrack test sets, respectively, improvements of 2.0% and 10.8% over the baseline model, ByteTrack. These results indicate that effectively utilizing the latent pseudo-depth information in the image can significantly enhance the tracking accuracy of occluded targets.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837003 (2024)
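A minimal sketch of the two primitives the abstract above combines: plain IOU, and partitioning detections into pseudo-depth levels before per-level matching. The quantile-based level boundaries and all names are assumptions for illustration; the paper's actual cost matrix and assignment step are not reproduced.

```python
import numpy as np

def iou(a, b):
    """Boxes as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def split_by_pseudo_depth(boxes, depths, n_levels=3):
    """Partition detections into pseudo-depth levels; matching then runs per level."""
    edges = np.quantile(depths, np.linspace(0, 1, n_levels + 1))
    levels = [[] for _ in range(n_levels)]
    for box, d in zip(boxes, depths):
        idx = min(np.searchsorted(edges, d, side="right") - 1, n_levels - 1)
        levels[max(idx, 0)].append(box)
    return levels
```

Occluded boxes that overlap heavily in the image plane land in different levels when their pseudo-depths differ, which is what prevents the matching conflicts described above.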
Intelligent Optimization Algorithm for Codeword Searching for a Coded Exposure Camera
Peipei Zhou, Jiayi Yan, Huannan Qi, Tao Sun, and Xinglin Hou
Traditional camera imaging suffers from insufficient preservation of high-frequency information, inaccurate solution of coded-exposure codewords, and difficulty in estimating blur kernels. To solve these problems, this study focuses on codeword searching for coded exposure cameras and proposes an intelligent optimized cyclic search strategy based on a memetic algorithm framework. A mutation-crossover operator is used in differential evolution to obtain a global solution; a tabu search then refines this global solution locally, iterating to obtain an optimal codeword sequence. A loss function suited to coded-exposure image restoration is designed, and an end-to-end blind-deconvolution-kernel generative adversarial network is used to compare the performance of different codeword acquisition methods for blurry image restoration. Experimental results show that the proposed intelligent optimization algorithm solves the codeword sequence more accurately and with better robustness than the other methods. When the same network is used for blurred image restoration, the proposed algorithm yields superior restoration results compared with existing methods from both subjective and objective perspectives. Thus, the proposed method has high engineering application value for enhancing motion-blur restoration.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837004 (2024)
Self-Supervised Pre-Training for Intravascular Ultrasound Image Segmentation Method Based on Diffusion Model
Wenyue Hao, Huaiyu Cai, Tingtao Zuo, Zhongwei Jia, Yi Wang, and Xiaodong Chen
To overcome the difficulty of obtaining large annotated datasets, a proxy task based on a diffusion model is introduced, allowing self-supervised learning of prior knowledge from unlabeled datasets, followed by fine-tuning on a small labeled dataset. Inspired by the diffusion model, different levels of noise are weighted with the original images as inputs to the model; by training the model to predict the input noise, a more robust pixel-level representation of intravascular ultrasound (IVUS) images is learned (a sketch of this proxy task follows the entry). Additionally, a combined loss function of mean square error (MSE) and structural similarity index (SSIM) is introduced to improve model performance. Experimental results on 20% of the dataset demonstrate that, compared with random initialization, the Jaccard coefficients of the lumen and media increase by 0.044 and 0.101, respectively, and the Hausdorff distance coefficients improve by 0.216 and 0.107, respectively, results similar to those obtained by training on 100% of the dataset. This framework applies to any structural image segmentation model and significantly reduces reliance on ground truth while ensuring segmentation effectiveness.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837005 (2024)
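A minimal sketch of the noise-prediction proxy task described above, under the assumption of a standard diffusion-style forward process; the model interface, the per-sample noise schedule `alpha_bar`, and the SSIM weighting are hypothetical and only the MSE term is shown.

```python
import torch
import torch.nn.functional as F

def noise_prediction_loss(model, x0, alpha_bar):
    """x0: IVUS images in [0, 1], shape (B, 1, H, W).
    alpha_bar: per-sample noise level in (0, 1), shape (B, 1, 1, 1)."""
    eps = torch.randn_like(x0)
    # Weighted mix of image and noise, as in the diffusion forward process.
    x_noisy = alpha_bar.sqrt() * x0 + (1 - alpha_bar).sqrt() * eps
    eps_hat = model(x_noisy)              # network predicts the injected noise
    mse = F.mse_loss(eps_hat, eps)
    # The paper combines MSE with an SSIM term; a differentiable SSIM
    # implementation would be added to this loss here.
    return mse
```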
Image Stitching Combining Enhanced Optimal Seam and Optimized Brightness
Xiangtan Yu, Yaohong Zhao, and Wei Xiang
An image stitching algorithm that combines an enhanced optimal seam and optimized brightness is proposed to address ghosting and inconsistent brightness in panoramic images arising from large parallax and exposure differences. First, a deformation model based on minimizing projection biases was used for image registration to accurately align the overlapping area. Second, an enhanced optimal-seam algorithm was implemented between two intersections in the overlapping area to avoid information loss in the panoramic image. Finally, leveraging the Poisson-fused image, an energy functional of the ideal panoramic-image gradient and a nonuniform illumination fitting model were constructed to optimize brightness and improve the brightness consistency of the panorama. Experimental results show that, compared with the algorithm proposed in reference [11], the proposed algorithm improves structural similarity by 5.58% and peak signal-to-noise ratio by 9.55% in terms of eliminating large parallax. Compared with the result before optimization, it reduces the average gradient of the illumination component by 14.90% and improves the average gradient by 12.09% in terms of eliminating exposure differences. Thus, the algorithm can be used for image stitching in scenes with large disparities and exposure differences.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837006 (2024)
Two-Branch Feature Fusion Image Dehazing Algorithm Under Brightness Constraint
Jinqing He, Xiucheng Dong, Xianming Xiang, Hongda Guo, and Yaling Ju
To solve the problem of haze degrading image quality, this paper proposes a two-branch feature-fusion image dehazing algorithm. First, a data-fitting branch in dense-residual form increases the network depth and extracts high-frequency detail features, while a knowledge-transfer branch in U-Net form provides supplementary knowledge for the limited data. A multi-scale fusion module then adaptively fuses the features of the two branches to recover high-quality dehazed images. In addition, a brightness constraint is introduced into the combined loss function to assign higher weights to dense-haze regions. Finally, both synthetic and real-world datasets are used for testing, with comparisons against existing dehazing algorithms such as FFA and GCANet. Experimental results show that the proposed algorithm dehazes both synthetic and real hazy images well; compared with the other algorithms, it increases the average peak signal-to-noise ratio on four nonhomogeneous haze datasets by 1.55 dB to 10.30 dB and the average structural similarity by 0.0312 to 0.2440.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837007 (2024)
Calculation Method of Expected Sharpness Value for Region-of-Interest Central Subregion Images Based on Standard-Deviation-Weighted Gaussian Filter Function and Multidirectional Sobel Operator
Zhiyong Zhang, Ninghui Pan, and Tingyu Zhao
A method for calculating the expected clarity value of region-of-interest (ROI) central-subregion images is proposed by segmenting an ROI into central, subcentral, and edge subregions. The ROI was segmented horizontally and vertically into an odd number of subregions, and Gaussian filtering functions weighted with different standard deviations were used to filter and denoise the different subregions: the farther a subregion lies from the ROI center, the larger the standard deviation applied to it. This preserves the clarity value of the ROI central subregion while effectively reducing the clarity value of the ROI edge subregions, providing reliable data for the subsequent calculation of the expected clarity of the ROI image. Additionally, the conventional two-dimensional 3×3 Sobel operator was extended to a four-directional 5×5 Sobel operator, yielding stronger edge responses and better clarity curves (a sketch of this metric follows the entry). The algorithm was then implemented with field programmable gate array (FPGA) high-speed image-processing technology, which significantly reduced the computation time. Experimental results show that the proposed method effectively eliminates the effect of noise on the expected clarity value of ROI images and significantly suppresses detail from the ROI edge subregions, thereby keeping the focus continuously on the ROI central subregion. Compared with software computing, the FPGA implementation offers better real-time performance, with a computing speed 130 times that of software.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837008 (2024)
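A rough sketch of a four-directional 5×5 Sobel clarity metric like the one described above. The paper's actual diagonal kernels are not given here; this version approximates the 45° and 135° responses from the axis-aligned gradients, so treat the combination as an assumption, not the authors' operator.

```python
import cv2
import numpy as np

def multidirectional_sharpness(gray):
    """gray: 2D uint8 image. Returns a scalar clarity/focus value."""
    g = gray.astype(np.float32)
    gx = cv2.Sobel(g, cv2.CV_32F, 1, 0, ksize=5)   # 0-degree response
    gy = cv2.Sobel(g, cv2.CV_32F, 0, 1, ksize=5)   # 90-degree response
    # Diagonal responses approximated from the axis-aligned gradients.
    g45 = (gx + gy) / np.sqrt(2.0)
    g135 = (gx - gy) / np.sqrt(2.0)
    return float(np.mean(gx**2 + gy**2 + g45**2 + g135**2))
```

In an autofocus loop this scalar would be evaluated per ROI subregion after the standard-deviation-weighted Gaussian denoising, and the lens position maximizing the central-subregion value would be selected.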
Lightweight Template Matching Algorithm Based on Rendering Perspective Sampling
Daizhou Wen, Xi Wang, and Mingjun Ren
As a classical computer vision perception task, pose estimation is commonly used in scenarios such as autonomous driving and robot grasping. Pose estimation based on template matching is advantageous for detecting new objects; however, current state-of-the-art template matching methods based on convolutional neural networks generally suffer from large memory consumption and slow speed. To solve these problems, this paper proposes a deep-learning-based lightweight template matching algorithm. The method, which incorporates depth-wise convolution and an attention mechanism, drastically reduces the number of model parameters and extracts more generalized image features, improving the accuracy of pose estimation for unseen and occluded objects. In addition, this paper proposes an iterative rendering-perspective sampling strategy that significantly reduces the number of templates. Experiments on open-source datasets show that the proposed lightweight model uses only 0.179% of the parameters of the commonly used template matching model while enhancing the average pose estimation accuracy by 3.834%.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837009 (2024)
Improved Two-Stage 3D Object Detection Algorithm for Roadside Scenes with Enhanced PointPillars and Transformer
Liangzi Wang, Miaohua Huang, Ruoying Liu, Chengcheng Bi, and Yongkang Hu
This study proposes a two-stage three-dimensional object detection algorithm tailored for roadside scenes, aiming to address the high missed-detection rates for long-distance vehicles and high false-detection rates for pedestrians in complex scenes encountered in point cloud object detection tasks. The algorithm improves on PointPillars and Transformer. In the first stage, the PointPillars-based backbone network incorporates the SimAM attention mechanism to capture similarity information and prioritize essential features, and standard convolutional blocks in the downsampling section are replaced with residual structures to improve network performance. In the second stage, a Transformer refines the candidate boxes generated in the first stage: the encoder encodes the original point features, while the decoder employs channel weighting to enhance channel information, thereby improving detection accuracy and mitigating false detections. The algorithm was tested on the DAIR-V2X-I roadside dataset and the KITTI vehicle-end dataset, and the results demonstrate substantial improvements in detection accuracy over other publicly available algorithms. Compared with the benchmark algorithm PointPillars, at moderate detection difficulty, accuracy in detecting cars, pedestrians, and cyclists improved by 1.9, 10.5, and 2.11 percentage points, respectively, on the DAIR-V2X-I dataset, and by 2.34, 4.73, and 8.17 percentage points, respectively, on the KITTI dataset.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837010 (2024)
Ground-Based Cloud Image Segmentation Network Based on Improved MobileNetV2
Hongkun Bu, Shuai Chang, Ye Gu, Chunyu Guo, Chengbang Song, Wei Xu, Lü Tianyu, Wei Zhao, and Shoufeng Tong
In atmospheric measurement, clouds are the most uncertain factor in atmospheric models, so accurate segmentation and recognition of cloud images are indispensable. However, the stochastic nature of clouds and atmospheric conditions challenges the precision and accuracy of cloud image segmentation. To address this issue, we propose CloudHS-Net, a network based on MobileNetV2 that incorporates a hybrid concatenation structure, dilated convolutions with a mixed dilation design, and an efficient channel attention mechanism for practical cloud image segmentation. The network's performance is thoroughly evaluated on the SWIMSEG and HHCL-Cloud datasets through comparative tests with other advanced models, providing insights into the roles of its various components. Experimental results demonstrate that the efficient channel attention and hybrid concatenation structures effectively enhance segmentation performance. Compared with current advanced ground-based cloud image segmentation networks, CloudHS-Net excels at sky cloud image segmentation, achieving an accuracy of 95.51% and a mean intersection over union (MIoU) of 89.86%. The model suppresses disturbances from the atmospheric environment, such as sunlight, and attends more strongly to clouds, enabling more precise cloud image segmentation and a more accurate capture of cloud coverage; the experimental results show that the method is feasible.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837011 (2024)
Indoor Structure-from-Motion Method Assisted by Geomagnetic Features
Zhoumeng He, Guoliang Chen, Mingcong Shu, Kaiyu Di, and Hu Liu
To address the long reconstruction times and poor coverage of indoor three-dimensional models caused by the complexity and closed nature of indoor scenes, a method for indoor structure from motion (SFM) assisted by geomagnetic features is proposed. First, ordinary smartphone sensors were used to obtain indoor images and geomagnetic data. Second, to divide the overall image set into local image sets, a clustering algorithm was applied to the geomagnetic data, and the clustering results were used as attributes of the corresponding images. Subsequently, hierarchical SFM was used to construct a sparse sub-model for each local image set, and the matching points between the sparse sub-models were determined. Finally, the RANSAC generalized Procrustes analysis (RGPA) algorithm was used to register the local reconstructions and obtain a complete model. Experimental results for indoor reconstruction on the same and different floors show that the proposed method performs well in terms of reconstruction efficiency, reconstruction coverage, and point-cloud generation rate. Compared with the hierarchical SFM method, the proposed method improves reconstruction efficiency by 37% on both datasets, and its reconstruction coverage is closer to the reconstruction target, providing a supplementary solution for reconstructing this type of indoor environment.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837012 (2024)
Hyperspectral Image Classification Based on Enhanced Dynamic-Graph-Convolutional Feature Extraction
Tie Li, Qiaoyu Gao, and Wenxu Li
Herein, a hyperspectral image classification algorithm that integrates a convolutional network and a graph neural network is proposed to address several challenges, including high spectral dimensionality, uneven data distribution, inadequate spatial-spectral feature extraction, and spectral variability. First, principal component analysis is performed to reduce the dimensionality of the hyperspectral image. Convolutional networks then extract local features, including texture and shape information, highlighting differences between objects and regions within the image. The extracted features are embedded into the superpixel domain, where dynamic graph convolution is performed via an encoder; a dynamic adjacency matrix captures long-range spatial context in the hyperspectral image. These features are combined through a decoder to classify the pixel categories. Experiments on three commonly used hyperspectral image datasets demonstrate that this method outperforms five other classification techniques in classification performance.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837013 (2024)
Low-Light Image Stitching Method Based on Improved SURF
Yu Ji, Peng Ding, Nan Liu, Zhanqiang Ru, Zhenyao Li, Suzhen Cheng, Zhengguang Wang, Jingwu Gong, Zhizhen Yin, Fei Wu, and Helun Song
Low-light image stitching merges images taken from different perspectives into a large field-of-view image under insufficient lighting. The low contrast and high noise caused by inadequate lighting compromise the robustness and quantity of feature extraction, making feature matching and image stitching challenging. In response, this study proposes a low-light image stitching method based on an improved speeded-up robust feature (SURF) algorithm. A scale space is first constructed from the integral image of the low-light images and Laplacian operations are performed, followed by edge extraction and binarization. An edges-in-shaded-region (ESR) image is then generated from the edge-extracted and binarized images to obtain scale weights, which dynamically adjust the SURF feature-extraction threshold. This resolves the mismatch between feature-point pixel thresholds and overall image brightness, enhancing the robustness of feature extraction. The obtained scale weights also serve as weighting coefficients for the multiscale Retinex algorithm, achieving better image enhancement. Binary descriptors are employed to accelerate feature description and matching. Finally, a homography matrix is calculated from the matching relationships to transform and stitch the enhanced images. Experimental results demonstrate that the proposed algorithm effectively improves the speed and performance of low-light image stitching, offering better robustness and adaptability than the traditional SURF algorithm.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837014 (2024)
Semantic Segmentation of Dual-Source Remote Sensing Images Based on Gated Attention and Multiscale Residual Fusion
Wen Guo, Hong Yang, and Chang Liu
The semantic segmentation of remote sensing images is a crucial step in geographic-object-based remote sensing image analysis. Combining remote sensing image data with elevation data effectively enhances feature complementarity, thereby improving pixel-level segmentation accuracy. This study proposes a dual-source remote sensing image semantic segmentation model, STAM-SegNet, that uses a Swin Transformer backbone to extract multiscale features and integrates an adaptive gated attention mechanism with a multiscale residual fusion strategy. The adaptive gated attention mechanism comprises gated channel attention and gated spatial attention. Gated channel attention strengthens the correlation between dual-source data features through competition/cooperation mechanisms, effectively extracting their complementary features, whereas gated spatial attention uses spatial context to dynamically filter high-level semantic features and select accurate detail features. The multiscale residual fusion strategy captures multiscale context via multiscale refinement and residual structures, emphasizing detail features such as shadows and boundaries and improving training speed. Experiments on the Vaihingen and Potsdam datasets demonstrate that the proposed model achieves average F1-scores of 89.66% and 92.75%, respectively, surpassing networks such as DeepLabV3+, UperNet, DANet, TransUNet, and Swin-UNet in segmentation accuracy.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837015 (2024)
Multiple Inspection Object Detection Algorithm Based on Hough Transform
Bibo Tian, Yunmeng Liu, and Lei Ding
To accurately detect and identify faint targets in geosynchronous orbit against a starry-sky background, a multiple-inspection object detection algorithm based on the Hough transform is proposed. This study analyzes the characteristics of space targets in geosynchronous orbit, the difficulties in their detection and identification, and the shortcomings of traditional target detection algorithms. Continuous multi-frame images are processed through denoising, threshold segmentation, centroid extraction, and star-map matching to filter out the influence of most stars. The multi-frame images are then superimposed, the Hough transform is applied, and multiple tests are conducted to extract targets accurately, which significantly improves the applicability of the Hough transform to the detection of weak space targets (a stacking-and-detection sketch follows the entry). The effectiveness of the proposed algorithm is verified through field experiments and simulation data analysis. Compared with the traditional Hough algorithm, detection accuracy increases by 62.5%, the false-alarm rate falls by 74.9%, and the running time falls by 7.2%; moreover, detection accuracy exceeds 98% and the false-alarm rate is below 2% when the signal-to-noise ratio is at least 3.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837016 (2024)
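A compact sketch of the superposition step described above: once stars have been suppressed, a slowly moving target traces a near-straight streak across max-stacked frames, which a probabilistic Hough transform can pick out. Thresholds, gap parameters, and the SNR cut are illustrative assumptions, not the paper's settings.

```python
import cv2
import numpy as np

def detect_track(frames, snr_thresh=3.0):
    """frames: list of 2D float arrays with stars already suppressed."""
    stack = np.max(np.stack(frames), axis=0)     # superimpose residual frames
    mu, sigma = stack.mean(), stack.std()
    binary = ((stack > mu + snr_thresh * sigma).astype(np.uint8)) * 255
    # A moving target forms a near-straight line in the stacked image.
    lines = cv2.HoughLinesP(binary, 1, np.pi / 180, threshold=5,
                            minLineLength=5, maxLineGap=10)
    return lines  # None when no candidate track is found
```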
Pigment Classification Method of Mural Multi-Spectral Image Based on Multi-Scale Superpixel Segmentation
Yamin Chen, Ke Wang, Zhan Wang, Huiqin Wang, Yuan Li, and Gang Zhen
In the pigment classification of mural multi-spectral images, traditional algorithms typically extract spatial features through a fixed window: the spatial relationships between different pigments are ignored, classification errors for pigments in halo areas are large, and single-scale feature extraction cannot effectively express the differences between pigment blocks. This study proposes a pigment classification method for mural multi-spectral images based on multi-scale superpixel segmentation. First, the dimensionality of the mural multi-spectral data is reduced using an adaptive band-optimization method, which effectively reduces the amount of data required for superpixel segmentation. Second, the pseudo-color image synthesized from the first three bands after band optimization is segmented under a gradient constraint, yielding segmentation results closer to the actual contours and improving pigment classification accuracy. Third, the selected sample pixels are mapped into the superpixels to enhance the spatial information and features of the image. Finally, because a single scale cannot accurately fit every pigment block, multi-scale superpixels are used to segment the false-color mural images, producing segmentation maps at different scales; mean filtering is performed within each superpixel label region, a support vector machine (SVM) classifier classifies the multi-scale superpixel segmentation images, and a fusion decision strategy based on majority voting produces the final classification result (a voting sketch follows the entry). Experimental results show that the proposed method achieves an overall accuracy of 98.84% and an average accuracy of 97.75% on the simulated mural multi-spectral image dataset, providing more accurate classification results than the control group.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1837017 (2024)
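The final fusion step above is a per-pixel majority vote across the per-scale classification maps. A minimal sketch, assuming each scale has already produced an integer label map of the same size; everything upstream (superpixels, SVM) is omitted.

```python
import numpy as np

def majority_vote(label_maps):
    """label_maps: list of (H, W) integer class maps, one per superpixel scale."""
    stack = np.stack(label_maps)                  # (S, H, W)
    n_classes = int(stack.max()) + 1
    votes = np.zeros((n_classes,) + stack.shape[1:], dtype=np.int32)
    for c in range(n_classes):
        votes[c] = (stack == c).sum(axis=0)       # count votes per class
    return votes.argmax(axis=0)                   # per-pixel fused label
```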
Imaging Systems
Deep-Learning-Based Self-Absorption Correction for Fan Beam X-Ray Fluorescence Computed Tomography
Mengying Sun, Shanghai Jiang, Xiangpeng Li, Xin Huang, Bin Tang, Xinyu Hu, Binbin Luo, Shenghui Shi, Mingfu Zhao, and Mi Zhou
In X-ray fluorescence computed tomography (XFCT) imaging, the absorption attenuation of incident and fluorescent X-rays by the sample is a critical factor restricting high-quality image reconstruction. This study proposes a deep-learning-based self-absorption correction method for XFCT that uses a U-Net-based convolutional neural network to learn the symmetric structure distribution in the original projection data and recover complete projection data from sinograms affected by self-absorption. Through numerical simulation, a fan-beam XFCT imaging system was established to obtain 20000 sets of fluorescence sinograms, which were used for network training, testing, and validation. Projection data affected by self-absorption were further validated through a simulation using the Geant4 software. The results indicate that the well-trained network can perform self-absorption correction on incomplete projection data, thereby improving the quality of the reconstructed images.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1811001 (2024)
Influence of Target Surface BRDF on Non-Line-of-Sight Imaging
Yufeng Yang, Ao Zhang, Youcheng Guo, and Wenzhuo Zhu
To investigate the impact of target surface material on the performance of non-line-of-sight imaging algorithms, different types of bidirectional reflectance distribution functions were studied, and the confocal diffusion tomography algorithm from the non-line-of-sight imaging field was used to simulate and reconstruct non-line-of-sight targets. Taking glass fiber, purple-red paint board, and cement board as examples, the impact of different object materials on imaging quality was explored in depth. The simulation results show that, for hidden objects, targets with high specular reflectance yield blurry reconstructions and targets with both high specular and high diffuse reflectance yield less ideal reconstruction quality, whereas targets with high diffuse reflectance reconstruct well and targets without specular reflectance yield clear reconstructions. Among target objects with different scattering characteristics, objects with a smaller proportion of specular reflection are reconstructed with higher quality. These results provide guidance for the selection of target surface materials in the non-line-of-sight field.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1811003 (2024)
Instrumentation, Measurement and Metrology
Pantograph Safe Trigger Target Real-Time Detection and Localization Method Based on Fused Differential Convolution
Zhanshan Yang, Ying Zhang, Hongzhi Du, Yanbiao Sun, and Jigui Zhu
To address the problems of existing target detection algorithms, a real-time target detection and localization method based on fused differential convolution is proposed. First, a backbone network with fused differential convolution is constructed to enhance feature-extraction capability. Then, a feature fusion module and a detection head with shared weights are designed to improve detection speed and accuracy. Finally, a multi-stage training strategy is formulated to further enhance accuracy. Experimental results on the pantograph detection dataset show that the proposed method achieves a detection speed of up to 149 frame/s on CPU hardware, with an overall mean average precision (mAP) of 81.20%, improvements of 57 frame/s and 6 percentage points over the FemtoDet algorithm. The proposed method meets the technical requirements for real-time, accurate trigger-positioning tasks in high-speed railway scenarios.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812001 (2024)
Multimodal Fusion Odometer Based on Deep Learning and Kalman Filter
Long Li, Yi An, Lirong Xie, Zhuo Sun, and Hongxiang Dong
Odometry is an important component of simultaneous localization and mapping (SLAM) technology. Existing odometry algorithms rely mainly on visual or laser sensors, failing to fully exploit the advantages of multimodal sensors and exhibiting insufficient robustness in feature-deprived scenarios and complex environments. To address this issue, this paper uses data from multimodal sensors, including lidar, a color camera, and an inertial measurement unit, and proposes a multimodal fusion deep network, MLVIO-Net, which works with an error-state Kalman filter (ESKF) to form a multimodal fusion odometry system (an ESKF sketch follows the entry). MLVIO-Net consists of a feature pyramid network, a multi-layer bidirectional long short-term memory (Bi-LSTM) network, a pose estimation network, and a pose optimization network, achieving close integration of multimodal data. The feature pyramid network performs hierarchical feature extraction on lidar point clouds, while the Bi-LSTM network effectively learns the temporal features of the inertial measurements. The pose estimation and optimization networks iteratively refine the predicted results. The ESKF predicts poses using the kinematic model of the inertial measurement unit and corrects them using the predictions from MLVIO-Net, improving prediction accuracy and significantly increasing the output frame rate of the odometry. Experimental results on the open KITTI dataset demonstrate that the proposed multimodal fusion odometry achieves higher accuracy and robustness than other common algorithms.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812002 (2024)
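A stripped-down ESKF skeleton showing the predict/correct split described above: high-rate IMU prediction, low-rate correction from network pose estimates. For brevity the state is only position and velocity (no attitude), and all noise values are placeholder assumptions, so this is an illustration of the filter structure rather than the paper's filter.

```python
import numpy as np

class ESKF:
    """Minimal error-state Kalman filter: IMU-driven prediction, corrected
    by external (e.g., network-predicted) position measurements."""
    def __init__(self):
        self.p = np.zeros(3); self.v = np.zeros(3)   # nominal state
        self.P = np.eye(6) * 0.01                    # error-state covariance
        self.Q = np.eye(6) * 1e-4                    # process noise (assumed)
        self.R = np.eye(3) * 1e-2                    # measurement noise (assumed)

    def predict(self, acc, dt):
        self.p += self.v * dt + 0.5 * acc * dt**2    # IMU kinematic model
        self.v += acc * dt
        F = np.eye(6); F[:3, 3:] = np.eye(3) * dt    # error-state transition
        self.P = F @ self.P @ F.T + self.Q

    def correct(self, p_meas):
        H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
        K = self.P @ H.T @ np.linalg.inv(H @ self.P @ H.T + self.R)
        dx = K @ (p_meas - self.p)                    # error-state estimate
        self.p += dx[:3]; self.v += dx[3:]            # inject into nominal state
        self.P = (np.eye(6) - K @ H) @ self.P
```

Because `predict` runs at the IMU rate while `correct` runs only when a network pose arrives, the filter naturally raises the odometry output frame rate, as the abstract notes.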
Defect Detection of Spray Printed Variable Color 2D Code Based on ResNet34-TE
Ying Li, Yao Dong, Zifen He, Hao Yuan, Fuyang Sun, and Lingxi Gong
Addressing the defect characteristics of multicolor interference and the high complexity of spray-printed variable color 2D codes, along with the insufficient accuracy and low efficiency of the detection methods currently used by printing enterprises, this paper proposes a defect classification model integrating ResNet34 and a Transformer encoder (ResNet34-TE). A color 2D code defect dataset is first constructed, and a contour shape detection method is introduced to identify the target region and mitigate background interference. ResNet34 serves as the backbone network for feature extraction; the average pooling layer is omitted, and a Transformer encoder layer captures the global information of the extracted features, emphasizing the region of interest (a model sketch follows the entry). Experimental results demonstrate that ResNet34-TE reaches an accuracy of 96.80%, with the average detection time for a single sheet reduced to 15.59 ms, a 5.3 percentage point improvement in accuracy and a 5.8% improvement in detection speed over the baseline model, outperforming classical models. On the public defect detection dataset NEU-DET, the proposed model achieves an accuracy of 98.86%, surpassing mainstream defect classification algorithms. Consequently, the proposed model exhibits superior classification effectiveness in defect recognition.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812003 (2024)
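A plausible PyTorch sketch of the architecture pattern described above: a ResNet34 backbone with its average pooling and fc layers removed, spatial features flattened into tokens, and a Transformer encoder layer over those tokens. Head dimensions, layer counts, and the mean-token pooling are assumptions; this is not the authors' released model.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34

class ResNet34TE(nn.Module):
    """Backbone feature map -> token sequence -> Transformer encoder -> classifier."""
    def __init__(self, n_classes, d_model=512, n_heads=8, n_layers=1):
        super().__init__()
        backbone = resnet34(weights=None)
        # Drop avgpool and fc to keep the (B, 512, H/32, W/32) feature map.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):
        f = self.features(x)                      # (B, 512, h, w)
        tokens = f.flatten(2).transpose(1, 2)     # (B, h*w, 512)
        tokens = self.encoder(tokens)             # global context across positions
        return self.head(tokens.mean(dim=1))      # pooled-token classification

model = ResNet34TE(n_classes=4)
logits = model(torch.randn(2, 3, 224, 224))       # hypothetical defect classes
```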
Anti-Disturbance Cross-Scene Multispectral Imaging Pigment Classification Method for Painted Cultural Relics
Ruanzhao Guo, Ke Wang, Huiqin Wang, Zhan Wang, Gang Zhen, Yuan Li, and Jiachen Li
The environment at cultural relics protection sites prevents large-area painted relics from being imaged in a single shot, necessitating multi-lens imaging to acquire complete high-spatial-resolution multispectral data. However, uneven illumination, spectral noise, and other disturbances during split-lens imaging can cause spectral-dimension offsets, reducing pigment classification accuracy. To address this issue, an anti-disturbance cross-scene multispectral imaging pigment classification method for painted cultural relics is proposed. First, the overlapping regions of adjacent sub-shot images are extracted based on scale-invariant features, and the histogram specification method is used to eliminate spectral shifts, with the mean gray value of the overlapping regions as the benchmark (a histogram-matching sketch follows the entry). Spatial-spectral features are extracted through a deep codec, which randomly generates variable spatial-spectral information to endow the model with cross-scene domain-shift properties. The model's responsiveness to key spectral channels is enhanced through a spectral channel attention mechanism, and optimizing the generator via an adversarial learning strategy further improves generalization. Experimental results on simulated and real mural painting datasets demonstrate that, using the anti-spectral-perturbation data, the algorithm achieves an average improvement of 4.13% in overall classification accuracy and 5.65% in the Kappa coefficient. In cross-scene pigment classification experiments on painted cultural relics, the overall classification accuracy improves by 4.01% and the Kappa coefficient by 3.16%.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812004 (2024)
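A small sketch of per-band histogram specification anchored on the overlap statistics, the correction step named above. The mean-shift-then-match ordering and all array names are assumptions; `skimage.exposure.match_histograms` supplies the specification itself.

```python
import numpy as np
from skimage.exposure import match_histograms

def align_spectra(reference_overlap, target_overlap, target_image):
    """Match each spectral band of a sub-shot to the reference, using the
    overlapping region's mean gray value as the benchmark."""
    ref = reference_overlap.astype(np.float64)
    tgt_ov = target_overlap.astype(np.float64)
    tgt = target_image.astype(np.float64)
    corrected = np.empty_like(tgt)
    for b in range(tgt.shape[-1]):                # iterate spectral bands
        shift = ref[..., b].mean() - tgt_ov[..., b].mean()
        corrected[..., b] = match_histograms(tgt[..., b] + shift, ref[..., b])
    return corrected
```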
Research on the Retention Time of Sweat Latent Fingerprints on Glass Using Hyperspectral Imaging Combined with Multiple Models
Pengyu Tang, and Zhen Wang
This study explores the prediction of latent sweat fingerprint retention time on glass using hyperspectral imaging combined with multiple models. Hyperspectral image data of latent sweat fingerprints on glass were collected, and Savitzky-Golay (SG) convolutional smoothing and standard normal variate transformation were applied to the original spectral data. Feature bands were selected using the successive projections algorithm, and then support vector machine (SVM), genetic algorithm back propagation (GA-BP) neural network, and partial least squares regression (PLSR) models were constructed and compared for predicting retention time in both the full and feature bands (a preprocessing-and-PLSR sketch follows the entry). The results indicate that none of the three models is applicable in the full band. In the feature band, the root mean square errors of prediction of the SVM, GA-BP neural network, and PLSR models reached 3.247 d, 3.035 d, and 3.060 d, respectively, with coefficients of determination of 0.627, 0.659, and 0.606, respectively. The relative percent deviation exceeds 1.4 for all three models, so the models can predict fingerprint retention time to a certain extent. Notably, hyperspectral imaging combined with multiple models can be used to predict the retention time of sweat latent fingerprints on glass.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812005 (2024)
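A short end-to-end sketch of the chemometrics pipeline above: SG smoothing, SNV normalization, PLSR, and the two reported metrics (RMSEP and RPD). The random data, window length, and component count are placeholders; band selection by the successive projections algorithm is omitted.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression

def preprocess(spectra):
    """spectra: (n_samples, n_bands). SG smoothing, then SNV per sample."""
    smoothed = savgol_filter(spectra, window_length=11, polyorder=2, axis=1)
    return (smoothed - smoothed.mean(axis=1, keepdims=True)) \
           / smoothed.std(axis=1, keepdims=True)

# Hypothetical shapes: X spectra, y retention time in days.
X = np.random.rand(60, 200); y = np.random.rand(60) * 30
pls = PLSRegression(n_components=8).fit(preprocess(X), y)
y_hat = pls.predict(preprocess(X)).ravel()
rmsep = np.sqrt(np.mean((y - y_hat) ** 2))   # root mean square error of prediction
rpd = y.std() / rmsep                        # relative percent deviation
```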
3D Object Detection Algorithm Based on Improved YOLOv5
Xueqing Sheng, Shaobin Li, Jinyan Qu, and Liu Liu
To address the challenge of handling large volumes of point cloud data for three-dimensional (3D) object detection and the limited effectiveness in detecting small objects, this study proposes an enhanced 3D target detection method that improves the YOLOv5 network based on the Complex-YOLO approach. To counter the lengthy processing times caused by extensive point cloud data, the approach adopts the Complex-YOLO strategy of converting the point cloud into an RGB-Map representation, which is more manageable for the YOLOv5 network (a BEV-projection sketch follows the entry). YOLOv5 is extended with an angle-prediction branch and a rotated-box regression loss function to accurately position rotated targets within the RGB-Map. The architecture is further modified for small objects by incorporating a feature fusion layer and a dedicated prediction head, heightening the network's sensitivity to smaller targets, and the convolutional block attention module (CBAM) is integrated into the network's neck to further enhance detection sensitivity. Experimental evaluations on the KITTI dataset confirm the superiority of the modified YOLOv5 over the original Complex-YOLO in mean average precision (mAP): Car mAP increases by 7.48 percentage points, Pedestrian by 12.54 percentage points, and Cyclist by 1.2 percentage points, with an overall increase of 7.08 percentage points across all categories, demonstrating the effectiveness of the algorithm.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812006 (2024)
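In the Complex-YOLO style described above, the point cloud is projected into a bird's-eye-view "RGB-Map" whose channels encode height, intensity, and point density. The sketch below follows that convention; ranges, resolution, and the density normalization are illustrative assumptions.

```python
import numpy as np

def point_cloud_to_rgb_map(points, x_range=(0, 50), y_range=(-25, 25),
                           res=0.1, z_max=3.0):
    """points: (N, 4) array of x, y, z, intensity. Returns an (H, W, 3)
    bird's-eye-view map: max height, max intensity, point density."""
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]
    rows = ((pts[:, 0] - x_range[0]) / res).astype(int)
    cols = ((pts[:, 1] - y_range[0]) / res).astype(int)
    rgb = np.zeros((h, w, 3), dtype=np.float32)
    np.maximum.at(rgb[:, :, 0], (rows, cols), pts[:, 2] / z_max)  # max height
    np.maximum.at(rgb[:, :, 1], (rows, cols), pts[:, 3])          # max intensity
    np.add.at(rgb[:, :, 2], (rows, cols), 1.0)                    # point count
    rgb[:, :, 2] = np.minimum(1.0, np.log1p(rgb[:, :, 2]) / np.log(64))
    return rgb
```

Once the cloud is in this image form, a 2D detector such as YOLOv5 can process it directly, which is what makes the conversion attractive for runtime.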
Vertical Deformation Monitoring Method Based on Monocular Vision and Near-Infrared Targets
Fang Zhao, Yongbo Yang, Yu Zou, Shaoping Liu, and Jinfeng He
To satisfy the requirements of all-weather, high-precision, online deformation monitoring of large facilities such as bridges and tunnels, a vertical deformation monitoring method based on monocular vision and near-infrared targets is proposed. Near-infrared target lights are installed at several measurement points on the tested structure. A center-positioning method based on least-squares ellipse fitting automatically tracks and determines the image coordinates of the center of each near-infrared target spot (a center-fitting sketch follows the entry). The actual vertical deformation of each measurement point is then calculated from the imaging geometry based on the true elevation angle. An indoor accuracy-verification experiment determined the maximum indication error of this method to be 0.043 mm at a distance of 8.457 m. In a deformation-monitoring test on a track bridge, the 24 h vertical deformation monitoring results were highly consistent with the subway operation schedule, and the vertical deformation remained within −4 mm to 6 mm under subway loads, indicating that this method can be applied to all-weather, high-precision, real-time, online deformation monitoring of engineering structures.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812007 (2024)
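A minimal sketch of sub-pixel spot-center extraction by least-squares ellipse fitting, the positioning step named above. Otsu thresholding and the single-spot assumption are simplifications for illustration.

```python
import cv2

def spot_center(gray):
    """gray: 2D uint8 frame containing one bright near-infrared target spot."""
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:                 # fitEllipse needs at least 5 points
        return None
    (cx, cy), _, _ = cv2.fitEllipse(largest)  # least-squares ellipse fit
    return cx, cy                             # sub-pixel center coordinates
```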
Fine-Grained Lock Cylinder Hole Recognition Based on the Progressive Fusion of Cross-Granularity Features
Kunhua Zhu, Lei Sun, Yipeng Liao, Xin Yan, and Feifei Cheng
A fine-grained image recognition method based on the progressive fusion of cross-granularity features is proposed to address the small inter-class differences of fine-grained images, the difficulty of capturing discriminative features, and low recognition accuracy. First, a random region confusion module generates images at different granularity levels for training the various stages of the ConvNeXt backbone. Second, the intermediate-layer image representations at differing granularities are enhanced using a random sample-swapping module. The model is then trained with a progressive multi-granularity training strategy and a mutual-belief channel loss function to fuse cross-granularity information collaboratively. Finally, the multi-granularity features are integrated and fused to combine the classifiers and obtain the final recognition result. Experimental results demonstrate that the recognition accuracies of this method on three public datasets are 92.8% (CUB-200-2011), 95.5% (Stanford Cars), and 94.0% (FGVC-Aircraft), better than current mainstream fine-grained image recognition methods. Recognition accuracy on the self-constructed Lock-Hole dataset reaches 97.3%, with an average recognition time of 0.016 s per image, enabling accurate lock-hole recognition and satisfying the speed requirements of emergency unlocking scenarios.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812008 (2024)
Three-dimensional Crack-Change Detection Based on Monocular Vision
Lei Liu, Yong Ding, and Denghua Li
Monocular-vision measurement cannot capture the depth information of two-dimensional (2D) images; consequently, three-dimensional (3D) coordinates cannot be measured rapidly and directly from a 2D image. Hence, this paper proposes a 3D crack-change detection method based on monocular vision. Using an equivalent crack-change model, a special target was designed; its feature points were used to solve the EPnP (efficient perspective-n-point) problem and obtain the relative pose of the camera between repeated photographs (an EPnP sketch follows the entry). The depth information was recovered using the least-squares method, and the actual 3D displacement change was obtained through a coordinate-system conversion with the iterative closest point algorithm. The absolute error was within 0.5 mm, which satisfies engineering requirements for crack monitoring.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1812009 (2024)
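Solving EPnP from known target geometry is a one-call operation in OpenCV. The sketch below uses hypothetical target dimensions, detected pixel coordinates, and intrinsics purely to show the call; the paper's target layout is not reproduced.

```python
import cv2
import numpy as np

# Hypothetical target geometry: 3D feature points (object frame, mm) and
# their detected 2D image projections (pixels).
object_pts = np.array([[0, 0, 0], [60, 0, 0], [60, 60, 0], [0, 60, 0]],
                      dtype=np.float64)
image_pts = np.array([[320, 240], [420, 238], [424, 338], [318, 342]],
                     dtype=np.float64)
K = np.array([[800, 0, 320],            # assumed camera intrinsics
              [0, 800, 240],
              [0, 0, 1]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None,
                              flags=cv2.SOLVEPNP_EPNP)
R, _ = cv2.Rodrigues(rvec)  # rotation matrix: camera pose relative to the target
```

Solving this for each repeated photograph gives the relative camera pose between visits, after which depth recovery and the ICP-based coordinate conversion can proceed as the abstract describes.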
Machine Vision
Monocular VI-SLAM Algorithm Based on Lightweight SuperPoint Network in Low-Light Environment
Xudong Zeng, Shaosheng Fan, Shangzhi Xu, and Yuting Zhou
Visual-inertial simultaneous localization and mapping (SLAM) technology improves mapping and positioning accuracy by considering the relevant visual and inertial constraints. However, in low-light environments, the quality of feature-point extraction and the tracking stability of the visual front end are poor, leading to easy tracking loss and low positioning accuracy in visual-inertial SLAM algorithms. We therefore propose GS-VINS, a monocular inertial SLAM algorithm based on the VINS-Mono framework. First, an adaptive image enhancement algorithm improves the grayscale distribution of low-light images. Then, a GN2_SuperPoint feature-point detection network is proposed and combined with a dynamic feature-point tracking module to improve the stability of optical-flow tracking. Experiments on the EuRoC dataset and in real-world scenarios show that the proposed algorithm improves localization accuracy by 26.57% over VINS-Mono and is strongly robust to lighting changes. In the comparison experiment, the feature-tracking success rate increases by 8%, and the closure error in real-world scenarios is reduced by approximately 45.73%. The proposed algorithm shows good accuracy and stability in low-light environments and provides a novel solution for visual navigation under low-light conditions, offering valuable engineering applications.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1815001 (2024)
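One common way to realize an adaptive grayscale enhancement of the kind described in the entry above is to choose a gamma from the image's mean brightness. The sketch below assumes that rule, which may differ from GS-VINS's actual scheme.

```python
import numpy as np

def adaptive_gamma(gray: np.ndarray) -> np.ndarray:
    """Brighten an 8-bit low-light grayscale image with a gamma chosen from
    its mean intensity (one common adaptive rule; illustrative only)."""
    mean = gray.mean() / 255.0
    gamma = np.log(0.5) / np.log(max(mean, 1e-3))  # pushes the mean toward 0.5
    lut = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    return lut[gray]  # apply via lookup table
```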
Two-Step Relocalization Method for Laser Point Clouds Based on Semantic Graph and Semantic Scan Context
Xiaohong Huang, Yuhui Peng, and Wei Huang
This study proposes a method that combines semantic graph matching with semantic scan context descriptors of candidate frames to address the long-term localization issues of unmanned vehicles based on simultaneous localization and mapping maps. The relocalization of point cloud scenes is achieved through a two-step process involving coarse and fine localization. First, semantic and geometric features are extracted from the point cloud, and mobile and movable objects are eliminated. A semantic graph is then constructed by fusing semantic information and topological relationships, and rapid coarse relocalization matching is realized through graph similarity calculation. Then, the relative yaw and horizontal translation between point clouds are computed through global semantic iterative closest point, providing a well-initialized alignment. Finally, a global semantic descriptor is generated through semantic scan context, and accurate relocalization is obtained by comparing descriptors to distinguish point cloud similarity. Experimental results demonstrate that the proposed method achieves 20.10%, 20.90%, and 20.47% improvements in accuracy in place recognition, occluded scenes, and perspective-change scenes, respectively, compared with the semantic graph-based place recognition method.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1815002 (2024)
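As an illustration of how a scan-context-style descriptor yields a relative yaw, the sketch below searches over circular column shifts of a ring-by-sector descriptor. This follows the generic scan context idea and is not the paper's exact semantic variant.

```python
import numpy as np

def yaw_by_column_shift(desc_a: np.ndarray, desc_b: np.ndarray) -> float:
    """Estimate relative yaw between two scan-context-style descriptors
    (rows = range rings, columns = azimuth sectors) by finding the circular
    column shift that maximizes cosine similarity. Illustrative only."""
    n_sectors = desc_a.shape[1]
    best_shift, best_score = 0, -np.inf
    for s in range(n_sectors):
        shifted = np.roll(desc_b, s, axis=1)
        # Global cosine similarity between the two descriptors at this shift.
        num = (desc_a * shifted).sum()
        den = np.linalg.norm(desc_a) * np.linalg.norm(shifted) + 1e-9
        if num / den > best_score:
            best_shift, best_score = s, num / den
    return 2 * np.pi * best_shift / n_sectors  # yaw in radians
```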
Surgical Robotic Arm Guidance System Based on Point Laser Precise Navigation
Kefu Song, Rui Tang, Feifei Guo, Zexin Shen, Huixiong Zeng, and Jun Li
Automated surgical guidance systems are increasingly important in clinical settings, driven by advancements in image detection technologies and the growing demand for surgical procedures. However, the need for real-time visual precision guidance restricts the range of applications in clinical surgery. When a visual signal guides the robotic arm for path planning, the low planning efficiency of traditional algorithms can hinder the real-time capability of the system. To address these problems, a navigation control system based on a point-laser-guided surgical robotic arm is proposed. The visual part is based on the YOLOv5 network and preprocessed using a super-resolution reconstruction algorithm; fusion feature aggregation and single-scale recognition improvement strategies are proposed to achieve rapid and accurate point-laser tracking. For motion planning, a rapidly-exploring random tree (RRT) algorithm that integrates target bias and bidirectional expansion is proposed, constraining the target point attitude using lesion point cloud information for collision pre-detection and planning decisions during path generation. The validity and feasibility of the proposed algorithm were verified through experiments, demonstrating that the optimized algorithm achieves an AP50 recognition accuracy of 97.6% and an AP75 recognition accuracy of 83.5%. Moreover, the improved RRT algorithm accurately and rapidly plans the optimal obstacle-avoidance path, achieving a 7.2-percentage-point improvement over YOLOv5 in traditional video target recognition.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1815003 (2024)
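The target-bias component of the improved RRT above can be shown in a few lines: with some probability the sampler proposes the goal configuration itself, pulling tree growth toward the target. The bias value and interfaces here are illustrative.

```python
import random

def sample_with_goal_bias(goal, bounds, bias: float = 0.1):
    """Goal-biased sampling step of RRT: with probability `bias`, return the
    goal itself so the tree is pulled toward the target; otherwise sample
    uniformly in the workspace. `bounds` is a list of (lo, hi) per dimension."""
    if random.random() < bias:
        return goal
    return tuple(random.uniform(lo, hi) for lo, hi in bounds)
```

Bidirectional expansion then grows a second tree from the goal and attempts to connect the two trees after each extension, which is what makes the combination fast in cluttered surgical workspaces.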
Camera Calibration Based on Improved Grey-Wolf Genetic Algorithm
Chunming Li, Lü Dayong, and Songling Yuan
To solve the problems of low calibration accuracy, inferior repeatability, and weak robustness in conventional camera calibration, an optimized camera calibration method based on an improved grey-wolf optimization algorithm is proposed. This method improves the population initialization, linear convergence factor, and position update strategy of the grey-wolf algorithm, and integrates a search strategy based on dimension learning together with improved selection, crossover, and mutation operators to optimize the camera calibration parameters. First, the MATLAB calibration toolbox is used to extract the corner points of the calibration board images. Based on the camera calibration principle, the correspondence between the corner point coordinates of the calibration board and the coordinates of three-dimensional points in space is established to obtain initial estimates of the camera intrinsic parameters and distortion coefficients; accordingly, the parameters to be optimized are set. Second, based on the initial estimates, an initial population for the grey-wolf genetic algorithm is generated within the optimization range. Next, an average reprojection-error equation is constructed, with the objective of minimizing this error, and the improved grey-wolf genetic optimization algorithm is used to optimize the calibration parameters. Finally, the method is experimentally compared with other optimization methods. The results show that the camera calibration method based on the improved grey-wolf genetic algorithm not only has the smallest average reprojection error but also exhibits the best repeatability and robustness.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1815004 (2024)
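For context, here is a minimal sketch of the canonical grey-wolf position update, where the fitness would be the average reprojection error over all calibration corners. The paper's genetic operators (selection, crossover, mutation) and dimension-learning search would be layered on top of this step; this sketch is not the improved algorithm itself.

```python
import numpy as np

def gwo_step(wolves, fitness, a):
    """One canonical grey-wolf update: every wolf moves toward a blend of the
    three best solutions (alpha, beta, delta). `a` decays linearly over
    iterations; `wolves` is a list of parameter vectors (numpy arrays)."""
    order = np.argsort([fitness(w) for w in wolves])
    alpha, beta, delta = wolves[order[0]], wolves[order[1]], wolves[order[2]]
    new = []
    for w in wolves:
        x = np.zeros_like(w)
        for leader in (alpha, beta, delta):
            r1, r2 = np.random.rand(*w.shape), np.random.rand(*w.shape)
            A, C = 2 * a * r1 - a, 2 * r2
            x += leader - A * np.abs(C * leader - w)  # encircle each leader
        new.append(x / 3.0)  # average of the three candidate positions
    return new
```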
Airborne Laser Point-Cloud Filtering in Complex Mountainous Terrain Utilizing Deep Global Information Fusion
Jierui Cui, Yunwei Pu, Yan Xia, and Yichen Liu
LiDAR point-cloud data acquired in areas with steep terrain and dense vegetation coverage exhibit high non-ground point ratios and uneven density distributions. Classical filtering algorithms cannot readily obtain accurate point-cloud filtering results, and in deep-learning-based point-cloud filtering, issues such as insufficient information utilization and inadequate feature extraction persist. Therefore, this study proposes a point-cloud filtering network that integrates multidimensional features and global contextual information (MGINet). It establishes a framework for multidimensional feature extraction and global information fusion to enhance the accuracy of point-cloud filtering in complex mountainous regions. MGINet first employs a local cross-feature fusion module, which combines normal vectors with spatial geometric structures to extract high-dimensional diverse features, thereby preserving the local spatial structure of the point cloud. Subsequently, a global-context aggregation module is introduced to capture global contextual information, enhancing the generality of the features through cross-coding. Finally, experimental testing on both public datasets and actual data from complex mountainous areas shows that MGINet outperforms classical algorithms in terms of point-cloud filtering accuracy.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1815005 (2024)
Multimodal LiDAR Enhancement Algorithm Based on Multiscale Features
Yikai Luo, Linyuan He, and Shiping Ma
LiDAR is widely used in vehicle environment perception tasks to scan the surrounding environment, obtain measurement data, and construct a three-dimensional (3D) point cloud. However, it cannot perceive semantic information in the environment, which limits its effectiveness in 3D object detection. Consequently, in this study, we design a multimodal-fusion LiDAR-enhancement algorithm based on multiscale features and introduce several innovations under the Transformer framework to enhance the 3D object detection performance of LiDAR in complex environments. In the encoder, multiscale semantic features extracted by a semantic-aware aggregation module are used for cross-modal feature fusion, whereas scale self-attention and proposal-guided initialization in the decoder make the prediction process more efficient. We also design a triangular loss function to improve the regression of the prediction box position, which restricts the regressed position of the prediction box between the 2D and 3D labels with triangular geometric constraints to obtain better prediction results. Experiments conducted on the nuScenes dataset demonstrate the effectiveness and robustness of the proposed model.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1815006 (2024)
Medical Optics and Biotechnology
Classification of Microscopic Hyperspectral Images of Cancerous Tissue Based on Deep Learning
Yong Zhang, Danfei Huang, Lechao Zhang, Lili Zhang, Yao Zhou, and Hongyu Tang
Based on the ideas of factorized convolutional neural networks and residual structures, a residual factorized convolutional neural network with a convolutional block attention module (CBAM-RFNet) is proposed by introducing dilated convolution and an attention mechanism. In this network, the traditional 3×3 two-dimensional convolution is decomposed into a 3×1 and a 1×3 one-dimensional convolution connected in series, which not only increases the depth of the network model but also reduces the number of parameters, yielding a lightweight network model. Experimental results on thyroid cancer images collected by a microhyperspectral imaging system show that, compared with other deep neural networks, the proposed network effectively improves the classification accuracy of microhyperspectral images, with an overall accuracy of 98.23%, an F1 value of 98.66%, and a Kappa coefficient of 0.909.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1817001 (2024)
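The factorization described above is easy to make concrete: a 3×3 convolution with C input and C output channels costs 9C² kernel weights, while a 3×1 followed by a 1×3 costs 6C², while also deepening the network. Below is a minimal PyTorch sketch of such a residual factorized block; the layer sizes and activation choice are assumptions, not the paper's exact architecture.

```python
import torch.nn as nn

class FactorizedConv(nn.Module):
    """Replace a 3x3 convolution with a 3x1 then a 1x3 convolution in series
    (9*C^2 -> 6*C^2 kernel weights), wrapped in a residual connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0)),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1)),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.conv(x) + x  # residual connection, as in the residual structure
```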
Remote Sensing and Sensors
Change Detection of Optical and Synthetic Aperture Radar Remote Sensing Images Based on a Domain Adaptive Neural Network
Qinfeng Yao, Yongxiang Ning, and Sunwen Du
To address the issues of original image feature loss and unexpected noise introduction in optical and synthetic aperture radar (SAR) remote sensing image change detection, and to improve the quality and accuracy of change detection, a domain-adaptive neural-network-based optical and SAR remote sensing image change detection method is proposed. Domain-adaptive constraints were first introduced to align the extracted heterogeneous depth features in a common depth feature space, thereby improving the performance of heterogeneous image change detection. A final change map was then generated by feeding the aligned depth features into a multiscale decoder. Experiments were conducted to assess the effectiveness of the proposed method, wherein three typical datasets and six advanced detection methods were selected for comparative analysis. Experimental results show that the average accuracy, recall, segmentation performance, and weighted-value performance of the proposed method on the three datasets are 80.81%, 84.39%, 73.67%, and 82.58%, respectively, which are better than those of the comparison methods.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1828001 (2024)
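The abstract does not name the specific domain-adaptive constraint. One common way to align two feature distributions is the maximum mean discrepancy (MMD), sketched below as a plausible stand-in rather than the paper's actual loss; the Gaussian kernel bandwidth is an assumption.

```python
import torch

def mmd_loss(f_opt: torch.Tensor, f_sar: torch.Tensor, sigma: float = 1.0):
    """Maximum mean discrepancy between optical and SAR feature batches,
    one common domain-alignment constraint (illustrative). Shapes: (N, D)."""
    def gauss(a, b):
        d = torch.cdist(a, b) ** 2            # pairwise squared distances
        return torch.exp(-d / (2 * sigma ** 2))
    return (gauss(f_opt, f_opt).mean()
            + gauss(f_sar, f_sar).mean()
            - 2 * gauss(f_opt, f_sar).mean())
```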
Road Extraction Method from Remote Sensing Images with Feature Consistency Perception
Xuyang Zhao, Feng Luo, Hui Yang, Biao Wang, Guangyao Ren, and Yongchuang Wu
Road extraction is an important topic in remote-sensing information extraction. However, when buildings and trees obstruct roads, existing road extraction methods have weak global consistency in sensing road features, resulting in fragmented extraction results. A feature enhancement and consistency perception network (FECP-Net) is proposed to address this issue. The network comprises an initial road extraction network (CRE-Net) and a feature enhancement and consistency perception (FECP) module: CRE-Net extracts the initial road information and features, while the FECP module enhances the consistency of the road features and improves the completeness of the extraction results by connecting coarse road information with road features of different scales. The proposed method was compared with other methods, namely DGRN, U-Net, and D-LinkNet, on the CHT, Massachusetts, and DeepGlobe datasets. The results on the Massachusetts dataset showed that, compared with these methods, the proposed method increased the intersection over union (IOU) by 0.45, 3.36, and 9.48 percentage points, respectively; the F1 scores by 1.26, 2.76, and 8.12 percentage points, respectively; and the recall rates by 4.60, 5.93, and 12.46 percentage points, respectively. The proposed method can extract more complete road information and alleviate fragmentation and disconnection in extraction results.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1828002 (2024)
Remote Sensing Image-Matching Network Based on Multiscale Feature Fusion and Importance Ranking Loss
Peng Chen, Beiyuan Bao, and Xu Chen
Remote sensing image matching is one of the fundamental challenges in earth observation. The complexity and diversity of surface information in remote sensing images often make image matching difficult. To overcome these difficulties, a remote sensing image-matching network based on multiscale feature fusion and importance ranking loss is proposed. The network comprises two parts: a key-point detection network and a descriptor extraction network. The key-point detection network has a multilayer convolutional structure based on feature pyramids, designed to achieve multiscale feature fusion at different network levels; multiple convolution kernels gradually expand the receptive fields at the same level, thereby fully capturing multiscale information in remote sensing images. Furthermore, CBAM is used to aggregate the response map of the key-point detection network to detect key points with significant scores. The key-point detection network is optimized using the score loss and image-block loss, and the descriptor extraction network is optimized using the descriptor loss. The score-importance ranking loss function, descriptor-importance ranking loss function, and neighbor-mask-based descriptor loss function are specially designed to ensure that the key points, descriptors, and image blocks used for matching have high repeatability and distinguishability, which improves the accuracy of remote sensing image matching. In this study, many remote sensing images were collected, and a remote sensing image-matching dataset was constructed via homography transformation. This dataset was used to experimentally verify the performance of the proposed network model. Compared with traditional image-matching methods and other end-to-end deep-learning image-matching methods, the proposed network model has considerable advantages in remote sensing image matching.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1828003 (2024)
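Constructing a matching dataset via homography transformation, as described in the entry above, typically means warping each image with a random homography and keeping the ground-truth matrix as supervision. A minimal OpenCV sketch follows; the corner-jitter magnitude is an assumption.

```python
import cv2
import numpy as np

def random_homography_pair(image: np.ndarray, jitter: float = 0.15):
    """Warp `image` with a random homography to form a training pair; `H`
    maps pixel coordinates of the original into the warped view."""
    h, w = image.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    # Perturb each corner by up to `jitter` of the image size.
    offsets = np.random.uniform(-jitter, jitter, src.shape) * np.array([w, h])
    dst = (src + offsets).astype(np.float32)
    H = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(image, H, (w, h))
    return image, warped, H
```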
Remote Sensing Object Detection Methods Based on Improved YOLOv5s
Kailun Cheng, Xiaobing Hu, Haijun Chen, and Hu Li
To solve the problems of densely arranged small targets and complex background areas in remote sensing image object detection, the YOLOv5s model is improved. The backbone network adopts a coordinate attention (CA) module with depthwise separable convolution, introducing a multidimensional attention mechanism across channel and space, mining the correlation between spatial direction and position, and improving feature extraction and long-distance dependency capture. The neck network uses a bidirectional feature pyramid network (BiFPN) structure to fully integrate deep and shallow feature information and improve feature fusion across scales. Experimental results on the remote sensing target dataset DIOR show that, compared with the original model, the mean average precision (mAP) of the improved model increases by 9.8 percentage points. The average precision (AP) of every category improves, with most categories gaining more than 5 percentage points. The precision increases by 7.2 percentage points and the recall by 10.8 percentage points, which alleviates missed and false detections and enhances the detection of dense small targets in complex backgrounds in remote sensing images.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1828004 (2024)
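For reference, here is a minimal PyTorch sketch of a coordinate attention block of the kind the backbone above adopts: average pooling along height and width separately preserves positional information that plain channel attention discards. The reduction ratio and the plain ReLU activation are assumptions.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Coordinate attention: pool along H and W separately so the channel
    weights retain positional information in each direction (sketch)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                      # (n, c, h, 1)
        pool_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (n, c, w, 1)
        y = self.act(self.conv1(torch.cat([pool_h, pool_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w)).permute(0, 1, 3, 2)   # (n, c, 1, w)
        return x * a_h * a_w  # reweight features along both directions
```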
Remote-Sensing Scene Classification Based on Memristor Convolutional Neural Network
Yibo Zhao, Yi Zhang, Chengcheng Yu, and Qing Yang
Remote-sensing images typically present multiple scene categories, significant intraclass variance, and high interclass similarity. Conventional deep networks such as convolutional neural networks (CNNs) can neither adequately represent the features of target objects nor accurately distinguish between object and background information in remote-sensing scene images. Moreover, these networks typically have large parameter sizes, resulting in low classification accuracy and inefficient training. Hence, a memristive CNN for remote-sensing scene classification is proposed. A context-aware enhanced Transformer module is introduced to fuse shared weights and context-aware weights for capturing both high- and low-frequency features. A multiscale selective kernel (SK) unit is integrated into the convolution block, and different convolution kernels are selected based on feature maps of different levels; feature information of different scales is thereby extracted to improve the model's ability to process complex scenes. Furthermore, a low-power, high-speed memristive CNN is constructed by mapping the weights onto memristor crossbar arrays, reducing the computational overhead. Experimental results on the publicly available UC Merced Land Use dataset with 21 classes and the NWPU-RESISC45 dataset with 45 classes indicate classification accuracies of 94.76% and 87.54%, respectively, improvements of 5.95 and 5.07 percentage points over the baseline models, with significantly fewer model parameters. The accuracy losses of the memristive CNN model on the two datasets are only 0.24 and 0.23 percentage points, respectively. Thus, it is a promising model for advancing edge computing.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1828006 (2024)
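Mapping trained weights onto memristor crossbar arrays is commonly done with a differential conductance pair per weight, so that both positive and negative weights fall within the device's conductance window. The sketch below shows this standard scheme with illustrative device limits, not the paper's exact mapping.

```python
import numpy as np

def weights_to_conductances(W, g_min=1e-6, g_max=1e-4):
    """Map a signed weight matrix onto a differential pair of crossbar
    conductance matrices (G_pos, G_neg); device limits here are illustrative.
    The effective analog weight is proportional to G_pos - G_neg."""
    w_abs_max = np.abs(W).max() + 1e-12
    scale = (g_max - g_min) / w_abs_max
    G_pos = np.where(W > 0, g_min + W * scale, g_min)   # positive weights
    G_neg = np.where(W < 0, g_min - W * scale, g_min)   # negative weights
    return G_pos, G_neg
```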
Target Localization Method Based on Aliased Data from Cross-Strip Anode Detectors
Yihang Zhai, and Bin Wang
When cross-strip (XS) anode single-photon detectors are used for space target detection, a substantial amount of aliased data is generated that cannot be directly utilized. To improve the speed and quality of target localization, it is crucial to improve data utilization efficiency and effectively use the aliased data. This involves modeling and analyzing the aliased data generated during detection, understanding its specific data structure, and using a centroid detection method that preserves the aliased data. In this study, we performed numerical simulations and real optical-path experiments to compare the detection results of methods that retain versus remove aliased data. In both simulation and experimental settings, the method that retained aliased data achieved lower detection errors. In the experiments, retaining aliased data reduced the detection error by an average of 11.9% compared with removing it, resulting in more accurate detection and faster, more efficient target localization.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1828007 (2024)
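The centroid step itself is simple; what the method above changes is which strip charges are kept in the sums. A minimal charge-weighted centroid over anode strips follows, with illustrative inputs.

```python
import numpy as np

def strip_centroid(charges: np.ndarray, positions: np.ndarray) -> float:
    """Charge-weighted centroid over anode strips. Keeping strips that carry
    aliased (overlapping) charge in the sums, rather than discarding the
    event, is the retention strategy described above (sketch only)."""
    charges = np.clip(charges, 0, None)  # suppress negative noise readings
    return float((charges * positions).sum() / (charges.sum() + 1e-12))
```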
Reviews
Biological Applications of Fluorescence Polarization Imaging
Ziyi Yang, Shihan Li, Zhiru Liu, Suyi Zhong, Meiqi Li, and Peng Xi
Fluorescence polarization imaging can obtain the orientation and structural features of a sample by measuring the polarization parameters of fluorescent molecules, owing to which it is widely used in biological research. Moreover, it is often combined with other fluorescence microscopy techniques to analyze the arrangement of cellular structures, the real-time dynamic orientation of biomolecules, and the organization of tissues. This approach enables research on cellular physiological processes and the effects of drugs on cells, as well as the detection of abnormal structures. In this study, we focus on the applications of fluorescence polarization imaging in biological research, including the analysis and comparison of different methods that utilize polarization information to obtain the orientation structures of different samples. Furthermore, we provide insights into future development trends in this field.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1800001 (2024)
Principle and Clinical Quantitative Analysis of Optical Coherence Tomography Angiography
Di Wang, Ting Zhang, Shanshan Liang, and Jun Zhang
Optical coherence tomography angiography (OCTA), derived from optical coherence tomography (OCT) technology, is a vascular imaging approach that leverages flowing red blood cells as an intrinsic contrast agent. It has key advantages, including high resolution, high sensitivity, and noninvasiveness. By discerning the amplitude and phase variances of OCT signals between flowing red blood cells and stationary tissue, OCTA enables 3D imaging of the retinal vascular network. Leveraging vascular density, diameter, perimeter, and complexity indices, OCTA serves as a screening tool for ocular pathologies such as diabetic retinopathy (DR) and venous occlusion (VO), exhibiting notable efficacy in clinical interventions. This article provides an overview of the evolution of OCTA, elucidates its algorithmic fundamentals, delineates clinical quantitative metrics, and briefly evaluates various OCTA methods against standard quantitative indicators.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1800002 (2024)
Review of Light Field Super-Resolution Algorithm Based on Deep Learning
Yawei Xiong, Anzhi Wang, and Kaili Zhang
The trade-off between spatial and angular resolution is one of the causes of low-resolution light field images. Light field super-resolution techniques aim to reconstruct high-resolution light field images from low-resolution ones. Deep learning-based light field super-resolution methods improve image quality by learning the mapping relationship between high- and low-resolution light field images, breaking through the limitations of traditional methods, namely high computational cost and complex operation. This paper provides a comprehensive overview of recent research progress in deep learning-based light field super-resolution. The network frameworks and typical algorithms are examined, and an experimental comparative analysis is conducted. Furthermore, the challenges faced in light field super-resolution are summarized, and future development directions are anticipated.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1800003 (2024)
Progress in Research and Application of Image Stitching Technology Based on Regional Optimization
Weidong Pan, Anhu Li, and Xingsheng Liu
Optoelectronic imaging technology is crucial for obtaining environmental information and perceiving the physical world. It is widely used in diverse fields, such as aerial remote sensing observation, agricultural resource monitoring, industrial defect detection, and biomedical diagnosis. With the continuous expansion of application scenarios, optoelectronic imaging technology often faces the challenge of balancing a large field of view with high resolution; therefore, developing accurate and efficient image stitching technology is necessary. This study elucidates the theoretical model of image stitching and classifies and introduces the basic principles and implementation methods of image stitching technology based on regional optimization. It also analyzes the advantages and limitations of existing regional-optimization-based image stitching methods in terms of method characteristics, stitching effect, and time efficiency. By elucidating the application of image stitching technology in typical fields, this paper summarizes the technical challenges and application prospects of image stitching and explores its development in adapting to complex application scenarios, integrating deep learning methods, and increasing information dimensions.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1800004 (2024)
Scattering
Speckle-Field Focusing Based on Estimation of Distribution Algorithm
Huiling Huang, Chengcheng Chang, and Jun Han
When a laser beam passes through a strongly scattering medium such as white paint or milk, multiple scattering occurs, resulting in the formation of random speckles. In this study, an estimation of distribution algorithm was proposed to focus a beam through a scattering medium via phase and amplitude modulation. The effects of the total number of modulation units and the number of iterations on the focusing effect were analyzed theoretically. Furthermore, phase modulation was performed using the estimation of distribution, genetic, particle-swarm-optimization, and continuous sequential algorithms, and the changes in the light-intensity enhancement factor at the target position were observed. The results show that, compared with the other algorithms, the estimation of distribution algorithm accelerates modulation convergence, enhances noise resilience, and offers a superior focusing effect.
Laser & Optoelectronics Progress
- Publication Date: Sep. 25, 2024
- Vol. 61, Issue 18, 1829001 (2024)
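To make the estimation of distribution loop concrete: sample candidate phase masks from a per-unit Gaussian, score them by the intensity at the target focus, refit the Gaussian to the best candidates, and repeat. The population sizes and the simple independent-Gaussian model below are assumptions; the paper's variant may differ.

```python
import numpy as np

def eda_focus(evaluate, n_units=256, pop=40, elite=10, iters=200):
    """Estimation-of-distribution search over SLM phase patterns. `evaluate`
    is assumed to return the focal-spot intensity for a phase vector in
    [0, 2*pi); all sizes are illustrative."""
    mu = np.full(n_units, np.pi)
    sigma = np.full(n_units, np.pi / 2)
    for _ in range(iters):
        # Sample a population of phase masks from the current distribution.
        phases = np.random.normal(mu, sigma, size=(pop, n_units)) % (2 * np.pi)
        scores = np.array([evaluate(p) for p in phases])
        # Keep the elites and refit the per-unit Gaussian to them.
        elites = phases[np.argsort(scores)[-elite:]]
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mu  # best phase pattern estimate
```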