
Multi-Scale Fusion Optimization Algorithm for Printed Circuit Board Defect Detection
Kun Mao, Xuejun Zhu, Huige Lai, Checao Yu... and Da Peng
A printed circuit board (PCB) defect detection algorithm based on multi-scale fusion optimization is proposed to address the low accuracy of traditional detection algorithms, which struggle with small surface defects resembling background features. Building on YOLOv8, a Swin Transformer module is integrated at the end of the backbone network's feature fusion layer to capture global information and enhance the understanding of both detailed and overall features. A global attention mechanism is embedded in the backbone to focus on target areas and reduce background interference. The WIoU loss function replaces the original CIoU, incorporating differential weighting to improve regression performance for small targets and complex backgrounds. Comparative experiments are conducted using different algorithms on the PCB_DATASET and DeepPCB datasets. The proposed algorithm improves detection accuracy by 3.64 and 2.42 percentage points on the PCB_DATASET and DeepPCB datasets, respectively, significantly enhancing defect recognition accuracy.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1012004 (2025)
- DOI:10.3788/LOP250474
Pulse-Coupled Dual Adversarial Learning Network for Infrared and Visible Image Fusion
Jia Zhao, Yuelan Xin, Jizhao Liu, and Qingqing Wang
To address the issue of insufficient extraction and fusion of complementary information in infrared and visible image fusion, this study proposes a pulse-coupled dual adversarial learning network. The network utilizes dual discriminators that target infrared objects and visible texture details in the fused images, with the goal of preserving and enhancing modality-specific features. We also introduce a pulse-coupled neural network featuring a combined learning mechanism to effectively extract salient features and detailed information from the images. During the fusion stage, we implement a cross-modality fusion module guided by cross-attention, which further optimizes the complementary information between modalities and minimizes redundant features. We conducted comparative qualitative and quantitative analyses against nine representative fusion methods on the TNO, M3FD, and RoadScene datasets. Results show that the proposed method demonstrates superior performance in evaluation metrics, such as mutual information and sum of correlation differences. The method produces fused images with high contrast and rich detail and achieves better results in target detection tasks.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1037008 (2025)
- DOI:10.3788/LOP242143
Tightly Coupled SLAM Algorithm for Lidar-IMU Applicable to Semisolid Lidar
Fan Zhang and Wanyue Jiang
Current studies on laser simultaneous localization and mapping (SLAM) adapt poorly to the new semisolid lidar and show unsatisfactory robustness in degraded environments. To address these issues, a geometric feature extraction method is proposed in which the features are stored in voxel grids. By selecting features based on the curvature information of each voxel, the desired planar features can be extracted effectively while maintaining accuracy even with non-periodic scanning patterns. Compared with neighborhood search methods based on k-dimensional trees, neighborhood search based on voxel grids is more efficient and significantly reduces the computing time. By using a graph optimization algorithm framework, the modules can be set more flexibly and excellent global optimization results can be obtained. Experimental results on the VECtor public dataset and a self-developed dataset show that, in indoor environments, the proposed algorithm improves positioning accuracy by approximately 48% and 64% compared with FAST-LIO2 and iG-LIO, respectively, and reduces the single-frame time by 42% compared with LIO-SAM. The experimental results also show that the proposed algorithm passes all the specified test sequences, thus demonstrating its superior comprehensive performance.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1015010 (2025)
- DOI:10.3788/LOP242079
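The voxel-grid neighborhood search mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_voxel_grid` and `neighbors_within`, the hash-map layout, and the restriction that the search radius not exceed the voxel size are all assumptions made for illustration. The idea is that a query only inspects the 27 voxels around it instead of traversing a k-dimensional tree.

```python
import math
from collections import defaultdict
from itertools import product


def voxel_key(point, voxel_size):
    """Integer voxel coordinates of a 3D point."""
    return tuple(math.floor(c / voxel_size) for c in point)


def build_voxel_grid(points, voxel_size):
    """Hash every point into the voxel that contains it (hypothetical helper)."""
    grid = defaultdict(list)
    for p in points:
        grid[voxel_key(p, voxel_size)].append(p)
    return grid


def neighbors_within(grid, query, voxel_size, radius):
    """Return all stored points within `radius` of `query`.

    Assumption: radius <= voxel_size, so checking the 3x3x3 block of
    voxels around the query is guaranteed to cover the search sphere.
    """
    if radius > voxel_size:
        raise ValueError("radius must not exceed voxel_size in this sketch")
    center = voxel_key(query, voxel_size)
    found = []
    for offset in product((-1, 0, 1), repeat=3):
        key = tuple(c + o for c, o in zip(center, offset))
        # .get avoids creating empty voxels in the defaultdict
        for p in grid.get(key, ()):
            if math.dist(p, query) <= radius:
                found.append(p)
    return found
```

Building the grid is a single pass over the points, and each query touches a constant number of voxels, which is the usual reason such a structure is cheaper than a tree search for fixed-radius lookups.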
Dual-Branch Chronological Clustering Network for Bronze Inscriptions with Improved Multiscale CBAM
Jingwen Ding, Ying Lu, Huiqin Wang, Ke Wang, and Zhan Wang
Bronze inscriptions are invaluable for studying ancient politics, economy, and culture. However, minimal stylistic variations and the predominance of unlabeled data in unearthed inscriptions pose challenges for computer-aided inscription analysis. To address this issue, a bronze inscription age clustering network based on a deep unsupervised clustering model is proposed. In the first stage, a ResNet50-based feature extraction module is constructed, incorporating an improved multiscale CBAM attention mechanism. This enhancement allows the network to simultaneously capture detailed and global features, thereby overcoming the limitations of traditional feature extraction methods that struggle with incomplete feature representation for inscriptions of similar ages. In the second stage, K-means clustering is applied to the extracted features. The clustering branch results serve as pseudo-labels, which are then used to compute the cross-entropy loss against the predictions of the model's prediction branch. In the third stage, iterative training is performed using cross-entropy loss backpropagation to continuously optimize the model parameters, enhancing the accuracy of feature extraction and clustering. The experimental results demonstrate that the proposed network achieves an overall accuracy of 89.43% on the standard inscription dataset, surpassing traditional unsupervised clustering networks by more than 14%.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1037003 (2025)
- DOI:10.3788/LOP242364
High-Precision Positioning Method for High-Resolution Optical Satellite Imagery Accelerated by PCG-GPU Without GCPs
Qing Fu, Jun Chen, Weijian Liang, and Wenlang Luo
This study introduces a high-precision geometric positioning method for high-resolution optical satellite imagery that does not rely on ground control points (GCPs), utilizing preconditioned conjugate gradient (PCG) and graphics processing unit (GPU) acceleration. The core technologies used in this method include an adjustment model that is constructed based on virtual control points (VCPs), a sparse matrix storage format, and parallel block adjustment accelerated by PCG-GPU. By employing a sparse matrix storage format to reduce computer memory requirements, PCG-GPU parallel acceleration technology enhances the efficiency of block adjustment parameter processing. Experimental verification is performed using 829 Ziyuan-3 (ZY-3) satellite images from the Jiangxi area. The results show that the proposed PCG-GPU accelerated parallel block adjustment method is approximately 9.5 times more efficient than traditional serial computing methods. In addition, following block adjustment, the root mean square error (RMSE) is 0.461 pixel and 0.652 pixel in the x and y directions, respectively, which meets the stringent accuracy requirements for high-precision satellite image mapping.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1028005 (2025)
- DOI:10.3788/LOP242166
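As a rough illustration of the two ingredients named in the abstract above (sparse matrix storage and a preconditioned conjugate gradient solver), the following is a minimal CPU-only sketch. It is not the paper's method: the compressed sparse row (CSR) layout, the Jacobi (diagonal) preconditioner, and the function names `csr_matvec`, `csr_diagonal`, and `pcg_solve` are assumptions chosen for illustration, and no GPU parallelism is shown. The system matrix is assumed symmetric positive definite, as normal equations in an adjustment typically are.

```python
import math


def csr_matvec(data, indices, indptr, x):
    """Multiply a CSR-stored sparse matrix by a dense vector."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def csr_diagonal(data, indices, indptr):
    """Extract the main diagonal of a CSR-stored matrix."""
    diag = [0.0] * (len(indptr) - 1)
    for row in range(len(diag)):
        for k in range(indptr[row], indptr[row + 1]):
            if indices[k] == row:
                diag[row] = data[k]
    return diag


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg_solve(data, indices, indptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with Jacobi-preconditioned conjugate gradient.

    Assumptions: A is symmetric positive definite with a nonzero
    diagonal; the preconditioner is simply the inverse diagonal.
    """
    n = len(b)
    inv_diag = [1.0 / d for d in csr_diagonal(data, indices, indptr)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol:
            break
        Ap = csr_matvec(data, indices, indptr, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x
```

Only the nonzero entries are stored, so memory grows with the number of nonzeros rather than with the square of the unknown count. The matrix-vector product and the vector updates inside the loop are the steps that a GPU implementation would typically parallelize.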
Single Image Dehazing Method Guided by Edge Prior Information
Wan Liang, Meizhen Huang, and Guilin Xu
Convolutional neural network image dehazing methods tend to blur edges and lose edge details when processing image edge textures. To address these problems, this study proposes a single image dehazing network design method guided by multilevel edge prior information. The design method integrates an edge feature extraction block, an edge feature fusion block, and a dehazing feature extraction block. The edge feature extraction block performs rich edge feature extraction on hazy images and reconstructs the edge image. Furthermore, the edge feature fusion block efficiently fuses the edge prior information with the context information of the hazy image at multiple levels. Then, the dehazing feature extraction block performs multiscale deep feature extraction on the image and adds an attention mechanism to the important channels. A large number of experiments are conducted on the RESIDE dataset and compared with the mainstream dehazing methods, in which the peak signal-to-noise ratio and structural similarity index measurement of the indoor dataset reach 37.58 and 0.991, respectively. Additionally, the number of parameters and amount of computation are only 2.024×10^6 and 24.84×10^9, which shows that the method in this study effectively dehazes the image while reducing the number of parameters and amount of computation. Moreover, the method exhibits good performance and edge detail preservation ability.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1037010 (2025)
- DOI:10.3788/LOP242274
Lightweight Dental Image Segmentation with Quadrant Oblique Displacement
Ziyuan Yin and Yun Wu
The automatic segmentation of dental images plays a crucial role in the auxiliary diagnosis of oral diseases. To address the issues of large parameter sizes in existing segmentation models and low segmentation accuracy of medical dental images, a lightweight dental image segmentation model, namely, the quadrant oblique displacement (QOD) UNeXt, is proposed. First, QOD blocks are designed to displace features along four oblique directions, that is, the upper-left, upper-right, lower-left, and lower-right, to diffuse features and dynamically aggregate tokens, thereby enhancing segmentation accuracy. Second, a localized feature integration (LFI) module is incorporated into the decoder to improve the ability of the model to integrate detailed and global information. Finally, an efficient channel attention (ECA) module is introduced at the skip connections to further fuse local and global features. Experimental results on the STS-MICCAI 2023 and Tufts public datasets demonstrate that QOD-UNeXt significantly improves segmentation accuracy while maintaining a lightweight structure. Therefore, QOD-UNeXt exhibits excellent performance in dental medical image segmentation tasks.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1037007 (2025)
- DOI:10.3788/LOP242111
Pavement Crack Segmentation Detection Integrating Multiple Attention Mechanisms
Pengfei Gao, Liya Zhang, Yukun Wang, and Lin Zhang
Pavement cracks can affect driving safety and service life, thereby increasing the risk of traffic accidents. Therefore, detecting and managing pavement cracks in a timely manner is particularly important. To address the problems of limited receptive field, inability to add location information, and poor effectiveness of traditional convolutional neural networks, a pavement crack segmentation model that integrates multiple attention mechanisms is proposed. ResNeSt and Swin Transformer enhance the information transmission effect of the model for the network to better utilize information at different levels and generate more accurate predictions. Among public online datasets, a dataset with 8251 real road images is used for the experiment, obtaining intersection over union, precision, recall, and F1 score values of 73.24%, 82.84%, 86.12%, and 84.44%, respectively. Although the recall is slightly inferior to that of DeepLab v3+, better performance is exhibited in terms of crack recognition accuracy and robustness in complex road environments.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1012002 (2025)
- DOI:10.3788/LOP242068
Enhancing PointPillars Three-Dimensional Object Detection with Density Clustering and Dual Attention Mechanisms
Qingxin Yang, Deming Kong, Jing Chen, Xiaowei Li, and Yue Shen
To address the problem of low detection accuracy for cars and cyclists in the PointPillars three-dimensional (3D) object detection network, an improved PointPillars method based on density clustering and a dual attention mechanism is proposed. This method improves PointPillars in two key areas: 1) introducing a density clustering algorithm in the point cloud processing module to screen and filter out non-clustered points, reducing the influence of irrelevant point cloud information while preserving effective point clouds as much as possible; 2) integrating an attention mechanism in the column feature extraction module of the column feature extraction network. A self-attention mechanism is used to establish connections between points within columns, and a cross-attention mechanism is employed to strengthen connections between columns after column feature extraction, thereby expanding the receptive field and enabling better focus on crucial point cloud information while preserving directional data. The experimental results obtained on the KITTI autonomous driving dataset reveal that PointPillars++ improves the 3D car detection accuracy and the average orientation similarity (AOS). Compared to the original network, improvements in detection accuracy were 2.41, 3.48, and 4.87 percentage points, and AOS improved by 2.47, 2.06, and 0.74 percentage points across the simple, moderate, and difficult levels, respectively. For 3D cyclist detection, accuracy and AOS increased by 6.26, 1.40, and 1.64 percentage points, and by 6.13, 6.53, and 6.37 percentage points, respectively, across the same difficulty levels.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1012001 (2025)
- DOI:10.3788/LOP240732
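The first improvement described in the abstract above, filtering out non-clustered points by density clustering, can be sketched as follows. The abstract does not name the specific algorithm, so this sketch assumes a DBSCAN-style rule: a point is kept if it is a core point (at least `min_pts` points, itself included, within distance `eps`) or lies within `eps` of a core point. The function name `filter_non_clustered` and both parameters are hypothetical, and the brute-force distance computation is quadratic in the number of points, so it is only suitable for small illustrative inputs.

```python
import math


def filter_non_clustered(points, eps, min_pts):
    """Drop points that a DBSCAN-style rule would label as noise.

    Assumption: this is a stand-in for the paper's density clustering
    step, not a reproduction of it.
    """
    n = len(points)
    # Indices of all points within eps of each point (each point counts itself).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    # Keep core points and border points (those adjacent to a core point).
    keep = [
        is_core[i] or any(is_core[j] for j in neighbors[i])
        for i in range(n)
    ]
    return [p for p, k in zip(points, keep) if k]
```

Keeping border points as well as core points reflects the abstract's stated goal of preserving effective points as much as possible while discarding isolated ones.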
Multiscale Regional-Attention Stacked-Object Grasp Detection Network
Shengjun Xu, Zhiwei Cui, Ya Shi, Xiaohan Li... and Abdelhamid Hameg
To address the difficulty of recognizing object grasp points caused by overlap or occlusion between multiple objects in stacked scenes, a multiscale regional-attention stacked-object grasp detection network is proposed. First, a multiscale regional-attention feature fusion module is proposed based on the feature pyramid architecture, which improves the network's ability to pay attention to different feature dimensions by introducing deformable convolution and full convolution. Second, a multiscale region-attention mechanism is used to decouple the grabbable area from the background in the stacked scene image. Different regions of different scale feature maps are weighted gradually to improve the network's ability to pay attention to the saliency of the grabbable area and its resistance to background noise. Finally, a double sampling region candidate module is proposed to further refine the candidate anchor boxes on the basis of the target ground truth, eliminate a large number of negative samples, and thus improve the quality of the candidate anchor boxes. The final grasp detection results are output by the classification regression module. Stacked-object grasp detection accuracy experiments are carried out on the VMRD and Cornell datasets. The experimental results show that the average detection accuracy of the proposed network is 98.18% on the VMRD dataset and 98.0% on the Cornell dataset. The proposed network achieves accurate grasp detection and strong robustness in complex scenes.
- May. 25, 2025
- Laser & Optoelectronics Progress
- Vol. 62, Issue 10, 1015009 (2025)
- DOI:10.3788/LOP241866
Tunneling Modulated Multilevel Nociceptor Analogs
Acta Optica Sinica, Vol. 45, Issue 1, 0117001 (2025)
Chinese Optics Letters, Vol. 23, Issue 2, 020601 (2025)
AI-enabled universal image-spectrum fusion spectroscopy based on self-supervised plasma modeling
Advanced Photonics Nexus, Vol. 3, Issue 6, 066014 (2024)
Coupling ideality of standing-wave supermode microresonators
Photonics Research, Vol. 12, Issue 8, 1610 (2024)
Stimulation and imaging of neural cells via photonic nanojets
Photonics Research, Vol. 12, Issue 8, 1604 (2024)
Spectral programmable mid-infrared optical parametric oscillator
Photonics Research, Vol. 12, Issue 8, 1593 (2024)