
- Chinese Optics Letters
- Vol. 22, Issue 4, 041101 (2024)
Abstract
1. Introduction
Laser-induced damages (LIDs) are a bottleneck in the output capability of high-power laser facilities, such as inertial confinement fusion (ICF) facilities. Over the past decades, in response to LIDs, optical elements have been continuously improved to reduce microscopic defects and impurities, while apertures have been made larger and larger to reduce the laser power per unit area. However, these measures raise production and processing costs and make the laser system fragile and hard to maintain.
The size of LIDs is usually very small in the initial stage (
Figure 1.Examples of LIDs on an optical element. Images above the dark-field image are the corresponding LIDs captured by a microscope.
However, due to diffraction, optical aberration, and the slight defocus caused by poor focusing conditions of dark-field imaging, the imaging system suffers from blur degradation.
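A standard way to write this degradation, assumed here to be consistent with how Eq. (1) is applied later (convolution of the GT image with the measured PSFs), is the linear blur model

$$I_{\mathrm{LR}} = K * I_{\mathrm{HR}} + n,$$

where $I_{\mathrm{HR}}$ is the sharp high-resolution scene, $K$ is the PSF of the imaging system, $*$ denotes two-dimensional convolution, $n$ is additive noise, and $I_{\mathrm{LR}}$ is the captured low-resolution image.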
The degradation reduces the resolution of the image, so superresolution (SR) is required to improve it. However, the sparse and tiny LIDs, distributed over a dark background with little texture information, pose additional challenges to current image SR reconstruction methods.
Non-blind deconvolution is a suitable method[2]. However, although the point spread function (PSF) is known, it is usually band-limited (the PSF itself also has accuracy and resolution limitations), so its frequency response is zero or near zero at high frequencies. As a result, directly inverting the degraded image with the PSF can strongly amplify the high-frequency components, which manifests as ringing and amplified noise near edges. Therefore, strong regularization is needed[3,4].
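To make this concrete, a minimal NumPy sketch of such a regularized (Wiener-type) frequency-domain inversion is given below; the regularization constant and the padding convention are illustrative assumptions and not part of the method proposed in this paper.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, reg=1e-2):
    """Regularized frequency-domain inversion of a known PSF.

    Without the reg term, frequencies where the PSF response is near zero
    would be amplified enormously, producing ringing and amplified noise."""
    # Pad the PSF to the image size and move its center to the array origin
    # so that the FFT-based division does not introduce a circular shift.
    padded = np.zeros_like(blurred, dtype=float)
    kh, kw = psf.shape
    padded[:kh, :kw] = psf
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))

    H = np.fft.fft2(padded)                          # PSF frequency response
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) * G / (np.abs(H) ** 2 + reg)  # regularized inverse filter
    return np.real(np.fft.ifft2(F_hat))
```

Larger values of `reg` suppress ringing at the cost of resolution, which is exactly the trade-off that motivates learning-based deconvolution below.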
Non-blind deconvolution using deep learning[2] is superior to iterative deconvolution methods because it can tolerate greater model mismatch such as out-of-focus images, optical aberration, and PSF error[5].
In 2020, a two-channel UNet[6] model was proposed that solves for the corrected image through a variable-splitting scheme, constructing and solving multiple sub-optimization problems; it works well but is relatively complex. In 2021, the DWDN[7] method explored Wiener deconvolution in deep feature space: it first performs Wiener deconvolution and then reconstructs the image with an end-to-end network, obtaining good results. However, the method performs poorly when the image contains many saturated regions or is heavily blurred. In 2021, the Uformer network[8] was proposed, which obtained good results on image reconstruction tasks but is computationally intensive. In 2022, a GAN-based deep convolutional neural network[5] was proposed to perform deconvolution and SR on microscopy data encoded by known PSFs. The network is trained on simulated image pairs based on known light propagation physics and further uses unpaired real images for saliency checking, which reduces artifacts. However, the GAN-based method is not easy to train and is prone to training instability.
Inspired by NAFNet[9], which obtained state-of-the-art results on the GoPro[10] data set, and MIMO-UNet[11], which has an attractive feature fusion method, we propose a multichannel and multifrequency mixing deconvolution method suited to the conditions of limited resolution, dark background, little texture, sparse and small lateral size, slight defocus, and PSF error of the damage detection image. The method solves the pixel-alignment problem of high-low resolution sample pairs under such adverse conditions and achieves fine and reliable SR reconstruction results.
2. Method
2.1. Principle of the proposed method
The proposed method consists of six steps, as shown in Fig. 2. In Step 1, we use a low-resolution imaging system to capture images of a Bernoulli (0.5) random noise calibration pattern. We use the nonparametric subpixel local PSF estimation method[12] to achieve high-precision PSF estimation of the low-resolution imaging system: the tessellation angle of the calibration pattern is computed with the line segment detection (LSD) algorithm, the geometric transformation of the camera is estimated, the illumination is estimated and compensated by gray-level matching with the pattern, and the captured image is corrected for sensor nonlinearity to obtain an artifact-free PSF estimate, as shown in Fig. 3. The relative error of the PSF estimation can be strictly limited to 2%–5%. Because the PSF varies laterally across the image plane, and in order to allow our model to learn the inverse of the PSF at different lateral positions, we divide the imaging range into nine small regions and place the calibration pattern at random locations within each region, as shown in Fig. 4, obtaining nine laterally varying PSFs. Because dark-field imaging also suffers from slight defocus due to poor focusing conditions, we additionally change the distance between the optical element and the camera in 10 mm steps over the range of 215–255 mm. In total, we obtained 45 PSFs representing different distances and lateral positions.
Figure 2.Principle of the multichannel and multifrequency mixing deconvolution method.
Figure 3.Workflow of the PSF measurement algorithm[
Figure 4.Principle of the PSF measurement considering the inconsistency at different positions and slight defocus.
In Step 2, we use a high-resolution imaging system to capture an image of the optical element, and this image is taken as the ground truth (GT) image.
In Step 3, we convolve the GT image with these PSFs to simulate the degraded images produced by the low-resolution imaging system. These calculated degraded images are naturally aligned with the GT image and share the same magnification, forming a high-low resolution sample pair data set.
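A minimal sketch of this pair-generation step is given below; it assumes the GT crops and the 45 measured PSFs are already available as floating-point arrays, and the optional noise level is an illustrative assumption.

```python
import numpy as np
from scipy.signal import fftconvolve

def simulate_degraded_pairs(gt_images, psfs, noise_sigma=0.0):
    """Convolve each ground-truth crop with each measured PSF (Eq. (1))
    to build naturally aligned high-low resolution sample pairs.

    gt_images   : list of 2D float arrays (high-resolution GT crops in [0, 1])
    psfs        : list of 2D float arrays (measured PSFs)
    noise_sigma : optional additive Gaussian noise level (illustrative)
    """
    pairs = []
    for gt in gt_images:
        for psf in psfs:
            k = psf / psf.sum()                        # keep overall brightness unchanged
            lr = fftconvolve(gt, k, mode="same")       # blur with the measured PSF
            if noise_sigma > 0:
                lr = lr + np.random.normal(0.0, noise_sigma, lr.shape)
            pairs.append((np.clip(lr, 0.0, 1.0), gt))  # (low-res input, GT target)
    return pairs
```

Because each low-resolution image is computed from the GT crop itself, the pair is pixel-aligned by construction, which is the key benefit of this simulation step.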
In Step 4, an end-to-end neural network (named MMFDNet) is proposed to carry out SR reconstruction of low-resolution images. We propose using a multifrequency information mixing block to cascade the whole network in the vertical direction, allowing flexible information delivery of features from top to bottom and from bottom to top. In order to enhance the feature information representation during the training phase, the network inputs the calculated low-resolution images, extracts multichannel and multifrequency information, and then gradually reconstructs the low-, intermediate-, and high-frequency information of the SR images.
In Step 5, the reconstructed results are supervised by the GT image through the loss function and backpropagation of the residuals, so that the effects of defocus, diffraction, and noise degradation are removed from the results.
2.2. Architecture of the proposed network
The structure of the deconvolution network (MMFDNet) proposed in this article is shown in Fig. 5; it is essentially an improved UNet[8], where
Figure 5.Sketch of the MMFDNet network structure.
The skip connections in existing improved UNet networks do not fully utilize the different frequency information in the decoder's image reconstruction process. In order to apply the information extracted by the encoder at different frequencies to the reconstruction of each frequency in the decoder, this paper designs a multiscale channel feature fusion (MCFF) module, as shown in Fig. 5(d). We cascade the entire network in the vertical direction to achieve multiscale feature cross fusion, allowing information to flow from top to bottom and from bottom to top. Features from multiple scales can be fused at a specified scale, enabling different scales to participate in the computation together. Compared to the original skip connection, it adds processing modules for the fused features and has stronger representation capability. In addition, the MCFF module uses a resize operation to unify the scales to the corresponding dimensions and then concatenates the features along the channel dimension, which completely preserves all channel information and facilitates SR reconstruction.
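A simplified PyTorch sketch of this fusion idea is shown below; the channel counts, the bilinear resizing, and the 1 × 1 convolution after concatenation are illustrative assumptions rather than the exact MCFF configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    """Resize encoder features from several scales to one target size,
    concatenate them along the channel dimension (so no channel
    information is discarded), and mix them with a 1x1 convolution."""

    def __init__(self, in_channels_list, out_channels):
        super().__init__()
        self.mix = nn.Conv2d(sum(in_channels_list), out_channels, kernel_size=1)

    def forward(self, features, target_hw):
        resized = [F.interpolate(f, size=target_hw, mode="bilinear",
                                 align_corners=False) for f in features]
        fused = torch.cat(resized, dim=1)  # stack all channels from every scale
        return self.mix(fused)             # learn how the scales are combined

# Example (hypothetical channel counts): fuse three scales at the middle scale.
# fusion = MultiScaleFusion([32, 64, 128], 64)
# out = fusion([f_high, f_mid, f_low], target_hw=f_mid.shape[-2:])
```

In this sketch, features from both the top and the bottom of the U-shape can be routed to any decoder level simply by choosing the target size, which mirrors the top-to-bottom and bottom-to-top flow described above.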
In order to moderately alleviate the complexity increase caused by the above mixing, an NAF block was designed in the network, consisting of layer normalization (LayerNorm), a convolutional layer, a SimpleGate activation function layer, and a simplified channel attention (SCA) layer, as shown in Fig. 5(f). Layer normalization is carried out on different channels of each sample, ensuring stable and smooth training of the network.
The existing activation functions can be moderately simplified with gated linear units (GLUs)[9] or Gaussian error linear units (GELUs)[10]:
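$$\mathrm{GLU}(\mathbf{X}) = f(\mathbf{X}) \odot \sigma\big(g(\mathbf{X})\big), \qquad
\mathrm{GELU}(x) = x\,\Phi(x), \qquad
\mathrm{SimpleGate}(\mathbf{X}_1, \mathbf{X}_2) = \mathbf{X}_1 \odot \mathbf{X}_2,$$

where $f$ and $g$ are linear projections, $\sigma$ is a nonlinearity, $\Phi$ is the cumulative distribution function of the standard normal distribution, $\odot$ denotes element-wise multiplication, and $\mathbf{X}_1$, $\mathbf{X}_2$ are the two halves of the feature map split along the channel dimension. These are the standard definitions used in the NAFNet baseline[9]; GELU can be regarded as a special case of the GLU, and the SimpleGate drops the nonlinearity entirely.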
We introduce a moderately simplified channel attention to enable the network to adaptively adjust the weight of each channel in the feature map. Using global spatial information for feature compression, the two-dimensional map of each channel is reduced to a single real number, and the weights of the different channels are then computed through a fully connected layer. Channel attention can thus be seen as a special case of the GLU.
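For illustration, a brief PyTorch sketch of the SimpleGate and the simplified channel attention described above is given below, following the public NAFNet formulation[9]; the exact layer sizes used in MMFDNet may differ.

```python
import torch
import torch.nn as nn

class SimpleGate(nn.Module):
    """Split the feature map into two halves along the channel axis and
    multiply them element-wise (a simplified gated linear unit)."""
    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        return x1 * x2

class SimplifiedChannelAttention(nn.Module):
    """Compress each channel to a single number by global average pooling,
    map it to a per-channel weight with a 1x1 convolution, and rescale the
    feature map so that the network can reweight channels adaptively."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.weight = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        return x * self.weight(self.pool(x))
```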
3. Experimental Results and Discussion
3.1. Experiment
To validate the effectiveness of the proposed method, we need to obtain a data set consisting of high-low resolution sample data pairs. Therefore, we performed the experiment shown in Fig. 5, where we used edge illumination to illuminate the LIDs on the optical element[1].
We obtained third-party high-resolution images using a camera (MER-1810-21U3C, pixel resolution,
To measure the PSFs of the low-resolution imaging system, we used the nonparametric subpixel local PSF estimation method[12], which requires the acquisition of Bernoulli (0.5) random noise calibration pattern images. As shown in Fig. 5, we used the same camera with a low optical resolution lens (M1214-MP2 F# 1.4, focal length 12 mm), and we placed the Bernoulli calibration plate on the front surface of the actual optical element to measure the PSF of the imaging system.
To construct high-low resolution sample pairs, we cropped the acquired high-resolution images and obtained 550 high-resolution sample images with sizes of 64 × 64 pixels. Each was convolved with the 45 measured PSFs separately according to Eq. (1), yielding a data set of 550 × 45 = 24,750 high-low resolution sample pairs.
To obtain low-resolution images for testing, we used a Laplacian-of-Gaussian (LoG) focus evaluation function to find the best focus position near the 215 mm distance. We used the same low-resolution lens (whose PSF had been measured) and camera to acquire images at a lateral position aligned with the damage point.
3.2. Results and discussion
Figure 6 shows one of a set of measured PSF images. We trained both NAFNet and MMFDNet networks separately using the Loss1 function on the obtained damage data set and compared the reconstruction results. The experiments show that our proposed MMFDNet can reconstruct most of the detailed information and obtain high-resolution images with sharper edge information. It has a PSNR of 38.2017 dB (average value), which is 0.962 dB higher than that of NAFNet, and an SSIM of 0.983 (average value), which is 0.2% better than that of NAFNet. As shown in Fig. 7 and the data in Table 1, the output of MMFDNet obtained higher evaluation scores on the validation set images generated by the convolution of high-resolution images.
Figure 6.Measured PSFs at a typical position (d = 255 mm).
| | NAFNet PSNR/dB | NAFNet SSIM | MMFDNet (ours) PSNR/dB | MMFDNet (ours) SSIM |
|---|---|---|---|---|
| Input1 | 39.237 | 0.985 | 39.971 | 0.988 |
| Input2 | 39.543 | 0.983 | 40.191 | 0.984 |
| Input3 | 37.126 | 0.984 | 38.584 | 0.987 |
| Input4 | 37.261 | 0.991 | 39.370 | 0.993 |
| Input5 | 39.743 | 0.989 | 41.242 | 0.990 |
| Input6 | 35.628 | 0.981 | 37.949 | 0.985 |
| Input7 | 35.909 | 0.972 | 38.291 | 0.982 |
Table 1. Results of NAFNet and MMFDNet (ours) on Partial Samples
Figure 7.SR reconstruction results. The first column shows the low-resolution sample input images; the second column shows the high-resolution GT images; the third column shows the SR results of NAFNet; the fourth column shows the SR results of MMFDNet (ours). The data on the image are the PSNR (in dB)/SSIM values.
As shown in Eq. (2), the Loss1 function is a pixel-by-pixel comparison. It assigns the same weight to every pixel, which means it devotes the same amount of attention to each pixel in the image.
However, we are more concerned with the damaged region than with the black background. Therefore, we design a new weighted loss function for the dark-field image, which assigns a greater weight to the damaged region than to the background. As in Eq. (3), we use a threshold segmentation method to separate the damaged region from the background, and an additional weighted term, the mean absolute error over the damaged region, is added to increase the emphasis on that region, as follows:
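$$\mathrm{Loss}_{\mathrm{region}} = \mathrm{Loss}_1 + \omega \cdot \frac{1}{|\Omega|} \sum_{(x,y)\in\Omega} \left|\hat{I}(x,y) - I_{\mathrm{GT}}(x,y)\right|,$$

where $\hat{I}$ is the reconstructed image, $I_{\mathrm{GT}}$ is the GT image, $\Omega$ is the damaged region obtained by threshold segmentation, and $\omega$ is the weight controlling how strongly the damaged region is emphasized (a plausible form of Eq. (3), assuming the base term $\mathrm{Loss}_1$ is the pixel-wise comparison of Eq. (2)).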
To determine the damaged region, we use adaptive mean threshold segmentation, and the segmentation result serves as the damage attention region, as shown in Fig. 8.
Figure 8.Result of the adaptive mean threshold segmentation that is used as the damage attention region.
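A minimal PyTorch sketch of this region-weighted loss is shown below; treating Loss1 as a plain L1 loss, the box-filter local-mean threshold, and the default value of ω are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def damage_region_mask(gt, kernel_size=15, offset=0.0):
    """Adaptive mean threshold segmentation (illustrative): a pixel belongs to
    the damaged region when it is brighter than the local mean of its
    neighborhood, separating bright damage sites from the dark background."""
    local_mean = F.avg_pool2d(gt, kernel_size, stride=1, padding=kernel_size // 2)
    return (gt > local_mean + offset).float()

def loss_region(pred, gt, omega=0.5):
    """Pixel-wise L1 loss plus an extra, omega-weighted L1 term restricted
    to the segmented damaged region."""
    base = F.l1_loss(pred, gt)                  # Loss1: equal weight everywhere
    mask = damage_region_mask(gt)
    region_err = (torch.abs(pred - gt) * mask).sum() / mask.sum().clamp(min=1.0)
    return base + omega * region_err
```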
Considering the damaged area is small, if
Figure 9.Search results for the ω. The black dashed line indicates the results of training using the Loss1 function on both PSNR and SSIM values. Other lines represent PSNR and SSIM values when ω takes different values.
Table 2 shows the results of the comparison. It shows that our proposed MMFDNet and the Lossregion function each improve the reconstruction quality.
| | Loss1 | Lossregion | PSNR/dB | SSIM |
|---|---|---|---|---|
| NAFNet | ✓ | | 37.240 | 0.9809 |
| | | ✓ | 37.497 | 0.9816 |
| MMFDNet | ✓ | | 38.202 | 0.9833 |
| | | ✓ | | |
Table 2. Performance of the Networks and Loss Functions on the Data Set
Figure 10 shows the results for images acquired with the low-resolution lens; our method again achieves better reconstructions.
Figure 10.Comparison of the SR results on low-resolution images acquired by actual cameras.
4. Conclusion
In summary, we propose a multichannel and multifrequency mixing deconvolution method suited to the conditions of limited resolution, dark background, little texture, sparse and small lateral size, slight defocus, and PSF error of damage detection images, and it achieves better PSNR and SSIM values. The experimental results prove the effectiveness of the method.
References
[6] Y. Nan, H. Ji. Deep learning for handling kernel/model uncertainty in image deconvolution. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition(2020).
[8] Z. Wang, X. Cun, J. Bao et al. Uformer: a general U-shaped transformer for image restoration. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition(2022).
[9] L. Chen, X. Chu, X. Zhang et al. Simple baselines for image restoration. Computer Vision–ECCV 2022(2022).
[10] S. Nah, T. H. Kim, K. M. Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 257(2017).
[11] S.-J. Cho, S. W. Ji, J. P. Hong et al. Rethinking coarse-to-fine approach in single image deblurring. Proceedings of the IEEE/CVF International Conference on Computer Vision(2021).
