
Colored dissolved organic matter (CDOM) plays a pivotal role in the global carbon cycle and climate change. The rapid development of satellite remote sensing technology has provided a vast amount of ocean surface remote sensing data for oceanographic research, and these data reflect the internal state of the ocean to a certain extent. We combine multi-source ocean remote sensing data with deep learning techniques to propose a remote sensing inversion method for subsurface CDOM in the ocean. This method inverts the vertical distribution of subsurface CDOM from ocean surface remote sensing data, thus providing a new perspective and theoretical support for a deeper understanding of the mechanisms of the ocean carbon cycle and its interactions with climate change.
Firstly, the CDOM profile data obtained from BGC-Argo is preprocessed to address the uncertain vertical resolution. By conducting linear interpolation, the data is standardized to an interval of 1 m, ensuring consistency in depth between data points for subsequent analysis. Additionally, a low-pass filter is adopted to reduce peak fluctuations in the data, enhancing its smoothness and reliability. To address the missing ocean remote sensing data, we employ the inverse distance weighting (IDW) interpolation method, effectively filling in missing values in remote sensing images. The K-fold cross-validation method is utilized to evaluate the interpolation model, with the mean absolute percentage error (MAPE) selected as the evaluation metric. Given the spatial resolution mismatch between sea surface temperature (SST) data and remote sensing reflectance data, the bilinear interpolation algorithm is employed to reconstruct the resolution of the SST dataset, enhancing its resolution and ensuring spatio-temporal consistency of the model input data. Finally, based on the convolutional neural network (CNN) model, we design a subsurface CDOM inversion model for the ocean, adopting multi-band remote sensing reflectance, SST, and other parameters as inputs. This model consists of an input module, a CNN feature extraction module, and a prediction module, enabling the vertical distribution prediction of subsurface CDOM concentration in the ocean. As a result, the model’s applicability is evaluated via a test set and two independent test areas.
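The gap-filling step described above can be illustrated with a minimal sketch of IDW interpolation on a gridded image, together with the MAPE metric used to score it. This is not the authors' implementation: the function names `idw_fill` and `mape`, the power parameter of 2, and the choice of the 8 nearest valid pixels are all assumptions made for illustration.

```python
import numpy as np

def idw_fill(grid, power=2.0, k=8):
    """Fill NaN cells of a 2D grid by inverse distance weighting.

    Each missing cell is estimated from its k nearest valid cells
    (k and power are illustrative choices, not values from the paper).
    """
    grid = np.asarray(grid, dtype=float)
    filled = grid.copy()
    valid = ~np.isnan(grid)
    rows, cols = np.nonzero(valid)      # coordinates of valid cells (row-major)
    values = grid[valid]                # matching values, same order
    if values.size == 0:
        return filled                   # nothing to interpolate from
    k = min(k, values.size)
    for r, c in zip(*np.nonzero(~valid)):
        dist2 = (rows - r) ** 2 + (cols - c) ** 2
        nearest = np.argpartition(dist2, k - 1)[:k]
        weights = 1.0 / dist2[nearest] ** (power / 2.0)
        filled[r, c] = np.sum(weights * values[nearest]) / np.sum(weights)
    return filled

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))
```

A K-fold check in this spirit would hide one fold of the known pixels, reconstruct them with `idw_fill`, and score the reconstruction against the hidden values with `mape`.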
The filtered profile data of CDOM of the ocean exhibits smoother and more stable characteristics, effectively eliminating the interference of outliers on the overall data trend (Fig. 3). To achieve spatio-temporal consistency between BGC-Argo data and remote sensing reflectance data, we employ the IDW method to interpolate missing values in remote sensing reflectance images and validate the spatial interpolation model through K-fold cross-validation. By taking the Rrs443 remote sensing data from the first day of each month in 2020 as an example, the initial distribution of remote sensing data is shown in Fig. 4, while the reconstructed remote sensing data after IDW spatial interpolation is presented in Fig. 5. During cross-validation, the K value is set to 5, with the MAPE employed as the evaluation criterion. The results indicate that the overall error of the interpolation model remains below 30%, demonstrating the sound performance of the interpolation model. The proposed inversion model achieves a root mean square error (RMSE) of 0.14 μg/L, a correlation coefficient (r) of 0.73, and a coefficient of determination (R2) of 0.74 in the test set. Furthermore, in the validation of two independent test areas, the RMSE values are 0.13 μg/L and 0.18 μg/L respectively, with r values of 0.81 and 0.74, and R2 values of 0.79 and 0.69 respectively. By analyzing the vertical distribution plots of predicted and actual values for independent test zones A and B (Figs. 8 and 9), combined with the residual scatter plot between predicted and actual values (Fig. 10), it is evident that the predicted values are mostly concentrated around the y=x diagonal with the actual values. This result demonstrates a high degree of consistency between the model’s predictions and the measured CDOM distribution characteristics, thereby confirming the validity and applicability of the proposed model. 
The correlation between the distribution of CDOM and SST is explored via the subsurface CDOM-SST scatter plot (Fig. 11), which further validates the rationality of the inversion results.
We leverage multi-band ocean remote sensing spectral data (B1: Rrs412; B2: Rrs443; B3: Rrs490; B4: Rrs510; B5: Rrs560; B6: Rrs665), SST remote sensing data, and BGC-Argo data, combined with a CNN model, to develop an inversion model for the vertical distribution of marine subsurface CDOM in the Northwest Pacific region (131°E–180°E, 26°N–54°N). We evaluate the model on a test set, which confirms its sound performance. Additionally, to further verify the model’s applicability, we predict the vertical distribution of CDOM in two independent test areas, which reveals a high degree of consistency between the predicted and measured CDOM distribution characteristics, thereby proving the model’s effectiveness in presenting the vertical distribution characteristics of marine subsurface CDOM. Meanwhile, the vertical distribution characteristics of subsurface CDOM in the Northwest Pacific region are analyzed using the constructed vertical distribution maps of CDOM in the independent test areas. Notably, the mass concentrations in spring and summer are significantly higher than those in autumn and winter, and CDOM mass concentrations gradually increase with depth. As a crucial component of the oceanic carbon cycle, CDOM significantly influences this cycle through its distribution and variation. We not only uncover these key features of the vertical distribution of marine subsurface CDOM but also provide a solid theoretical foundation and support for its inversion, facilitating a deeper understanding and prediction of the dynamic changes in the oceanic carbon cycle. However, our study has certain limitations. For instance, the IDW remote sensing data reconstruction method based on spatial correlation can be further optimized by incorporating factors such as time series to enhance the model’s ability to capture dynamic temporal changes. 
Additionally, considerations can be given to adjusting the model structure, increasing network depth, and exploring the inclusion of additional remote sensing parameters such as sea surface elevation and wind speed to delve deeper into the complex relationship between ocean remote sensing data and the vertical distribution of marine subsurface CDOM and improve prediction accuracy.
- Publication Date: Mar. 26, 2025
- Vol. 45, Issue 12, 1201001 (2025)
Urban areas contribute approximately 70% of global anthropogenic carbon dioxide (CO2) emissions, making them a key area in carbon monitoring efforts. The “top-down” approach, which uses measured atmospheric CO2 concentrations, allows for near real-time emission estimates on a global urban scale and serves as a crucial tool for verifying urban emission reductions. Currently, the prior estimates of urban CO2 fluxes in top-down assessments rely on data from open-source data inventory for anthropogenic CO2 (ODIAC) and vegetation photosynthesis and respiration model (VPRM). However, these prior fluxes possess high spatial uncertainty, resulting in significant bias in urban emission estimates and failing to meet the sub-kilometer resolution required for urban grids. In our study, we construct a high-resolution spatial and temporal dataset for urban CO2 fluxes by integrating multi-source data. We also evaluate the effect of this spatial optimization using column-averaged dry-air mole fraction of CO2 (XCO2) data from the orbiting carbon observatory-3 (OCO-3) satellite. The results indicate that using the optimized CO2 fluxes enables more accurate simulations of local CO2 concentration variations, achieving a closer match with observations. Our high-resolution urban CO2 flux dataset can contribute to reducing uncertainty in CO2 flux estimates and provide more accurate prior values for “top-down” urban emission estimates.
For CO2 fluxes, there are significant spatial dependencies. Anthropogenic emissions mainly come from fixed sources such as power plants, transportation networks, and industrial zones, while biogenic fluxes are concentrated in vegetation-covered areas like forests, croplands, and grasslands. To represent these spatial patterns, we use land cover types as proxies for CO2 fluxes. For anthropogenic CO2 emissions, we utilize datasets such as the global power plant database, OpenStreetMap, and the essential urban land use categories (EULUC), which offer detailed representations of emissions from power plants, industry, residential areas, and transportation networks. For biogenic CO2 fluxes, we select the WorldCover land cover dataset to distinguish key land cover types, including forests, croplands, and grasslands. The construction of CO2 flux grids involves specific methodologies for anthropogenic and biogenic fluxes. For anthropogenic emissions, we utilize sector-specific, grid-based emission data from the multi-resolution emission inventory for China (MEIC) and process spatial proxy data grid by grid to accurately allocate total emissions across geographic regions. For biogenic fluxes, we estimate flux factors for various vegetation types and integrate them with land use data to calculate precise flux values for each vegetation category. To validate the CO2 flux datasets, we adopt an indirect evaluation approach. We assess the accuracy of the constructed datasets by comparing observed and simulated CO2 concentrations. Simulations are carried out using the stochastic time-inverted Lagrangian transport (STILT) model, and the outputs are validated against XCO2 observations from the OCO-3 satellite. This approach provides a robust evaluation of the spatial representation of CO2 fluxes and their alignment with observed atmospheric CO2 distributions.
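The grid-by-grid allocation of inventory totals described above can be sketched as a proportional split of one coarse-cell total over the fine cells it contains. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `allocate_by_proxy`, the single-proxy weighting, and the uniform fallback for cells without proxy information are all hypothetical.

```python
import numpy as np

def allocate_by_proxy(coarse_total, proxy):
    """Split one coarse-cell emission total over fine cells.

    proxy holds a non-negative weight per fine cell (for example road
    length or industrial land area); the split conserves coarse_total.
    """
    proxy = np.asarray(proxy, dtype=float)
    weight_sum = proxy.sum()
    if weight_sum == 0:
        # Assumption: with no proxy information, spread the total uniformly.
        return np.full(proxy.shape, coarse_total / proxy.size)
    return coarse_total * proxy / weight_sum

# Example: a coarse cell emitting 100 units, refined onto a 2x2 block.
fine = allocate_by_proxy(100.0, [[1.0, 3.0], [0.0, 0.0]])
```

Because the weights are normalized within each coarse cell, the fine grid always sums back to the inventory total, so the refinement changes only where emissions are placed, not how much is emitted.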
In our study, we take Hefei as a case study to develop a high-resolution urban CO2 flux grid with a spatial resolution of 0.002°×0.002° (Figs. 3 and 4). The constructed grid data effectively captures the detailed distribution characteristics of CO2 sources and sinks, which are not well represented in previous datasets. We compare the spatial patterns of the improved CO2 emissions with those from the MEIC and ODIAC datasets (Figs. 5 and 6). Additionally, we analyze the changes in biogenic CO2 fluxes before and after optimization using remote sensing imagery with a spatial resolution finer than 1 m (Fig. 8). To evaluate the effectiveness of the CO2 flux optimization, we employ the X-STILT model to simulate XCO2 concentrations based on both pre- and post-optimization CO2 flux data. These simulations are then validated against XCO2 observations from the OCO-3 satellite. The validation utilizes OCO-3 data from three observations: June 16, 2022 (Fig. 9), June 4, 2021 (Fig. 11), and October 11, 2020 (Fig. 12).
In the present study, we develop a high-resolution grid of urban anthropogenic and biogenic CO2 fluxes by integrating effective information from multi-source datasets with varying formats, spatial resolutions, and temporal coverage. We validate and evaluate the spatial optimization of CO2 fluxes using observational data from OCO-3. The analysis highlights pronounced local spatial heterogeneity in urban anthropogenic CO2 emissions. Strong point sources, such as power plants, and weaker sources, such as residential areas, lead to significantly different variations in local CO2 concentrations. Coarse-resolution emission data tend to average these differences in simulations, making it difficult to capture localized CO2 peaks. Compared to ODIAC data, the spatially optimized emission data substantially refine the urban CO2 emission distribution, transforming it from a “Gaussian-like” pattern to a “multi-centered” distribution. For biogenic CO2 fluxes, the optimized data successfully identify small-scale urban green spaces, enabling a more precise simulation of vegetation’s influence on local CO2 concentration dynamics. Using the WRF-XSTILT model, we compare simulations of XCO2 concentrations before and after optimization against OCO-3 observations. The results show significant improvements in all three validation cases: correlation coefficients increase from 0.26 to 0.46, from 0.62 to 0.73, and from 0.50 to 0.60, respectively, while biases decrease from 1.36×10⁻⁶ to 1.24×10⁻⁶, from 0.87×10⁻⁶ to 0.80×10⁻⁶, and from 0.80×10⁻⁶ to 0.73×10⁻⁶. These findings underscore the enhanced capability of the optimized data to accurately represent the spatial distribution of CO2 fluxes.
- Publication Date: Apr. 18, 2025
- Vol. 45, Issue 12, 1201005 (2025)
Carbon dioxide (CO2) is the most significant anthropogenic greenhouse gas in the atmosphere. Accurately assessing CO2 emissions is critical for developing effective and feasible reduction policies to mitigate global warming. Spaceborne platforms equipped with active and passive remote sensing instruments enable high-precision global column-averaged dry air mole fraction of CO2 (XCO2) observations, supporting the “top-down” approach to carbon emission estimation. Among these, spaceborne integrated path differential absorption (IPDA) lidar offers resilience to aerosol interference and, with its high pulse repetition frequency, can achieve global XCO2 observations with high temporal and spatial resolution. However, due to single observation errors, the data often need to be processed using the sliding average algorithm, which diminishes the high temporal and spatial resolution advantages of spaceborne IPDA lidar. Therefore, we propose using the Kalman smoothing algorithm to reconstruct the high temporal and spatial resolution lidar XCO2 observation from spaceborne IPDA data. Simulation experiments validate the algorithm’s filtering performance, and its application to point-source emission monitoring highlights its potential for high-resolution XCO2 monitoring. These findings underscore the significance of the Kalman smoothing algorithm in enhancing global carbon emission quantification using spaceborne IPDA lidar data.
Based on the high temporal and spatial resolution advantage of spaceborne IPDA lidar XCO2 data and its offline acquisition characteristics, we propose using the Kalman smoothing algorithm to reconstruct high temporal and spatial resolution XCO2 observation results. First, a pseudo-true value sequence is constructed based on XCO2 data simulated by weather research and forecasting model with greenhouse gases module (WRF-GHG). Various levels of observation errors are then superimposed on this sequence to create a pseudo-observation sequence. The filtering performance of the Kalman smoothing algorithm is tested with different state transfer matrices, and the optimal matrix is selected. Comparative experiments show that the Kalman smoothing algorithm outperforms the sliding average algorithm in terms of filtering performance. Finally, both the Kalman smoothing and sliding average algorithms are used to estimate the carbon emission rate of the same point source at the same time, confirming the Kalman smoothing algorithm’s applicability in high-resolution XCO2 monitoring.
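The comparison described above can be illustrated with a scalar Kalman filter followed by a Rauch-Tung-Striebel backward pass, set against a plain sliding average. This is a sketch under stated assumptions rather than the configuration used in the paper: the state transition is reduced to a scalar `f` (1.0 gives a random-walk model), and the process variance `q`, observation variance `r`, and the function names are hypothetical.

```python
import numpy as np

def kalman_smooth(z, q, r, f=1.0):
    """Scalar Kalman filter plus Rauch-Tung-Striebel smoother.

    z: observation sequence; q: process noise variance;
    r: observation noise variance; f: scalar state transition.
    Returns one smoothed estimate per observation.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    x_pred = np.empty(n)
    p_pred = np.empty(n)
    x_filt = np.empty(n)
    p_filt = np.empty(n)
    # Initialise the state from the first observation.
    x_pred[0] = x_filt[0] = z[0]
    p_pred[0] = p_filt[0] = r
    for k in range(1, n):
        # Forward pass: predict, then correct with observation k.
        x_pred[k] = f * x_filt[k - 1]
        p_pred[k] = f * p_filt[k - 1] * f + q
        gain = p_pred[k] / (p_pred[k] + r)
        x_filt[k] = x_pred[k] + gain * (z[k] - x_pred[k])
        p_filt[k] = (1.0 - gain) * p_pred[k]
    x_smooth = x_filt.copy()
    for k in range(n - 2, -1, -1):
        # Backward pass: refine each estimate using later observations.
        c = p_filt[k] * f / p_pred[k + 1]
        x_smooth[k] = x_filt[k] + c * (x_smooth[k + 1] - x_pred[k + 1])
    return x_smooth

def sliding_average(z, window):
    """Moving average with a boxcar window of the given length."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(z, dtype=float), kernel, mode="same")
```

Because the backward pass needs the whole sequence, the smoother suits data that are processed offline, which matches the offline acquisition characteristic noted above. The smoother returns one estimate per original sample, whereas the sliding average trades resolution for noise reduction through its window length.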
Simulation experiments first determine the state transfer matrix for the Kalman smoothing algorithm, followed by a comparison of its filtering performance with the sliding average algorithm, which uses a spatial resolution of 50 km. The results show that the Kalman smoothing algorithm not only retains the original observation’s temporal and spatial resolution (0.05 s, 337.5 m), but also reduces the mean absolute error (MAE) by 9.46%, reduces the root mean square error (RMSE) by 13.39%, and increases the correlation coefficient by 6.46%, compared to the sliding average algorithm with a temporal and spatial resolution of 7.4 s and 50 km. The monitoring capabilities of the Kalman smoothing algorithm and the sliding average algorithm for the same point source emissions are further compared. The XCO2 enhancement, obtained using the Kalman smoothing algorithm, estimates the point source emission rate at that moment to be 843.2 kg/s, with a correlation of 0.98 between the XCO2 enhancement and the Gaussian point source model simulation results. In contrast, the sliding average algorithm estimates the point source emission rate at that moment to be 1876.8 kg/s, with a lower correlation of 0.81 between the XCO2 enhancement and the Gaussian point source model simulation results. According to the emission inventory data for this point source, the annual average emission rate is 1100 kg/s. The instantaneous emission rate calculated by the Kalman smoothing algorithm is closer to this annual average, and the XCO2 enhancement shows a higher correlation. Therefore, it can be concluded that the Kalman smoothing algorithm offers superior point source emission monitoring capabilities compared to the sliding average algorithm.
In response to the demand for high temporal and spatial resolution in the application of XCO2 observation results from spaceborne IPDA lidar, we propose the use of the Kalman smoothing algorithm to process the original XCO2 data. We discuss the selection of the state transfer matrix in the Kalman smoothing algorithm and compare its filtering performance with that of the commonly used sliding average algorithm. The MAE between the Kalman smoothing algorithm’s filtering result and the true value is reduced by 9.46% compared to the sliding average algorithm, which has a temporal and spatial resolution of 7.4 s and 50 km. In addition, the RMSE is reduced by 13.39%, and the correlation coefficient is increased by 6.46%. Therefore, we conclude that the Kalman smoothing algorithm, while retaining the original high temporal and spatial resolution (0.05 s, 337.5 m), provides better filtering performance than the sliding average algorithm, which has a theoretical temporal and spatial resolution of 7.4 s and 50 km. The application of the Kalman smoothing algorithm in point source emission monitoring is also tested. The instantaneous emission rate calculated by the Kalman smoothing algorithm is closer to the annual average, and the XCO2 enhancement shows a higher correlation. This shows that the Kalman smoothing algorithm can be effectively applied to high temporal and spatial resolution XCO2 observation scenarios. High-resolution XCO2 observations are crucial for assessing regional carbon sources and sinks, and the XCO2 observations reconstructed using the Kalman smoothing algorithm can provide vital data support.
- Publication Date: May 16, 2025
- Vol. 45, Issue 12, 1201007 (2025)
The atmospheric profile is a critical component in radiative transfer calculations, and constructing an atmospheric model that accurately reflects regional atmospheric conditions is essential to ensure the precision of these calculations. In this paper, we aim to explore atmospheric profile variations and improve the accuracy of radiative transfer calculations by proposing a novel method for constructing atmospheric models.
We analyze the vertical distribution and variation patterns of key atmospheric parameters, including temperature, water vapor, pressure, carbon dioxide, ozone, and methane. A new approach based on K-means clustering and random forest regression is developed to construct atmospheric profiles. Data sources include ERA5, WACCM, and CarbonTracker, covering historical atmospheric profile data over the past two decades. To address the resolution differences among these data sources, spatiotemporal interpolation and height normalization methods are applied. We focus on the eastern region of China, where temperature, pressure, water vapor, and ozone profiles are clustered to reveal their seasonal and regional variation patterns. Subsequently, carbon dioxide and methane profiles are reconstructed using newly processed data.
The self-developed atmospheric model is compared with the 1976 US standard atmosphere using MODTRAN software to simulate spectral data. The simulated spectra are then compared with actual measurements from the FengYun satellite. The results show that the self-developed model improves simulation accuracy by 11.2% in January and 10.5% in July compared to the 1976 US standard atmosphere model, indicating that the proposed model better approximates real atmospheric conditions (Fig. 5). This method offers a new approach for constructing atmospheric profiles for radiative transfer calculations.
The proposed method, which combines K-means clustering and random forest regression, significantly improves the accuracy of radiative transfer calculations by better capturing regional and seasonal variations in atmospheric profiles. This approach not only enhances the precision of radiative transfer simulations but also provides a valuable tool for atmospheric research and applications.
- Publication Date: Apr. 27, 2025
- Vol. 45, Issue 12, 1201008 (2025)
In 1996, McFeeters proposed the normalized difference water index (NDWI), leveraging the unique reflectance characteristics of water bodies in remote sensing images: high reflectance in the green band and low reflectance in the near-infrared band. This index enables effective extraction of water bodies from remote sensing images and has become a classic and widely cited method in water body extraction, with thousands of references in academic research. While NDWI is widely applied to remote sensing images, its application to airborne LiDAR point cloud data remains limited. Compared to remote sensing image data, airborne LiDAR offers advantages such as high-precision laser point cloud data acquisition, independence from solar radiation, and greater operational flexibility. To address this gap, we propose a novel NDWI-LiDAR method that facilitates the rapid and accurate extraction of water body information by using only the elevation data from dual-frequency laser point clouds, overcoming the dependence on full waveform data.
In this paper, the proposed NDWI-LiDAR leverages the uncertainty and measurement bias of green lasers in water surface measurements and is based on the point clouds generated by airborne infrared and green lasers. The expression form of this index is similar to that of the NDWI, but the pixel values of the near-infrared and green bands in remote sensing images are replaced by the elevations of infrared and green laser points. First, the raw measurement data from the infrared and green lasers are used to calculate the positions of the laser footprints, resulting in infrared and green laser point clouds, respectively. Second, the expression for NDWI-LiDAR is provided based on the different characteristics of infrared and green lasers in water and land measurements. Third, a land–water discriminator utilizing the NDWI-LiDAR is introduced, with the Otsu method applied to establish the threshold for water extraction. Finally, the pulse numbers of adjacent laser points are analyzed to differentiate and eliminate noisy water points, thus obtaining the final water surface laser points and realizing accurate water body extraction from airborne laser point clouds (Fig. 5).
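The thresholding step of the land–water discriminator can be sketched with a histogram-based Otsu threshold applied to an array of index values. The sketch treats the NDWI-LiDAR values as already computed, since the exact expression of the index is not given here; the function name `otsu_threshold`, the bin count, and the toy input are assumptions for illustration only.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes between-class variance.

    values: 1D array of index values (here, precomputed NDWI-LiDAR values).
    """
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    count_low = np.cumsum(hist)               # points at or below each bin
    count_high = count_low[-1] - count_low    # points above each bin
    sum_low = np.cumsum(hist * centers)
    sum_high = sum_low[-1] - sum_low
    usable = (count_low > 0) & (count_high > 0)
    between = np.zeros(bins)
    mean_low = sum_low[usable] / count_low[usable]
    mean_high = sum_high[usable] / count_high[usable]
    between[usable] = (count_low[usable] * count_high[usable]
                       * (mean_low - mean_high) ** 2)
    best = np.argmax(between)
    return edges[best + 1]                    # upper edge of the best split bin

# Toy input: land-like values near 0 and water-like values near 0.3.
index_values = np.concatenate([np.zeros(100), np.full(100, 0.3)])
threshold = otsu_threshold(index_values)
is_water = index_values > threshold
```

Points whose index exceeds the threshold are labeled water here, following the statement that water values are positive while land values tend toward 0 or negative; the subsequent pulse-number check for noisy water points is not covered by this sketch.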
The measurement datasets collected by the Optech CZMIL system are used to validate the correctness and effectiveness of the proposed method. In the experimental area, the NDWI-LiDAR values for land tend toward 0 and negative, whereas those for water are positive. As shown in the NDWI-LiDAR probability density distribution image (Fig. 10), the land and water NDWI-LiDAR data exhibit distinct dual peaks: the peak NDWI-LiDAR density value for water is approximately 0.3, whereas that for land is approximately 0. Compared with the traditional random sample consensus (RANSAC) method, which is based on single-frequency laser point clouds, the NDWI-LiDAR method proposed in this paper reduces the number of incorrectly extracted water points by 86.7% (Fig. 12). Equations (12) and (13) are used to calculate the distance bias and structural similarity (SSIM) index of the land–water interface determined by the two methods. The maximum bias, mean bias, and standard deviation of the land–water interface determined by the NDWI-LiDAR are 25.2, 4.2, and 4.2 m, respectively, with an SSIM value of 0.92. In contrast, the maximum bias, mean bias, and standard deviation determined via the RANSAC method are 50.3, 8.8, and 6.7 m, respectively, with an SSIM value of 0.89 (Table 1).
In the experimental area, the NDWI-LiDAR values for land tend toward 0 and negative values, whereas those for water are positive. From the perspective of the NDWI-LiDAR probability density distribution, the values for land and water differ significantly. The peak NDWI-LiDAR density for water is approximately 0.3, whereas that for land is approximately 0. This clear separation suggests that it is reasonable to use NDWI-LiDAR as a LiDAR-based index for water extraction. Compared with the traditional RANSAC method, which relies on single-frequency laser point clouds, the NDWI-LiDAR method proposed in this paper reduces the number of incorrectly extracted water points by 86.7%, reduces the standard deviation of the land–water interface by 37.3%, and improves the SSIM index by 3.3%. The results demonstrate that the NDWI-LiDAR method effectively leverages the advantages of dual-frequency laser point clouds, thus enabling accurate and efficient acquisition of spatial distribution information for water bodies based on LiDAR point clouds.
- Publication Date: Mar. 26, 2025
- Vol. 45, Issue 12, 1228003 (2025)
Building change detection has attracted wide attention as an important research direction amid continuous progress in change detection technology for remote sensing images. Accurate building change detection is crucial for land utilization assessment, urban development monitoring, and disaster damage assessment. Although traditional change detection methods can assist building change detection, they usually rely on spectral information or simple pixel-level differences and therefore have limitations, especially low accuracy on high-resolution remote sensing images of complex scenes. With the rise of deep learning, especially convolutional neural networks (CNNs), performance on change detection tasks for remote sensing images has improved significantly. However, CNN-based methods usually employ simple fusion operations as the last step in producing detection results and fail to pay sufficient attention to extracting effective change information. Additionally, existing feature extraction methods usually focus only on features at isolated time points and tend to ignore the feature interactions between the two temporal images, which restricts their ability to capture change information. High-resolution remote sensing images also exhibit complex spatial features and rich scale information, and Transformer-based methods still cannot fully capture the long-distance dependencies between different areas, especially when extracting the relationship between the target of interest and other targets in the changing region, resulting in limited performance improvement. To this end, we propose a new method for change detection in high-resolution remote sensing images based on spatiotemporal fusion and SFMRNet.
The proposed SFMRNet employs an encoder-decoder architecture, where a two-branch weight-sharing encoder processes the dual time-phase images, feature extraction is carried out in each branch by adopting ResNet 18, and a feature exchange module (FEM) is utilized to efficiently extract the key information related to building changes after the stage 1 and stage 3 of ResNet 18. The extracted dual time-phase features from each layer are processed by the spatiotemporal fusion module (STM) to capture important information between different temporal features. The fused output is further fed into the multi-feature relationship module (MFA), which leverages self-attention and cross-attention mechanisms to capture intra-class relationships and parse the interaction information between the changing region and the environment respectively. Next, the multi-layer perceptron (MLP) is adopted to optimize the global information related to the channels in the feature map and generate the attention map. During the decoding, the attention map is restored to its original spatial resolution by up-sampling step by step to reduce the spatial information lost from deep features and ensure full utilization of multi-scale information. Finally, the difference feature maps restored to their original size are processed by a pixel classifier to generate the final change prediction map.
We conduct experiments on two public datasets (LEVIR-CD and WHU-CD) to validate the model’s effectiveness. The results show that SFMRNet achieves 91.54%, 90.32%, 81.54%, and 89.80% on the WHU-CD dataset for the precision (Pr),
We propose a remote sensing change detection network that integrates time-domain fusion and multi-feature relationships. The network employs the FEM to enhance feature interactions between dual-temporal images and filter out irrelevant information, thereby improving building change detection. The STM dynamically identifies important features by fusing temporal information, thus enhancing the integration of dual-temporal features and ensuring key information retention. Additionally, the MFA utilizes self-attention and cross-attention mechanisms to capture the varying levels of intrinsic relationships between the features, which enhances the segmentation accuracy of changing regions. We validate the superiority of SFMRNet via qualitative and quantitative comparisons across multiple remote-sensing image datasets. Ablation experiments further confirm the contribution of each module to overall performance, demonstrating SFMRNet’s capability to capture subtle change information and reduce background noise interference. These results indicate that SFMRNet provides an innovative and efficient solution for change detection, thereby facilitating performance improvement in practical applications.
- Publication Date: Jun. 13, 2025
- Vol. 45, Issue 12, 1228005 (2025)
With the rapid advancement of automobile intelligence, the demand for high-precision object detection of road obstacles in autonomous driving continues to grow to ensure driving safety. However, existing object detection methods based on lidar point clouds face significant challenges. For instance, direct point cloud processing consumes substantial computational resources, voxel-based methods still have high computational costs, and approaches combining point clouds with visual images encounter complex data fusion challenges. While point cloud projection methods simplify data representation and reduce computational demand, they suffer from issues such as information loss and feature fusion difficulties. Consequently, simplifying point cloud representation, reducing computational overhead, and improving detection precision have become pressing challenges. To address these issues, we propose a multi-view fusion object detection method based on lidar point cloud projection.
In this paper, we propose a multi-view fusion object detection method to address three-dimensional (3D) point cloud detection tasks. The system architecture is shown in Fig. 1. Specifically, to achieve dimensionality reduction, the 3D point clouds are first projected onto a plane to generate a two-dimensional (2D) bird’s eye view (BEV). Simultaneously, the 3D point clouds are converted into cylindrical coordinates, and the cylindrical surface is unfolded into a rectangle to create a 2D range view (RV). Projection views are encoded into multi-channel images from the point cloud data. These images serve as input to the object detection network. In addition, the efficient channel attention (ECA) mechanism is incorporated into the Complex-YOLO and YOLOv5s networks, which are employed as the object detection networks for BEV and RV, respectively. The preliminary object detection results from both views are then fused at the decision-making level using weighted Dempster-Shafer (D-S) evidence theory, resulting in the final detection outputs.
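The decision-level fusion step can be illustrated with a small sketch of weighted Dempster-Shafer combination for one matched detection. It assumes that each view reports masses over single classes plus an "unknown" mass on the whole frame, and that the weighting takes the form of Shafer discounting by a per-view reliability; the function names, the reliability values, and the example masses are hypothetical and are not taken from the paper.

```python
def discount(masses, reliability):
    """Shafer discounting: scale class masses, move the rest to 'unknown'."""
    out = {k: reliability * v for k, v in masses.items() if k != "unknown"}
    out["unknown"] = 1.0 - reliability + reliability * masses.get("unknown", 0.0)
    return out

def combine(m1, m2):
    """Dempster's rule for singleton classes plus the full frame ('unknown')."""
    classes = (set(m1) | set(m2)) - {"unknown"}
    u1 = m1.get("unknown", 0.0)
    u2 = m2.get("unknown", 0.0)
    fused = {}
    for c in classes:
        a = m1.get(c, 0.0)
        b = m2.get(c, 0.0)
        # Agreement on c, or c from one view with the other view undecided.
        fused[c] = a * b + a * u2 + u1 * b
    fused["unknown"] = u1 * u2
    total = sum(fused.values())   # equals 1 minus the conflicting mass
    if total == 0.0:
        raise ValueError("views are in total conflict; fusion is undefined")
    return {k: v / total for k, v in fused.items()}

# Hypothetical class masses for one object seen in both views.
bev = {"car": 0.8, "pedestrian": 0.1, "unknown": 0.1}
rv = {"car": 0.6, "cyclist": 0.3, "unknown": 0.1}
fused = combine(bev, rv)
weighted = combine(discount(bev, 0.9), discount(rv, 0.7))
```

Normalizing by the non-conflicting mass is what lets agreement between the two views reinforce a class, while discounting limits how strongly a less reliable view can pull the fused result.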
In scenarios without occlusion, the proposed method achieves slightly lower detection precision for pedestrians and cyclists compared to BEVDetNet and sparse-to-dense (STD), respectively. Specifically, the precision for detecting pedestrians is 1.22 percentage points lower than that of BEVDetNet (Table 2), and the precision for detecting cyclists is 0.40 percentage points lower than that of STD (Table 3). However, under occlusion conditions, the method significantly improves detection precision by 1‒5 percentage points. This improvement is primarily due to the integration of information from both views, which compensates for occlusion effects. When detecting cars (Table 1), the physical shapes of cars in the two different views are relatively regular, making their features easier to extract. After fusing the object detection results from the two views, precision under the three occlusion levels improves to varying degrees. Compared with STD, a method with relatively strong detection performance, the precision is improved by 0.52 percentage points under the easy level, 2.04 percentage points under the moderate level, and 1.25 percentage points under the hard level. These performance indicators demonstrate significant improvement, particularly in cases of occlusion. With the proposed method, the average precision (AP) values reach 81.37% for cars, 49.34% for pedestrians, and 67.97% for cyclists. In ablation experiments (Table 5), compared with the original single-view object detections for BEV and RV, the AP for cars increases by 4.70 and 6.16 percentage points, respectively. For pedestrians, AP increases by 3.44 and 2.73 percentage points, and for cyclists, by 4.06 and 3.63 percentage points. Overall, the mean average precision (mAP) improves by 4.07 and 4.18 percentage points for BEV and RV, respectively. In addition, visualization results demonstrate that the proposed method effectively reduces false detections (Fig. 7) and missed detections (Fig. 8).
To address missed and false detections in single-view lidar point cloud projection methods, we propose a multi-view fusion object detection method. The point clouds are projected into BEV and RV views and encoded into three-channel images. The ECA module is integrated into Complex-YOLOv4 and YOLOv5s networks, which are used as detection models for BEV and RV, respectively, to generate preliminary results. These results are then fused using weighted D-S evidence theory to produce the final detection outputs. Compared to single-view methods, the mAP is improved by 4.07 and 4.18 percentage points for BEV and RV, respectively. By reducing 3D point clouds to 2D images, our method significantly reduces computational complexity. It also combines information from multiple views to overcome challenges such as occlusion and feature extraction limitations. Future research will focus on balancing precision and recall in point cloud object detection, refining objective functions, and further enhancing detection performance.
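The decision-level fusion step can be illustrated with a small sketch of weighted D-S combination. It assumes that "weighted" means Shafer discounting of each view's class confidences by a reliability weight before Dempster's rule is applied; the weights (0.9 and 0.8), the class scores, and the helper names are hypothetical and are not taken from the paper.

```python
THETA = "unknown"  # the whole frame of discernment (ignorance)


def discount(mass, weight):
    """Scale singleton masses by a reliability weight; move the rest to THETA."""
    out = {k: weight * v for k, v in mass.items() if k != THETA}
    out[THETA] = 1.0 - sum(out.values())
    return out


def combine(m1, m2):
    """Dempster's rule for masses on singleton classes plus THETA."""
    classes = [k for k in m1 if k != THETA]
    conflict = sum(m1[a] * m2[b] for a in classes for b in classes if a != b)
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    norm = 1.0 - conflict
    fused = {
        a: (m1[a] * m2[a] + m1[a] * m2[THETA] + m1[THETA] * m2[a]) / norm
        for a in classes
    }
    fused[THETA] = m1[THETA] * m2[THETA] / norm
    return fused


# Hypothetical class confidences for one matched pair of detections.
bev = {"car": 0.7, "pedestrian": 0.1, "cyclist": 0.1, THETA: 0.1}
rv = {"car": 0.6, "pedestrian": 0.2, "cyclist": 0.1, THETA: 0.1}
bev_w = discount(bev, weight=0.9)  # assumed reliability of the BEV detector
rv_w = discount(rv, weight=0.8)    # assumed reliability of the RV detector
fused = combine(bev_w, rv_w)
best = max((k for k in fused if k != THETA), key=fused.get)
```

When both views agree on a class, the fused belief in that class exceeds either discounted input, which is the behavior that lets a second view reinforce a weak single-view detection.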
- Publication Date: May 16, 2025
- Vol. 45, Issue 12, 1228006 (2025)
Coherent Doppler wind lidar (CDWL) has become an essential tool for wind velocity measurement in various fields, including wind resource assessment, aviation safety, and meteorological research. In applications like turbulence monitoring and aircraft wake vortex detection, where fine-scale wind field analysis is crucial, enhanced range resolution and improved velocity measurement precision are required. Traditional pulsed CDWL systems employ short pulses to achieve high range resolution. However, shorter pulses compromise frequency resolution, leading to a decline in wind velocity measurement precision. Phase-coded modulation schemes offer a potential solution by decoupling range and frequency resolutions. However, in these schemes, the pulse width is typically constrained by spread spectrum crosstalk if the modulation format is not appropriately selected. To overcome these limitations, we propose a novel long-pulsed CDWL system based on minimum shift keying (MSK) modulation. Due to the effective crosstalk suppression of MSK signals, the advantages of a longer coding sequence are fully utilized. Consequently, the range resolution is determined by the chip duration, while the extended pulse duration ensures high frequency resolution and a high signal-to-noise ratio, contributing to precise wind velocity measurement.
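The decoupling of range and frequency resolution can be made concrete with a short calculation. The sketch below uses the textbook relations (range resolution equals c times the resolving duration divided by 2, frequency resolution is roughly the inverse of the pulse duration, and velocity equals wavelength times Doppler frequency divided by 2). The 20 ns chip duration is inferred here from the 3 m range resolution reported in the results, and the 1.55 μm wavelength is an assumption typical of all-fiber systems; neither value is stated in the abstract.

```python
C = 299_792_458.0  # speed of light, m/s


def range_resolution(duration_s):
    """Range resolution set by the resolving duration (chip or uncoded pulse)."""
    return C * duration_s / 2.0


def frequency_resolution(pulse_duration_s):
    """Approximate spectral resolution of a pulse of the given total length."""
    return 1.0 / pulse_duration_s


def velocity_resolution(pulse_duration_s, wavelength_m=1.55e-6):
    """Velocity spread matching the spectral resolution (assumed wavelength)."""
    return wavelength_m * frequency_resolution(pulse_duration_s) / 2.0


chip = 20e-9          # assumed chip duration, inferred from a 3 m resolution
coded = 63 * chip     # 63-chip coded pulse
uncoded = 300e-9      # non-coded reference pulse from the experiment

coded_dr = range_resolution(chip)          # about 3 m
uncoded_dr = range_resolution(uncoded)     # about 45 m
coded_dv = velocity_resolution(coded)      # about 0.6 m/s
uncoded_dv = velocity_resolution(uncoded)  # about 2.6 m/s
```

Under these assumptions the coded pulse is finer in both range and velocity than the 300 ns uncoded pulse, which is the trade-off the MSK scheme is designed to break.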
We employ an all-fiber coherent receiving architecture. The signal beam is frequency-shifted and gated by an acousto-optic modulator (AOM) and subsequently encoded by an I/Q electro-optic modulator for MSK modulation. The amplified probe pulse is transmitted into the atmosphere via an optical antenna. The Mie backscattering from aerosols is received by the same antenna and then coherently detected. Through digital signal processing, the radial wind velocity at various ranges is finally retrieved from the Doppler frequency shift. Phase or frequency coding modulation leads to spectral spreading. Therefore, in the decoding process, the scattered signal is despread when multiplied by a decoding sequence with different time delays. Based on this architecture, theoretical analysis, simulations, and experiments are conducted. The crosstalk suppression performance of MSK modulation is first explained through theoretical evaluation. Subsequent simulations are conducted based on available experimental conditions, comparing the MSK scheme with non-coded and classical binary phase shift keying (BPSK) schemes. The clearer spectral peaks and higher precision in wind velocity estimation further demonstrate the low crosstalk characteristic of MSK-modulated signals. In the experiments, a comparative measurement is conducted between a 63-bit MSK-coded pulse and a non-coded pulse with a duration of 300 ns to validate the effectiveness of the MSK coding scheme. Additionally, the MSK scheme is compared with the BPSK scheme under the same conditions to prove the superior performance of the MSK scheme.
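The coding and despreading idea can be sketched numerically. The example below builds a continuous-phase MSK envelope, places two synthetic range returns with different delays and Doppler shifts in a received trace, and despreads the trace with the code aligned to one delay. The pseudo-random 63-chip sequence, the chip duration, the sampling rate, and both Doppler shifts are hypothetical; the actual code family and processing chain are not given in the abstract.

```python
import numpy as np


def msk_envelope(bits, samples_per_chip):
    """Complex baseband MSK: the phase ramps by +/- pi/2 across each chip."""
    bits = np.asarray(bits, dtype=float)
    step = np.repeat(bits, samples_per_chip) * (np.pi / 2.0) / samples_per_chip
    phase = np.cumsum(step) - step  # phase at the start of each sample
    return np.exp(1j * phase)


def despread_spectrum(rx, code, delay, nfft=4096):
    """Multiply by the conjugate code at one delay and return the power spectrum."""
    segment = rx[delay:delay + code.size] * np.conj(code)
    return np.abs(np.fft.fft(segment, nfft)) ** 2


rng = np.random.default_rng(0)
bits = rng.choice([-1, 1], size=63)  # hypothetical 63-chip code
spc = 8                              # samples per chip
chip = 20e-9                         # assumed chip duration, s
fs = spc / chip                      # sampling rate, Hz
code = msk_envelope(bits, spc)
t = np.arange(code.size) / fs

f_dopp = 5e6                         # hypothetical Doppler shift of the target gate
delay = 3 * spc                      # target gate: 3 chips away
rx = np.zeros(code.size + 10 * spc, dtype=complex)
rx[delay:delay + code.size] += code * np.exp(2j * np.pi * f_dopp * t)
other = 6 * spc                      # a second gate with a different Doppler shift
rx[other:other + code.size] += code * np.exp(-2j * np.pi * 8e6 * t)

freqs = np.fft.fftfreq(4096, d=1.0 / fs)
peak = freqs[np.argmax(despread_spectrum(rx, code, delay))]
```

After despreading, the return from the matching delay collapses to a narrow tone near its Doppler frequency, while the return from the other gate stays spread across the band and appears only as low-level crosstalk.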
The simulation results demonstrate that the MSK-modulated pulse offers better frequency estimation performance than the other two pulses due to its effective crosstalk suppression (Fig. 4). To evaluate the precision and accuracy of wind velocity estimation, the standard deviation (SD) and root mean square error (RMSE) are calculated for different modulation schemes. As a result, the MSK-modulated scheme not only has an advantage in range resolution over the 300 ns non-coded pulse, but also achieves higher wind velocity estimation precision, accuracy, and a longer reliable detection range compared with both the non-coded pulse and the BPSK modulation (Fig. 5). In the experimental measurements, the superiority of MSK modulation is further demonstrated. Compared to the BPSK-modulated pulse, the MSK-modulated pulse provides more stable wind velocity estimates in regions with significant velocity variation, which results in smaller velocity SD across multiple measurements (Fig. 8). In particular, a range resolution of 3 m and a wind velocity precision of 0.20 m/s within a 450 m detection range are achieved with the MSK modulation scheme, using a pulse peak power of only 20 W. Despite the promising results, there is still room for improvement in the current system. Because a monostatic transceiver configuration is used, reflection from the optical antenna directly causes a detection blind zone. Therefore, in applications where the blind zone needs to be minimized, a bistatic system with separate antennas for transmission and reception should be considered. Furthermore, future work will aim to optimize the telescope diameter and receiver efficiency to extend the detection range.
We introduce a novel MSK-modulated CDWL system that effectively resolves the trade-off between range and frequency resolutions of pulsed CDWL. Due to its crosstalk suppression performance, a longer pulse duration can be applied to wind velocity measurements. Therefore, the signal-to-noise ratio gain provided by the long-coded pulse reduces the reliance on high peak power in pulsed CDWL systems. Both simulation and experimental results consistently show that the MSK scheme, with its superior crosstalk suppression, outperforms BPSK in terms of wind velocity measurement precision and detection range under the same conditions. Moreover, thanks to its phase continuity, the proposed scheme requires a lower bandwidth, which allows simplification of the CDWL system architecture. Given an optimized peak power and optical antenna telescope size, MSK modulation can fully exploit its potential at extended detection ranges, offering a promising approach to enhancing range resolution and velocity measurement precision in pulsed CDWL systems.
- Publication Date: Jun. 13, 2025
- Vol. 45, Issue 12, 1228007 (2025)
As one of the most important components of the atmosphere, ozone is typically categorized into stratospheric and tropospheric ozone. Tropospheric ozone accounts for only 10% of the total ozone content, but its pollution is a significant threat to human health. In 2020, the Institute for Health Metrics and Evaluation (IHME) identified environmental ozone as a level 3 risk to human health, linking it to chronic obstructive pulmonary disease (COPD) and premature death. Ozone is influenced not only by its photochemical precursors but also by meteorological factors, pollution transport, and stratospheric ozone. Since the 1970s, differential absorption lidar (DIAL) technology has been widely used for remote sensing of tropospheric ozone concentrations with high spatial and temporal resolution. Early DIAL systems mostly used complex dye lasers, which required frequent maintenance and had poor frequency stability and short lifespans. Nowadays, many ozone lidar systems employ fixed-frequency laser sources such as gas-stimulated Raman lasers. However, the large size and poor thermal conductivity of these devices limit their flexible application in high repetition frequency pumping lasers. To reduce instability caused by tunable light sources and miniaturize the system, an all-solid-state tunable Raman laser is used as the emission source, resulting in a compact ozone DIAL system suitable for multi-platform observation.
This ozone lidar system uses a 532 nm solid-state laser with a high repetition rate as the pump source. It generates a Raman frequency shift using a SrWO₄ Raman crystal, producing a first-order Stokes laser at 560 nm and a second-order Stokes laser at 590 nm. The system then doubles the frequency using a BaB₂O₄ (BBO) crystal. The 590 nm optical path uses a half-wave plate to adjust the polarization, producing a dual-wavelength ultraviolet output at 280 nm and 295 nm. Two high-damage-threshold dichroic mirrors are used to separate visible and ultraviolet light. Both ultraviolet beams have a divergence angle of less than 0.35 mrad, as confirmed through testing. As shown in Fig. 1(a), optical components such as the Raman and frequency-doubling crystals are tightly mounted on the optical platform, ensuring the compactness and stability of the optical path. The Cassegrain-type receiving telescope system is compact in both size and structure, with an aperture of about 150 mm, further reducing the overall size of the lidar system. To validate the accuracy of the DIAL’s vertical detection data, validation experiments are carried out.
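The wavelength chain quoted above can be checked with a short calculation. The sketch assumes a Raman shift of about 921 cm⁻¹ for SrWO₄, a commonly quoted value that is not stated in the abstract, and treats frequency doubling as simple halving of the wavelength.

```python
def stokes_wavelength_nm(pump_nm, shift_cm1, order):
    """Wavelength of the n-th Stokes line for a given Raman shift in cm^-1."""
    pump_wavenumber = 1.0e7 / pump_nm  # convert nm to cm^-1
    return 1.0e7 / (pump_wavenumber - order * shift_cm1)


RAMAN_SHIFT_CM1 = 921.0  # assumed SrWO4 Raman shift, not given in the text

first_stokes = stokes_wavelength_nm(532.0, RAMAN_SHIFT_CM1, order=1)
second_stokes = stokes_wavelength_nm(532.0, RAMAN_SHIFT_CM1, order=2)
uv_on = first_stokes / 2.0    # frequency-doubled first Stokes line
uv_off = second_stokes / 2.0  # frequency-doubled second Stokes line
```

With this shift, the Stokes lines fall within 1 nm of the quoted 560 nm and 590 nm, and the doubled outputs fall within 0.5 nm of 280 nm and 295 nm.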
A Thermo Fisher Model 49i ozone analyzer is installed at a horizontal distance of about 800 m from the lidar, with the lidar mounted on the pylon and pointed approximately horizontally towards the ozone analyzer. The data from both the lidar and the ozone analyzer are processed to calculate the average ozone concentration per hour, excluding data from precipitation and instrument maintenance periods. The inversion results for the lidar’s detection at about 800 m are compared with those of the ozone analyzer. As shown in Fig. 3, the lidar and ozone analyzer data exhibit good consistency over time. The DIAL measurements are about 11 μg/m³ lower than those of the ozone analyzer. This deviation is primarily due to the height difference of about 100 m between the lidar and the ozone analyzer. In clear weather, as solar radiation increases in the morning, ozone generation on the ground is enhanced, and the ozone is transported upward. Before the photochemical reaction diminishes, the ground serves as an ozone source, leading to slightly higher ozone concentration at altitudes up to 100 m. The detection data of the two devices are linearly fitted, and the correlation coefficient reaches 0.888. Then, a sounding balloon is launched at the Meteorological Bureau of Baoshan District, Shanghai, and its data are compared with those from the lidar at the same location and time. The experiment includes four launch times: 8:00 AM, 1:00 PM, 6:00 PM, and midnight. Fig. 7 shows the ozone concentration profile from near the ground to an altitude of 3 km as detected by both the lidar and the sounding balloon. The results demonstrate that the mean deviation of ozone concentration within 3 km is less than 7.9 μg/m³, with a correlation coefficient of 0.857. This confirms the reliability of the DIAL’s vertical detection.
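For readers unfamiliar with how a DIAL profile is obtained, the standard two-wavelength retrieval can be sketched as below. This is the textbook DIAL equation with aerosol and differential-backscatter corrections neglected, and is not necessarily the inversion used by the authors. The cross-section values are rough placeholders that should be replaced by tabulated ozone cross sections, and the synthetic profile is purely illustrative.

```python
import numpy as np

AVOGADRO = 6.02214076e23   # molecules per mole
OZONE_MOLAR_MASS = 48.0    # g/mol


def dial_number_density(r, p_on, p_off, delta_sigma):
    """Ozone number density (m^-3) from on/off-line returns.

    N(R) = d/dR [ ln(P_off / P_on) ] / (2 * delta_sigma), where delta_sigma
    is the difference between on-line and off-line cross sections (m^2).
    """
    return np.gradient(np.log(p_off / p_on), r) / (2.0 * delta_sigma)


def to_ug_per_m3(number_density):
    """Convert molecules per m^3 to micrograms per m^3."""
    return number_density * OZONE_MOLAR_MASS / AVOGADRO * 1.0e6


# Synthetic check: a uniform ozone layer and idealized 1/R^2 returns.
sigma_on, sigma_off = 4.0e-22, 0.8e-22  # placeholder cross sections, m^2
true_density = 1.0e18                   # molecules per m^3
r = np.linspace(100.0, 3000.0, 300)
p_on = np.exp(-2.0 * sigma_on * true_density * r) / r**2
p_off = np.exp(-2.0 * sigma_off * true_density * r) / r**2

retrieved = dial_number_density(r, p_on, p_off, sigma_on - sigma_off)
retrieved_ug = to_ug_per_m3(retrieved)
```

The range-squared factor and any wavelength-independent system constants cancel in the ratio, which is why only the differential absorption survives in the retrieval.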
During the Spring Festival period, the ozone concentration is higher than during non-festival periods due to the effects of fireworks and firecrackers. In addition to local photochemical generation, external transport from western regions significantly affects the diurnal variation of ozone. An airborne vehicle lidar experiment conducted in Zhejiang Province shows that high ozone values are concentrated near 600 m. The source of the high ozone concentrations is traced. Throughout the observation period, the ozone lidar system, equipped with a solid-state Raman light source, operates reliably, providing accurate monitoring data that capture fluctuations in environmental ozone levels and identify ozone concentration hotspots. This system offers a new technical means for detecting the spatial and temporal distribution of regional atmospheric ozone.
- Publication Date: May 16, 2025
- Vol. 45, Issue 12, 1228008 (2025)
To meet the low polarization sensitivity requirements of space-borne multi-channel imaging spectrometers for atmospheric environment detection, and to overcome the shortcomings of traditional wedge crystal depolarizers, which degrade instrument imaging quality, we design a multi-channel depolarizer based on the elasto-optical effect that causes no image quality loss. The depolarizer also avoids the limitations of newer liquid crystal and metasurface depolarizers, such as narrow band range, low transmittance, and complex preparation. Based on existing research on photoelastic modulators, the complete theoretical analysis formula is derived for photoelastic depolarizers, along with an examination of the factors influencing the depolarizing effect and the optimal depolarizing conditions. To meet the multi-channel detection requirements of atmospheric environment detection imaging spectrometers, we propose a multi-channel depolarization method. This method uses the driving term to compensate for the delayed dispersion, which addresses the inherent wavelength dependence of the time-type photoelastic depolarizer and ensures that the residual polarization of each channel of the atmospheric detection imaging spectrometer is less than 2%.
In this paper, the complete theoretical calculation formula for the photoelastic depolarizer is derived from the Mueller matrix and the Stokes vector. The degree of polarization is analyzed with respect to key factors, such as the frequency of the photoelastic modulator, the peak delay, the polarization angle of the incident light, the integration time, and the angle between the optical axes of the two photoelastic modulators. The peak delay of the photoelastic modulator is 2.405 rad when the optimal depolarization is achieved. A method for compensating the delayed dispersion of the photoelastic modulator using the driving term is proposed, based on the relationship between the depolarization spectrum width, central wavelength, and residual polarization degree. This method can effectively overcome the inherent wavelength dependence of the photoelastic modulator, thereby enabling multi-channel simultaneous and efficient depolarization of the spaceborne atmospheric detection imaging spectrometer, which lays a theoretical foundation for improving the accuracy of atmospheric parameter inversion.
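The role of the 2.405 rad peak delay can be illustrated numerically with Mueller matrices and Stokes vectors. The sketch below time-averages the output Stokes vector of one or two sinusoidally driven retarders over an integration window. The modulator frequencies (42 kHz and 47 kHz), the 1 ms integration time, and the function names are assumptions made for illustration; the paper's full analytical formula is not reproduced here.

```python
import numpy as np


def retarder(delta, axis_45=False):
    """Mueller matrices, shape (T, 4, 4), of linear retarders.

    delta is an array of retardances; the fast axis is at 0 degrees,
    or at 45 degrees when axis_45 is True.
    """
    c, s = np.cos(delta), np.sin(delta)
    m = np.zeros(delta.shape + (4, 4))
    m[..., 0, 0] = 1.0
    if axis_45:
        m[..., 1, 1], m[..., 1, 3] = c, -s
        m[..., 2, 2] = 1.0
        m[..., 3, 1], m[..., 3, 3] = s, c
    else:
        m[..., 1, 1] = 1.0
        m[..., 2, 2], m[..., 2, 3] = c, s
        m[..., 3, 2], m[..., 3, 3] = -s, c
    return m


def residual_dop(angle_deg, peak_delay=2.405, dual=True,
                 f1=42e3, f2=47e3, t_int=1e-3, n=20000):
    """Degree of polarization of the time-averaged output Stokes vector."""
    t = np.arange(n) * (t_int / n)
    a = np.radians(angle_deg)
    s_in = np.array([1.0, np.cos(2 * a), np.sin(2 * a), 0.0])  # linear input
    out = retarder(peak_delay * np.sin(2 * np.pi * f1 * t)) @ s_in
    if dual:
        m2 = retarder(peak_delay * np.sin(2 * np.pi * f2 * t), axis_45=True)
        out = np.einsum("tij,tj->ti", m2, out)
    mean = out.mean(axis=0)
    return float(np.linalg.norm(mean[1:]) / mean[0])


single_along_axis = residual_dop(0.0, dual=False)  # light polarized along the axis
dual_all_angles = [residual_dop(a) for a in (0.0, 22.5, 45.0, 90.0)]
dual_detuned = residual_dop(0.0, peak_delay=2.415)  # 0.01 rad delay error
```

In this model a single modulator leaves light polarized along its axis fully polarized, while two modulators at 45° depolarize every linear input. The average of cos(δ₀ sin ωt) over time is the Bessel function J₀(δ₀), which vanishes near 2.405 rad; a 0.01 rad delay error then leaves a residual of roughly 0.5%, consistent with the sensitivity quoted in the results.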
A single photoelastic modulator cannot effectively depolarize linearly polarized light in all directions. The dual photoelastic modulator structure can achieve omnidirectional depolarization, but the optical axes of the two modulators need to be placed at 45° to avoid the phenomenon where the residual degree of polarization oscillates with the integration time (Fig. 6). Theoretical calculations show that the best depolarization effect can be achieved under multiple peak delays. Selecting the first peak delay of 2.405 rad can minimize the peak-to-peak value of the driving voltage and reduce the difficulty of circuit design (Fig. 3). The peak delay of the photoelastic modulator is greatly affected by the driving circuit. Under the existing circuit stability conditions, a delay deviation of 0.01 rad will lead to a 0.5% decrease in depolarization (Table 2). The influence of incident light polarization angle and integration time is relatively small and easy to control. In this paper, the relationship is established between the depolarization spectrum width, center wavelength, and residual polarization degree. In addition, based on the establishment of this relationship, a method using the driving term to compensate for the delayed dispersion of the photoelastic modulator is proposed to achieve multi-channel depolarization of the photoelastic depolarizer. However, the maximum depolarization degree that each channel can achieve is limited by the channel bandwidth (Fig. 8). Therefore, it is necessary to divide channels with wide bandwidths into two or more segments for depolarization. Finally, we design a photoelastic depolarizer, which allows each channel of the four-channel imaging spectrometer to achieve a depolarization degree of more than 98% under the adjustment of the peak-to-peak value of the five driving voltages (Tables 3 and 4).
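The bandwidth limit and the benefit of splitting wide channels can be illustrated with a simplified model. The sketch assumes that, for a fixed drive voltage, the peak delay scales inversely with wavelength (material dispersion is ignored) and that the residual polarization at a single wavelength is |J₀(peak delay)|. It reports the worst case over a band, which is a conservative stand-in for the paper's relationship between spectrum width, center wavelength, and residual polarization. The channel limits are hypothetical.

```python
import numpy as np

OPTIMAL_PEAK_DELAY = 2.405  # rad, value quoted in the text


def bessel_j0(x):
    """J0(x) from its integral form: the mean of cos(x sin(tau)) over a period."""
    tau = np.arange(2048) * (2.0 * np.pi / 2048)
    grid = np.multiply.outer(np.asarray(x, dtype=float), np.sin(tau))
    return np.mean(np.cos(grid), axis=-1)


def worst_residual(band_nm, tuned_nm, samples=201):
    """Largest |J0| across a band when the drive gives 2.405 rad at tuned_nm."""
    wl = np.linspace(band_nm[0], band_nm[1], samples)
    delay = OPTIMAL_PEAK_DELAY * tuned_nm / wl  # assumed 1/wavelength scaling
    return float(np.max(np.abs(bessel_j0(delay))))


def split_band(band_nm, parts):
    """Divide a band into equal sub-bands, each to be driven separately."""
    edges = np.linspace(band_nm[0], band_nm[1], parts + 1)
    return [(edges[i], edges[i + 1]) for i in range(parts)]


wide = (480.0, 520.0)  # hypothetical wide channel, nm
whole = worst_residual(wide, tuned_nm=500.0)
halves = max(worst_residual(b, tuned_nm=0.5 * (b[0] + b[1]))
             for b in split_band(wide, 2))
narrow = worst_residual((495.0, 505.0), tuned_nm=500.0)
```

In this model, retuning the drive voltage for each sub-band roughly halves the worst-case residual, which matches the qualitative argument for dividing wide channels into two or more segments.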
To solve the inherent limitations of traditional wedge crystal depolarizers and time-type depolarizers, we propose a method for realizing multi-channel depolarization by extending the theoretical formula for photoelastic depolarizers and analyzing the influence of key parameters. The dual photoelastic modulator structure can effectively realize the depolarization of omnidirectional linearly polarized light, and selecting the first peak delay can reduce the difficulty of designing the driving circuit. A multi-channel depolarization technique is also proposed, which uses the driving term to compensate for delayed dispersion. By adjusting the five peak-to-peak voltage values to drive the photoelastic depolarizer, the depolarization degree of each channel in the four-channel spaceborne atmospheric detection imaging spectrometer can exceed 98%. The depolarizer offers advantages such as minimal image quality loss and autonomous switching between application channels, which makes it highly promising for future applications. Before practical implementation, however, it is necessary to consider the influence of environmental factors, calibration and installation errors, and driving circuit stability on the depolarizer’s performance, to enable the engineering application of the photoelastic depolarizer in spaceborne imaging spectrometers.
- Publication Date: Jun. 18, 2025
- Vol. 45, Issue 12, 1228010 (2025)
Coastal and estuarine environments often present complex optical conditions due to high turbidity, strong riverine influence, and diverse phytoplankton assemblages. Remote sensing reflectance (Rrs) measured from above the water’s surface is crucial for characterizing these waters and retrieving key bio-optical variables, such as suspended particulate matter (SPM) and chlorophyll-a (Chl-a). However, the accuracy and stability of Rrs retrievals can be hindered by various factors, such as skylight reflection, sun glint, whitecaps, and fluctuations in environmental conditions like wind speed and viewing geometry. In this study, we investigate the performance of the shipborne apparent optical properties observation system (AOP-Cruise) in the Yangtze River estuary and adjacent waters, and conduct a systematic comparison of four commonly used on-water spectral correction methods (RSOA, G01, M99, and J20) across varying water types, wind speeds, and observation angles.
Field observations are carried out in July 2023 in the Yangtze River estuary and adjacent coastal waters, covering salinities from 16 to 31 psu and a wide range of turbidity levels. Thirty-two stations are sampled, and over 2000 hyperspectral measurements are obtained during daytime cruises using the AOP-Cruise system. The system continuously measures three above-surface radiometric quantities [Lt(λ), Ls(λ), and Es(λ)] with high spectral resolution. Before deriving Rrs(λ), the measurements are interpolated onto a 1 nm grid from 320 nm to 950 nm. Four spectral correction methods are applied: M99 (fixed reflectance ρ≈0.028 and near-infrared residual correction), G01 (ρ≈0.021 with specific near-infrared channels), J20 (residual skylight removal near 810 nm), and RSOA (a spectral optimization approach modeling ρ(λ) and minimizing residual biases). After spectral calibration, quality checks (e.g., filtering out high sun zenith angles), and Savitzky-Golay smoothing, the resulting Rrs spectra from each method are used in empirical single-band (555 nm) and fluorescence-based retrievals of SPM and Chl-a, respectively. The SPM and Chl-a measurements taken at each station are used for model validation and comparison.
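The fixed-ρ correction step can be sketched as follows. The example resamples the three radiometric spectra onto the 1 nm grid and applies Rrs(λ) = [Lt(λ) − ρ·Ls(λ)] / Es(λ) with ρ = 0.028, as quoted for M99. The optional near-infrared offset subtraction and its band limits are assumptions, since the abstract does not specify the reference wavelengths, and the function names are hypothetical.

```python
import numpy as np

GRID_NM = np.arange(320.0, 951.0, 1.0)  # 1 nm grid from 320 nm to 950 nm


def to_grid(wavelength_nm, spectrum):
    """Linearly interpolate a measured spectrum onto the common 1 nm grid."""
    return np.interp(GRID_NM, wavelength_nm, spectrum)


def rrs_fixed_rho(lt, ls, es, rho=0.028, nir_band=None):
    """Remote sensing reflectance with a fixed sea-surface reflectance factor.

    lt, ls, es: total radiance, sky radiance, and downwelling irradiance on
    GRID_NM. If nir_band = (lo_nm, hi_nm) is given, the mean Rrs in that band
    is subtracted as a residual offset (band choice is an assumption).
    """
    rrs = (lt - rho * ls) / es
    if nir_band is not None:
        sel = (GRID_NM >= nir_band[0]) & (GRID_NM <= nir_band[1])
        rrs = rrs - rrs[sel].mean()
    return rrs
```

Switching to the G01-style factor only requires passing `rho=0.021`; spectrally varying ρ(λ), as in RSOA, would need an array in place of the scalar and a separate optimization step that is not shown here.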
Over 80% of the processed Rrs data have high spectral quality scores, which demonstrates that the four correction methods yield plausible Rrs under favorable conditions (wind speed <5 m·s⁻¹, viewing azimuth near 135°, and solar zenith angle <60°). Differences emerge in the blue-green region (412‒555 nm), where G01 and J20 tend to overestimate Rrs, whereas M99 exhibits a closer alignment with RSOA. The linear comparison with RSOA indicates that G01 and J20 have higher slopes (~1.18 and ~1.17, respectively), while M99 has a slope near unity and a lower mean absolute percentage deviation (~27%). When the wind speed exceeds 5 m·s⁻¹ or the viewing azimuth deviates more than ±10° from 135°, the derived Rrs display larger variances due to increased surface roughness, whitecaps, and greater sky-glint contamination. Under these conditions, RSOA’s adaptive spectral approach and M99’s near-infrared correction remain relatively robust, while G01 and J20 show more pronounced biases. Retrievals of SPM and Chl-a from the four methods show a good correlation with in situ measurements (mean R² around 0.74‒0.77 for SPM and ~0.75 for Chl-a), though higher SPM concentrations (>20 mg·L⁻¹) introduce larger scatter, with G01 and J20 frequently overestimating and M99 slightly underestimating. For Chl-a, all four approaches are relatively consistent across low-to-moderate concentrations. The spatial distributions of SPM and Chl-a show nearshore maxima and offshore decreases, which highlights both natural gradients and method-dependent differences. The analysis by water type (clear vs. turbid) indicates that RSOA achieves lower variability in clearer waters, whereas M99 performs better in more turbid areas, which reflects each method’s sensitivity to wind speed, geometry, and water optical properties.
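The comparison statistics mentioned above (slope against RSOA, mean absolute percentage deviation, and R²) can be computed as in the sketch below. The exact definitions used by the authors are not given in the abstract, so the ordinary least-squares slope and the reference-normalized deviation used here are assumptions, as is the function name.

```python
import numpy as np


def compare_to_reference(values, reference):
    """Return slope, intercept, MAPD (%), and R^2 of values against reference."""
    values = np.asarray(values, dtype=float)
    reference = np.asarray(reference, dtype=float)
    slope, intercept = np.polyfit(reference, values, 1)  # least-squares line
    mapd = 100.0 * np.mean(np.abs(values - reference) / np.abs(reference))
    r2 = np.corrcoef(reference, values)[0, 1] ** 2
    return slope, intercept, mapd, r2


# Synthetic example: a method that overestimates the reference by 18%.
reference = np.linspace(0.002, 0.03, 50)
method = 1.18 * reference
slope, intercept, mapd, r2 = compare_to_reference(method, reference)
```

The same function applies unchanged to the SPM and Chl-a retrievals when the in situ measurements are passed as the reference.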
In summary, we confirm the applicability of RSOA, G01, M99, and J20 for on-water spectral correction and subsequent SPM/Chl-a retrievals in the Yangtze River estuary and adjacent regions. Under near-ideal conditions, all methods produce consistent Rrs results. However, G01 and J20 tend to overestimate in the blue-green domain, while M99 aligns more closely with RSOA. Higher wind speeds or non-standard viewing angles exacerbate the differences between methods, which highlights the complexities of skylight reflection and whitecap effects. While all methods demonstrate good overall performance in retrieving SPM and Chl-a, RSOA generally provides greater stability in clear waters under moderate wind conditions, whereas M99 shows stronger robustness in higher turbidity or wind speeds. G01 and J20 are found to be sensitive to geometric or surface perturbations. We underscore the potential of the domestic AOP-Cruise system for real-time hyperspectral observations and stress the importance of choosing suitable correction methods based on local water types and environmental conditions. Future efforts could focus on refining site-specific calibration or extending the comparison to a wider range of coastal and inland water environments to further improve measurement reliability and accuracy.
- Publication Date: Jun. 18, 2025
- Vol. 45, Issue 12, 1230001 (2025)