
- Chinese Optics Letters
- Vol. 23, Issue 5, 050601 (2025)
Abstract
1. Introduction
According to Cisco’s 2020 Internet Report, the number of Internet users grew at a compound annual growth rate of around 6% from 2018 to 2023, indicating that future networks will carry even more traffic and that efficient data transmission is essential. Coherent optical communication has demonstrated significant advantages in tackling this issue. A common approach to enhancing spectral efficiency is to increase the modulation order, and ultrahigh-order signals have been validated in probabilistic-shaping (PS) experiments involving 1024/4096-quadrature amplitude modulation (QAM)[1,2]. However, the transmission of ultrahigh-order signals often relies on complex hardware and incurs high transmission costs, which hinders wide deployment. Thus, advanced digital signal processing (DSP) algorithms are becoming indispensable to compensate for signal impairments.
For ultrahigh-order QAM signals, it is important to effectively equalize and mitigate the various linear and nonlinear impairments; nonlinear compensation (NLC) represents one of the most formidable challenges[3]. The Kerr nonlinearity of fiber sets a fundamental limit on the information capacity of fiber communications, known as the nonlinear Shannon limit, particularly because of its interaction with the amplified spontaneous emission noise from optical amplifiers. Meanwhile, as the number of constellation points increases in ultrahigh-order QAM signals, the Euclidean distance between them diminishes significantly, and thus high-order signals are more susceptible to noise and linear/nonlinear impairments. Consequently, these ultrahigh-order signals require a higher optical signal-to-noise ratio (OSNR).
Main nonlinear compensation techniques include digital backpropagation (DBP)[3], Volterra series-based nonlinear equalizers (VNLEs)[4], and phase-conjugated twin waves (PCTWs)[5]. DBP addresses impairments by solving the inverse propagation nonlinear Schrödinger equation (NLSE), but its high complexity limits commercial viability. VNLEs employ Volterra series transfer functions for modeling, which demands considerable computational resources and struggles with high-order nonlinear effects. PCTWs mitigate the first-order nonlinear effects through digital coherent superposition, but they halve spectral efficiency.
Neural networks (NNs), leveraging their powerful fitting and analytical capabilities, are well suited for nonlinear equalization in fiber channels. Recently, there has been a growing interest in employing machine-learning (ML)[6] techniques to address nonlinear impairments in coherent optical communications. Artificial neural networks (ANNs) for 16-QAM[7], convolutional neural networks (CNNs) for 16-QAM[8], and recurrent neural networks (RNNs)[9] for 64-QAM have all been effectively utilized in coherent optical communication. Moreover, experiments have demonstrated that long short-term memory (LSTM) networks can successfully mitigate impairments in 16-QAM coherent optical transmission systems[10,11], highlighting the promise of LSTM-based postprocessing for handling such challenges. In previous studies, NNs have primarily been applied to lower-order modulation signals, where additional linear equalizers were usually indispensable prior to the NN structure. Ultrahigh-order signals, however, pose significant challenges due to the sharp reduction in Euclidean distance between constellation points, making them extremely sensitive to both linear and nonlinear impairments. Even minor distortions can cause rapid signal degradation, and there has been limited research into the use of NNs for such high-order signals.
As a result, it is challenging to apply conventional NNs to ultrahigh-order signals in coherent optical transmission systems. In this Letter, we introduce a temporal feature-based memory (TFM) NN equalizer for degradation compensation in ultrahigh-order coherent optical transmission. Experiments with 1024/4096-QAM signals are conducted to verify that the proposed TFM NN equalizer outperforms Volterra series-based nonlinear equalizers. Raw spectral efficiencies of 16.190 and 21.188 bit/s/Hz are achieved for 1024-QAM and 4096-QAM signals at the normalized generalized mutual information (NGMI) thresholds of 0.881 (20%) and 0.778 (25%), respectively.
2. Principle of the Method
Before delving into the NN architectures, it is essential to clarify the dataset utilized. Signal equalization specifically addresses interference between the current symbol and adjacent symbols. The dataset consists of multiple sets of complete signals acquired under varying OSNR scenarios. The NN uses four features of the signal, i.e., the I and Q components of each of the X and Y polarization signals. Each signal set comprises the real and imaginary components of the X/Y polarization signal over 10,000 symbols, capturing information from the current symbol and its surrounding context. The training process is conducted offline to ensure computational efficiency during operational phases. To avoid potential bias towards specific pseudo-random binary sequence (PRBS) characteristics, training uses different PRBS sequences, and an early stopping mechanism is incorporated based on mean squared error (MSE) validation results evaluated every 25 epochs. Training terminates if the MSE stagnates after 250 epochs. All input data undergo zero-mean normalization before entering the network. Here, the NNs use the MSE as the loss function for regression tasks, which has proven effective in signal processing[10]. The Adam optimizer is employed, starting with an initial learning rate of 0.008, which is reduced to 0.1 times its value every 500 epochs. The training spans at most 1000 epochs with early stopping. To ensure robustness, the data used for subsequent analysis and plotting are independent of the training set and generated using different PRBS sequences.
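The training schedule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the learning-rate decay is read as a multiplication by 0.1 every 500 epochs, "stagnates" is read as "no improvement of the validation MSE over 250 epochs", and the names `zero_mean`, `learning_rate`, and `EarlyStopper` are hypothetical helpers rather than part of the authors' code.

```python
from statistics import fmean


def zero_mean(feature):
    """Zero-mean normalization of one feature column (e.g. the I part of X polarization)."""
    mu = fmean(feature)
    return [v - mu for v in feature]


def learning_rate(epoch, base_lr=0.008, gamma=0.1, step=500):
    """Step decay: multiply the rate by `gamma` once every `step` epochs (assumed reading)."""
    return base_lr * gamma ** (epoch // step)


class EarlyStopper:
    """Stop when the validation MSE has not improved for `patience` epochs.

    The MSE is only inspected every `check_every` epochs, as in the text.
    The patience interpretation is an assumption.
    """

    def __init__(self, check_every=25, patience=250):
        self.check_every = check_every
        self.patience = patience
        self.best = float("inf")
        self.best_epoch = 0

    def should_stop(self, epoch, val_mse):
        if epoch % self.check_every != 0:
            return False  # not a validation epoch
        if val_mse < self.best:
            self.best, self.best_epoch = val_mse, epoch
            return False
        return epoch - self.best_epoch >= self.patience


def simulate(max_epochs=1000):
    """Toy run: the validation MSE improves once at epoch 50 and then stays flat."""
    stopper = EarlyStopper()
    for epoch in range(max_epochs):
        val_mse = 1.0 if epoch < 50 else 0.5
        if stopper.should_stop(epoch, val_mse):
            return epoch
    return max_epochs
```

In a real training loop, `learning_rate(epoch)` would be handed to the optimizer at the start of each epoch and `should_stop` would be called with the measured validation MSE.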
The architecture of the proposed TFM NN is depicted in Fig. 1. The first layer of the TFM NN is a temporal convolutional network (TCN), a convolutional architecture whose structural adjustments are designed specifically for time-series data, so that features of sequence data can be extracted more effectively. The signals first pass through the TCN layer for feature extraction, which changes the feature dimension while keeping the sequence length constant. The TFM NN’s architecture is designed to optimize feature extraction for the bidirectional long short-term memory (Bi-LSTM) layer while balancing complexity and retaining the key characteristics of ultrahigh-order signals. The enriched feature set enables the subsequent Bi-LSTM to model complex dependencies and interactions among the signals, which can enhance the nonlinear compensation performance.
Figure 1. Architecture of the temporal feature-based memory network.
Each residual block in the TCN module integrates two dilated convolutional layers, which expand the receptive field of the kernel without increasing the parameter count. Such a setup allows each convolutional output to capture a broader range of information. Layer normalization then stabilizes the inputs at each layer, thereby accelerating training and enhancing generalization. A tanh activation function further speeds convergence and optimizes weight updates, while a dropout layer at the module’s end mitigates overfitting to increase the model’s robustness and homogeneity. The tanh activation function is formulated as
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}.$$
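As a minimal illustration of the normalization and activation steps inside a residual block, the sketch below applies layer normalization followed by tanh to one feature vector. It assumes a normalization without learnable gain and bias and omits dropout (which is inactive at inference); the function names are hypothetical.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one feature vector to zero mean and (nearly) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def tanh(x):
    """tanh written out from its exponential definition."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def norm_then_tanh(features):
    """Layer normalization followed by the tanh activation."""
    return [tanh(v) for v in layer_norm(features)]
```

Because tanh is bounded, every output lies strictly between -1 and 1 regardless of the scale of the incoming features.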
The TCN offers enhanced feature extraction by stacking multiple convolutional layers, which exploit filters with varying sizes to capture temporal features across different scales. This multiscale approach makes the TCN particularly sensitive to local dependencies in sequential data. Therefore, the TCN can improve nonlinear compensation performance, especially for ultrahigh-order signals, where it is essential to capture intricate timescale dependencies. In the TCN as shown in Fig. 1, dilated convolutions are first utilized to increase the receptive field. Suppose the one-dimensional input sequence is $x$ and the filter is $f: \{0, \ldots, k-1\} \to \mathbb{R}$. The dilated convolution operation for a sequence element $s$ is defined as
$$F(s) = \sum_{i=0}^{k-1} f(i)\, x_{s - d \cdot i},$$
where $k$ is the filter size and $d$ is the dilation factor[12].
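A dilated causal convolution of this kind can be sketched directly in Python. The example assumes the standard form F(s) = sum over i of f(i) * x[s - d*i] from the TCN literature[12], with the input treated as zero before its first sample (an assumed padding choice); `dilated_causal_conv` is a hypothetical name.

```python
def dilated_causal_conv(x, f, d):
    """Dilated causal convolution: out[s] = sum_i f[i] * x[s - d*i].

    x : input sequence (list of floats)
    f : filter taps, f[0] weights the current sample
    d : dilation factor (spacing between the taps)
    Samples before the start of x are treated as zero, so the output
    has the same length as the input.
    """
    out = []
    for s in range(len(x)):
        acc = 0.0
        for i, tap in enumerate(f):
            j = s - d * i
            if j >= 0:  # causal: only current and past samples contribute
                acc += tap * x[j]
        out.append(acc)
    return out


# With two unit taps, d = 1 sums neighbors and d = 2 skips one sample.
print(dilated_causal_conv([1, 2, 3, 4, 5], [1, 1], d=1))  # [1.0, 3.0, 5.0, 7.0, 9.0]
print(dilated_causal_conv([1, 2, 3, 4, 5], [1, 1], d=2))  # [1.0, 2.0, 4.0, 6.0, 8.0]
```

Increasing `d` lets the same two taps reach further into the past without adding parameters, which is the property the TCN relies on.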
Figure 2. Schematic of the TCN.
The receptive field of a TCN is determined by the network depth $n$, filter size $k$, and dilation factor $d$[12]. To ensure stability in deeper TCNs and optimize feature extraction, residual layers are employed, consisting of two causal convolutional layers followed by nonlinear activation layers. Layer normalization is applied after each convolution to maintain stability, while dropout layers help mitigate overfitting. To match input and output dimensions, a 1 × 1 convolution is used for proper tensor alignment before element-wise addition. Each convolutional layer in the residual blocks incorporates weight normalization to prevent gradient explosion. To enhance generalization and reduce overfitting, L2 regularization is included in the TCN, constraining the L2 norm of the parameters[13]. In this work, each residual block comprises two convolutional layers with a stride of one. The residual network architecture incorporates shortcut connections, through which certain layers can be bypassed to pass the original data directly to the next layer. These additional connections do not increase the model’s complexity.
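To make the dependence of the receptive field on depth, filter size, and dilation concrete, the sketch below evaluates the usual closed form for a stack of stride-one causal convolutions, 1 + (convolutions per block) × (filter size − 1) × (sum of dilations), and checks it against an impulse response. The dilation schedule 1, 2, 4, 8 and the filter size 3 are illustrative assumptions, not the values used in this Letter, and the shortcut connections are omitted because they do not extend the receptive field.

```python
def receptive_field(kernel_size, dilations, convs_per_block=2):
    """Closed-form receptive field of stacked stride-1 causal convolutions."""
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


def causal_conv_ones(x, k, d):
    """Causal dilated convolution with an all-ones filter of size k."""
    return [sum(x[s - d * i] for i in range(k) if s - d * i >= 0)
            for s in range(len(x))]


def measured_receptive_field(kernel_size, dilations, convs_per_block=2, length=256):
    """Push an impulse through the stack and measure how far it spreads."""
    y = [1.0] + [0.0] * (length - 1)  # impulse at t = 0
    for d in dilations:
        for _ in range(convs_per_block):
            y = causal_conv_ones(y, kernel_size, d)
    last_nonzero = max(i for i, v in enumerate(y) if v != 0)
    return last_nonzero + 1


# Hypothetical configuration: 4 residual blocks, filter size 3.
print(receptive_field(3, [1, 2, 4, 8]))           # 61
print(measured_receptive_field(3, [1, 2, 4, 8]))  # 61
```

Doubling the dilation in each block makes the receptive field grow roughly exponentially with depth while the parameter count grows only linearly.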
By exploiting TCN layers for feature extraction, Bi-LSTM networks can handle complex ultrahigh-order QAM signals more effectively with a simple structure. The proposed approach not only boosts performance but also offers an efficient and scalable solution for nonlinear equalization. The Bi-LSTM network retains the strengths of LSTM by effectively capturing temporal dependencies and integrating gate mechanisms to manage information flow, which addresses the vanishing-gradient problem. Its bidirectional nature enhances information integration from past and future symbols, mitigating intersymbol interference (ISI) and providing a richer temporal context[14]. The data flow of the Bi-LSTM layer is shown in Fig. 1. The hidden layer of the Bi-LSTM network is essentially two independent LSTM layers, and the input sequences are fed into the two LSTM layers in forward and reverse order, respectively. The forward and backward feature vectors extracted by the two layers are combined to obtain the output vector.
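The forward/reverse data flow of a Bi-LSTM can be illustrated with a deliberately tiny scalar LSTM cell. This is a sketch under assumptions: the weights are hypothetical placeholders, the hidden state is a single number, and the two directions are combined by pairing (concatenating) their outputs, which is one common choice and not necessarily the one used in this Letter.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, w):
    """One step of a scalar LSTM cell; `w` maps gate name -> (w_x, w_h, bias)."""
    def gate(name, activation):
        w_x, w_h, b = w[name]
        return activation(w_x * x + w_h * h + b)

    i = gate("input", sigmoid)    # how much new information to write
    f = gate("forget", sigmoid)   # how much of the old cell state to keep
    o = gate("output", sigmoid)   # how much of the cell state to expose
    g = gate("cell", math.tanh)   # candidate cell update
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


def run_lstm(xs, w):
    """Run the cell over a sequence and return the hidden state at every step."""
    h, c, hs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        hs.append(h)
    return hs


def bi_lstm(xs, w_fwd, w_bwd):
    """Two independent LSTMs: one reads xs forward, the other reads it reversed."""
    fwd = run_lstm(xs, w_fwd)
    bwd = run_lstm(xs[::-1], w_bwd)[::-1]  # re-align to the original time order
    return list(zip(fwd, bwd))             # (forward, backward) pair per symbol


# Hypothetical weights, identical for all gates, just to make the sketch runnable.
W = {name: (0.5, 0.5, 0.0) for name in ("input", "forget", "output", "cell")}
features = bi_lstm([0.5, -1.0, 0.25], W, W)
```

At each position the forward element depends only on the current and earlier symbols and the backward element only on the current and later symbols, so the pair sees context on both sides of the symbol being equalized.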
In addition to performance, it is essential to comprehensively evaluate the complexity of the proposed method. Based on the characteristics of the TCN architecture, the complexity of the structure is outlined,
In general, the number of floating-point operations (FLOPs) serves as a standard measure of hardware complexity. Based on the parameterization outlined in this Letter, the complexity of the proposed algorithm is calculated to be 4571 FLOPs. The FLOP count is a valid complexity measure for all NN approaches because all floating-point computation is ultimately carried out at the hardware level[15].
3. Experimental Setup and Result
Figure 3 depicts the experimental setup for probabilistic-shaping polarization-division multiplexed (PS-PDM) 1024/4096-QAM signals over 80 km of standard single-mode fiber (SSMF) transmission. Owing to the available hardware, the proposed algorithm is experimentally verified with 4-GBd 1024/4096-QAM signals. However, it is notable that the experimental architecture and the proposed TFM NN are independent of the signal rate. At the transmitter, a PRBS is used by the PS encoder to generate the amplitude sequence through bit-to-symbol mapping. Pilot symbols are added for the receiver DSP module. The data undergo resampling and pulse shaping using a root-raised cosine (RRC) finite impulse response (FIR) filter with a roll-off factor of 0.05. The processed data are fed into an 8-GSa/s, 14-bit arbitrary waveform generator (AWG) for digital-to-analog conversion (DAC). The four-channel electrical signals drive a Mach–Zehnder modulator (MZM) to create the optical 1024/4096-QAM signal. After modulation, the signal is amplified by an erbium-doped fiber amplifier (EDFA). A variable optical attenuator (VOA) varies the input power to the 80-km SSMF, where approximately –4 dBm is optimum for the 4096-QAM signal. A second VOA adjusts the OSNR for bit error rate (BER) measurement, while a third VOA ensures a fixed optical power to the coherent receiver.
Figure 3. Experimental setup of PS-PDM 1024/4096-QAM coherent optical transmission over the 80-km SSMF.
At the receiver, the signal first passes through an optical bandpass filter (OBPF) to suppress amplified spontaneous emission (ASE) noise from the EDFAs. The signal is then sent to a coherent receiver along with the local oscillator (LO). The experiments utilize lasers with a linewidth of less than 1 kHz and a wavelength of 1550.112 nm. The coherent receiver comprises two 90° optical hybrids and four balanced photodetectors with a bandwidth of 33 GHz. Analog-to-digital conversion is performed by a 50-GSa/s real-time oscilloscope with a bandwidth of 23 GHz, and the acquired data are processed by an offline DSP module comprising data preprocessing, the TFM NN equalizer, and symbol decision for signal recovery. The data preprocessing at the receiver includes conventional DSP algorithms such as dispersion compensation, clock recovery, polarization demultiplexing, and carrier recovery.
In this Letter, we assess the performance of PS 1024/4096-QAM signals using ideal rate-adaptive forward error correction (FEC) coding[16,17]. For probabilistically shaped signals with varying entropy values, the NGMI serves as a useful metric for channel measurement: it allows a specific soft-decision forward error correction (SD-FEC) scheme to be evaluated regardless of the modulation format. NGMI is defined as[18]
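A widely used form of this metric for PS-QAM, assumed in the sketch below, is NGMI = 1 - (H - GMI) / log2(M), where H is the source entropy and GMI the generalized mutual information, both in bits per symbol per polarization, and M is the constellation size. The function name `ngmi` and the numerical GMI value are illustrative assumptions and not measured results; the example also assumes that an entropy of 17 bit/symbol for PS-PDM-1024-QAM counts both polarizations, i.e., 8.5 bit/symbol per polarization.

```python
import math


def ngmi(gmi, entropy, constellation_size):
    """Normalized GMI under the assumed definition 1 - (H - GMI) / log2(M).

    gmi, entropy       : bits per symbol per polarization
    constellation_size : M, e.g. 1024 or 4096
    """
    bits_per_symbol = math.log2(constellation_size)
    return 1.0 - (entropy - gmi) / bits_per_symbol


# Illustrative (not measured) numbers for PS-1024-QAM, per polarization:
# H = 8.5 bit/symbol and a hypothetical GMI of 7.31 bit/symbol.
value = ngmi(gmi=7.31, entropy=8.5, constellation_size=1024)
print(round(value, 3))  # 0.881
```

For a uniform constellation, where H equals log2(M), the expression reduces to GMI / log2(M), which is why the same threshold can be compared across shaped and unshaped formats.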
Figures 4(a) and 4(c) illustrate the measured NGMI performance with and without the TFM NN equalizer in back-to-back (BTB) transmission experiments. As the entropy increases, the NGMI performance without the equalizer degrades rapidly, as shown by the red curves in Figs. 4(a) and 4(c). With the proposed TFM NN equalizer, the NGMI performance improves and reaches the desired NGMI threshold, as shown by the blue curves in Figs. 4(a) and 4(c). The constellation diagrams with and without the TFM NN equalizer for 1024-QAM/4096-QAM are shown as insets in Figs. 4(a) and 4(c); the fuzzy and indistinct constellation diagrams become convergent and distinct with the TFM NN equalizer. The entropy values of 17 bit/symbol and 21.4 bit/symbol, used for 1024-QAM and 4096-QAM, respectively, are highlighted for further fiber transmission experiments.
Figure 4. Measured NGMI performance. (a) NGMI versus H for PS-1024-QAM-BTB; (b) NGMI versus OSNR for PS-1024-QAM-80 km; (c) NGMI versus H for PS-4096-QAM-BTB; (d) NGMI versus OSNR for PS-4096-QAM-80 km.
Figures 4(b) and 4(d) present the measured NGMI performance with the conventional VNLE and the proposed TFM NN in the 80-km SSMF transmission experiments. Similar performance improvements have been observed in Figs. 4(b) and 4(d). In Fig. 4(b), it is evident that at a raw spectral efficiency (SE) of 16.190 bit/s/Hz, the conventional VNLE structure struggles to reach the LDPC threshold, even at an OSNR as high as 36 dB. Specifically, the VNLE algorithm stabilizes around an NGMI convergence value of approximately 0.8. In contrast, the TFM NN equalizer demonstrates a marked improvement, converging closer to an NGMI value of 0.9. The above trend is similarly observed in the 4096-QAM signals. Finally, for the 1024-QAM signal, at the NGMI threshold with 20% LDPC, the raw SE of 16.190 bit/s/Hz has been achieved. For the 4096-QAM signal, at the NGMI threshold with 25% LDPC, the raw SE of 21.188 bit/s/Hz is achieved.
4. Conclusion
In conclusion, we propose a temporal feature-based memory NN for PS-PDM ultrahigh-order QAM coherent optical transmission and experimentally validate it for PS-PDM 1024/4096-QAM signals over 80-km SSMF transmission. The proposed TFM NN equalizer exploits a temporal convolutional network as a feature extraction layer to significantly improve the performance of the bidirectional LSTM network, which enables PS-PDM 1024/4096-QAM transmission over the 80-km SSMF with raw spectral efficiencies of 16.190 and 21.188 bit/s/Hz, respectively. Compared with conventional NNs that require additional linear equalizers, the proposed TFM NN equalizer uses a simple structure to effectively compensate for nonlinear distortions of ultrahigh-order QAM signals, and it outperforms conventional Volterra equalizers. The proposed approach offers a cost-effective solution for generating and transmitting ultrahigh-order signals in future commercial high-capacity coherent optical transmission systems.
References
[1] M. Terayama, S. Okamoto, K. Kasai et al. 4096 QAM (72 Gbit/s) single-carrier coherent optical transmission with a potential SE of 15.8 bit/s/Hz in all-Raman amplified 160 km fiber link. 2018 Optical Fiber Communications Conference and Exposition (OFC), 1(2018).
[12] S. Bai, J. Kolter, V. Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv(2018).
[13] C. Cortes, M. Mohri, A. Rostamizadeh. L2 regularization for learning kernels(2012).
