• Chinese Optics Letters
  • Vol. 23, Issue 5, 050601 (2025)
Xuejing Huang, Mingyi Gao*, Jiamin Fan, Yifan Ge, Xiaodi You, and Gangxiang Shen
Author Affiliations
  • Jiangsu Engineering Research Center of Novel Optical Fiber Technology and Communication Network, Suzhou Key Laboratory of Advanced Optical Communication Network Technology, School of Electronic and Information Engineering, Soochow University, Suzhou 215006, China
    DOI: 10.3788/COL202523.050601
    Xuejing Huang, Mingyi Gao, Jiamin Fan, Yifan Ge, Xiaodi You, Gangxiang Shen, "Temporal feature-based memory neural network for probabilistic-shaping polarization-division multiplexed ultrahigh-order QAM coherent optical transmission," Chin. Opt. Lett. 23, 050601 (2025)

    Abstract

    High-speed single-carrier transmission can be achieved by increasing the modulation format cardinality for higher spectral efficiency. However, ultrahigh-order QAM signals are usually more susceptible to various impairments. Hence, we propose a temporal feature-based memory (TFM) neural network (NN) equalizer to effectively mitigate signal impairments in ultrahigh-order QAM transmission. A temporal convolutional network is utilized as a feature extraction layer to significantly improve the performance of the bidirectional long short-term memory network. The TFM-NN equalizer was experimentally validated in a probabilistically shaped polarization-division multiplexed 1024/4096-QAM coherent optical transmission system, and raw spectral efficiencies of 16.190 and 21.188 bit/s/Hz were achieved at the normalized generalized mutual information thresholds.

    1. Introduction

    According to Cisco’s 2020 Internet Report, the number of Internet users grew at a compound annual growth rate of around 6% from 2018 to 2023, indicating that future networks will generate even more traffic, thereby making efficient data transmission essential. Coherent optical communication has demonstrated significant advantages in tackling this issue. To enhance spectral efficiency, it is a common approach to increase signal modulation cardinality, with ultrahigh-order signals validated in probabilistic-shaping (PS) experiments involving 1024/4096-quadrature amplitude modulation (QAM)[1,2]. However, the transmission of ultrahigh-order signals often relies on complex hardware and high transmission costs, which are challenging for wide applications. Thus, advanced digital signal processing (DSP) algorithms are becoming indispensable to compensate for signal impairments.

    For ultrahigh-order QAM signals, it is important to effectively equalize and alleviate various linear and nonlinear impairments; nonlinear compensation (NLC) represents one of the most formidable challenges[3]. The Kerr nonlinearity of fiber establishes a fundamental limit on the information capacity of fiber communications, known as the nonlinear Shannon limit, particularly because of its interaction with the amplified spontaneous emission noise from optical amplifiers. Meanwhile, as the number of constellation points increases in ultrahigh-order QAM signals, the Euclidean distance between them diminishes significantly, and thus high-order signals are more susceptible to noise and linear/nonlinear impairments. Consequently, these ultrahigh-order signals require a higher optical signal-to-noise ratio (OSNR).

    The main nonlinear compensation techniques include digital backpropagation (DBP)[3], Volterra series-based nonlinear equalizers (VNLEs)[4], and phase-conjugated twin waves (PCTWs)[5]. DBP addresses impairments by solving the nonlinear Schrödinger equation (NLSE) for inverse propagation, but its high complexity limits commercial viability. VNLEs employ Volterra series transfer functions for modeling, which demand considerable computational resources and struggle with high-order nonlinear effects. PCTWs mitigate the first-order nonlinear effects through digital coherent superposition, but they halve spectral efficiency.

    Neural networks (NNs), leveraging their powerful fitting and analytical capabilities, are well suited for nonlinear equalization in fiber channels. Recently, there has been a growing interest in employing machine-learning (ML)[6] techniques to address nonlinear impairments in coherent optical communications. Artificial neural networks (ANNs) for 16-QAM[7], convolutional neural networks (CNNs) for 16-QAM[8], and recurrent neural networks (RNNs)[9] for 64-QAM have all been effectively utilized in coherent optical communication. Moreover, experiments have demonstrated that long short-term memory (LSTM) networks can successfully mitigate impairments in 16-QAM coherent optical transmission systems[10,11], highlighting the promise of LSTM-based postprocessing for handling such challenges. In previous studies, NNs have primarily been applied to lower-order modulation signals, where additional linear equalizers were usually indispensable prior to the NN structure. Ultrahigh-order signals, however, pose significant challenges due to the sharp reduction in Euclidean distance between constellation points, making them extremely sensitive to both linear and nonlinear impairments. Even minor distortions can cause rapid signal degradation, and there has been limited research into the use of NNs for such high-order signals.

    As a result, it is challenging to apply conventional NNs for ultrahigh-order signals in coherent optical transmission systems. In this Letter, we introduce a temporal feature-based memory (TFM) NN equalizer for degradation compensation in ultrahigh-order coherent optical transmission. Experiments of 1024/4096-QAM signals have been demonstrated to verify the outstanding performance of the proposed TFM NN equalizer over Volterra series-based nonlinear equalizers. The raw spectral efficiencies of 16.190 and 21.188 bit/s/Hz have been achieved for 1024-QAM and 4096-QAM signals at the normalized generalized mutual information (NGMI) thresholds of 0.881 (20%) and 0.778 (25%), respectively.

    2. Principle of the Method

    Before delving into the NN architectures, it is essential to clarify the dataset utilized. Signal equalization specifically addresses interference between the current symbol and adjacent symbols. The dataset consists of multiple sets of complete signals acquired under varying OSNR scenarios. The NN uses four features of the signal, i.e., the I and Q components of the X- and Y-polarization signals. Each signal set comprises the real and imaginary components of the X/Y polarization signal over 10,000 symbols, capturing information from the current symbol and its surrounding context. The training process is conducted offline to ensure computational efficiency during operational phases. To avoid potential bias towards specific pseudo-random binary sequence (PRBS) characteristics, the training and test data include different PRBS sequences, and an early stopping mechanism is incorporated based on mean squared error (MSE) validation results evaluated every 25 epochs. Training terminates if the MSE stagnates after 250 epochs. All input data undergo zero-mean normalization before entering the network. Here, the NNs use MSE as the loss function for regression tasks, which has proven effective in signal processing[10]. The Adam optimizer is employed with an initial learning rate of 0.008, which is reduced by a factor of 0.1 every 500 epochs. The training spans up to 1000 epochs with early stopping. To ensure robustness, the data used for subsequent analysis and plotting are independent of the training set data and generated using different PRBS sequences.
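The training schedule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the learning rate is taken to be multiplied by 0.1 every 500 epochs, and "stagnates" is read as "no improvement of the validation MSE over 250 epochs". The names `step_decay_lr`, `EarlyStopper`, and `run_schedule` are hypothetical and do not come from the Letter.

```python
def step_decay_lr(epoch, initial_lr=0.008, factor=0.1, step=500):
    """Learning rate at a 0-based epoch, assuming multiplicative step decay."""
    return initial_lr * factor ** (epoch // step)


class EarlyStopper:
    """Signals a stop when the validation MSE has not improved for `patience` epochs."""

    def __init__(self, patience=250, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_mse = float("inf")
        self.best_epoch = 0

    def update(self, epoch, val_mse):
        """Record one validation result; return True if training should stop."""
        if val_mse < self.best_mse - self.min_delta:
            self.best_mse = val_mse
            self.best_epoch = epoch
            return False
        return epoch - self.best_epoch >= self.patience


def run_schedule(val_mse_at, max_epochs=1000, check_every=25):
    """Walk through the epochs and return the epoch at which training ends.

    `val_mse_at(epoch)` stands in for a real validation pass.
    """
    stopper = EarlyStopper()
    for epoch in range(max_epochs):
        lr = step_decay_lr(epoch)  # would be handed to the optimizer here
        if epoch % check_every == 0 and stopper.update(epoch, val_mse_at(epoch)):
            return epoch
    return max_epochs
```

With a validation MSE that never improves, `run_schedule` ends at epoch 250; with one that keeps improving, it runs the full 1000 epochs.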

    The architecture of the proposed TFM NN is depicted in Fig. 1. A temporal convolutional network (TCN) forms the first layer of the TFM NN. It is designed specifically for time-series data: through structural adjustments to ordinary convolutional networks, it processes sequence data efficiently so that data features can be better extracted. Initially, the signals pass through the TCN layer for feature extraction, which alters the feature dimensions while keeping the sequence length constant. The TFM NN’s architecture is designed to optimize feature extraction for the bidirectional long short-term memory (Bi-LSTM) layer while balancing complexity and retaining the key characteristics of ultrahigh-order signals. The enriched feature set enables the following Bi-LSTM to model complex dependencies and interactions among the signals, which can enhance the nonlinear compensation performance.

    Figure 1. Architecture of the temporal feature-based memory network.

    Each residual block in the TCN module integrates two dilated convolutional layers, expanding the receptive field of the kernel without increasing the parameter count. Such a setup allows each convolutional output to capture a broader range of information. After that, layer normalization stabilizes the inputs at each layer, thereby accelerating training and enhancing generalization. A tanh activation function further speeds convergence and optimizes weight updates, while a dropout layer at the module’s end mitigates overfitting to increase the model’s robustness and homogeneity. The tanh activation function is formulated as
    $$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \tag{1}$$
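As a small illustration of the post-convolution steps named above, the sketch below applies layer normalization followed by tanh to one feature vector. It is a simplified stand-in: the learnable gain and bias that layer normalization usually carries are omitted, dropout is left out because it is inactive at inference, and the function names are hypothetical.

```python
import math


def tanh(x):
    """tanh written out from its exponential definition."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def layer_norm(values, eps=1e-5):
    """Normalize one feature vector to zero mean and (nearly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def norm_then_activate(values):
    """Layer normalization followed by tanh, as applied after a convolution."""
    return [tanh(v) for v in layer_norm(values)]
```

Because tanh is bounded, every output of `norm_then_activate` lies strictly between -1 and 1 regardless of the scale of the convolution output.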

    The TCN offers enhanced feature extraction by stacking multiple convolutional layers, which exploit filters of varying sizes to capture temporal features across different scales. This multiscale approach allows the TCN to be particularly sensitive to local dependencies in sequential data. Therefore, the TCN can improve nonlinear compensation performance, especially for ultrahigh-order signals, where it is essential to capture intricate timescale dependencies. In the TCN as shown in Fig. 1, dilated convolutions are first utilized to increase the receptive field. Suppose the one-dimensional input sequence is $x \in \mathbb{R}^{n}$ and the filter is $f: \{0, \ldots, k-1\} \to \mathbb{R}$. The dilated convolution operation for a sequence element $s$ is defined as
    $$F(s) = (x *_{d} f)(s) = \sum_{i=0}^{k-1} f(i) \cdot x_{s - d \cdot i}. \tag{2}$$
    Here, $d$ is the dilation factor, $k$ is the filter size, and $s - d \cdot i$ represents the past direction. Dilation corresponds to the introduction of a fixed step between filter taps. When $d = 1$, it reduces to a regular convolution. Larger dilations expand the receptive field without significant computational cost. By increasing the filter size $k$ or the dilation $d$, the TCN’s receptive field grows. For example, with a convolution kernel of 3 and an exponentially growing dilation factor $d = O(2^{i})$, the TCN covers all input values. Successive layers are connected through residual blocks, enhancing long-range dependency modeling. Thus, the TCN’s dilated convolutions and residual blocks effectively handle long-range dependencies, making it ideal for tasks like DSP in coherent optical transmission. An illustration of a dilated convolution with a kernel size of 3 and a dilation factor n is shown in Fig. 2.
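The dilated convolution defined above can be written directly as a double loop. The following is a minimal sketch rather than the authors' implementation; it assumes causal zero-padding, i.e., samples before the start of the sequence are treated as zero, and the function name is hypothetical.

```python
def dilated_causal_conv(x, f, d=1):
    """Compute F(s) = sum_{i=0}^{k-1} f[i] * x[s - d*i] for every position s.

    Indices before the start of the sequence contribute zero (causal padding),
    so the output has the same length as the input.
    """
    k = len(f)
    out = []
    for s in range(len(x)):
        acc = 0.0
        for i in range(k):
            t = s - d * i  # tap i looks d*i samples into the past
            if t >= 0:
                acc += f[i] * x[t]
        out.append(acc)
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
f = [1.0, 1.0, 1.0]
print(dilated_causal_conv(x, f, d=1))  # [1.0, 3.0, 6.0, 9.0, 12.0]
print(dilated_causal_conv(x, f, d=2))  # [1.0, 2.0, 4.0, 6.0, 9.0]
```

With `d=1` each output sums three consecutive samples; with `d=2` the same three taps span five samples, which is how dilation widens the receptive field without adding filter weights.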

    Figure 2. Schematic of TCN.

    The receptive field of a TCN is determined by the network depth n, filter size k, and dilation factor d[12]. To ensure stability in deeper TCNs and optimize feature extraction, residual layers are employed, consisting of two causal convolutional layers followed by nonlinear activation layers. Layer normalization is applied after each convolution to maintain stability, while dropout layers help mitigate overfitting. To match input and output dimensions, a 1×1 convolution is used for proper tensor alignment before element-wise addition. Each convolutional layer in the residual blocks incorporates weight normalization to prevent gradient explosion. To enhance generalization and reduce overfitting, L2 regularization is included in the TCN, constraining the L2 norm of the parameters[13]. In this work, each residual block comprises two convolutional layers with a stride of one. The residual network architecture incorporates shortcut connections, through which certain layers can be bypassed to pass the original data directly to the next layer. These additional connections do not increase the model’s complexity.
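To make the dependence of the receptive field on depth, filter size, and dilation concrete, the sketch below counts how many input samples one output sample can see. It assumes stride-1 convolutions, two convolutional layers per residual block, and a dilation that doubles from block to block; the doubling and both function names are illustrative assumptions, not values reported in the Letter.

```python
def receptive_field(kernel_size, dilations, convs_per_block=2):
    """Receptive field of stacked stride-1 dilated convolutions.

    Each convolution with dilation d extends the field by (kernel_size - 1) * d.
    `dilations` lists one dilation factor per residual block.
    """
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


def blocks_needed(kernel_size, target, convs_per_block=2):
    """Smallest number of blocks (dilations 1, 2, 4, ...) covering `target` samples."""
    n = 1
    while receptive_field(kernel_size, [2 ** j for j in range(n)], convs_per_block) < target:
        n += 1
    return n


print(receptive_field(3, [1]))           # 5
print(receptive_field(3, [1, 2, 4, 8]))  # 61
print(blocks_needed(3, 100))             # 5
```

Under these assumptions the field grows roughly as 2^N with the number of blocks N, whereas the parameter count grows only linearly in N.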

    By exploiting TCN layers for feature extraction, Bi-LSTM networks can more effectively handle complex ultrahigh-order QAM signals with a simple structure. The proposed approach not only boosts performance but also offers an efficient and scalable solution for nonlinear equalization. The Bi-LSTM network retains the strengths of LSTM by effectively capturing temporal dependencies and integrating gate mechanisms to manage information flow, addressing the vanishing gradient problem. Its bidirectional nature enhances information integration from past and future symbols, mitigating intersymbol interference (ISI) and providing a richer temporal context[14]. The data flow of the Bi-LSTM layer is shown in Fig. 1. The hidden layer of the Bi-LSTM network is essentially two independent LSTM layers, and the input sequences are fed into the two LSTM layers in forward and reverse order, respectively. The feature vectors $\overrightarrow{h} = \{\overrightarrow{h}_0, \overrightarrow{h}_1, \ldots, \overrightarrow{h}_n\}$ and $\overleftarrow{h} = \{\overleftarrow{h}_0, \overleftarrow{h}_1, \ldots, \overleftarrow{h}_n\}$ extracted by both layers are combined to obtain the output vector.
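The forward/reverse data flow of a Bi-LSTM layer can be sketched with a small pure-Python LSTM cell. This is a didactic sketch, not the trained network: the weights are random, the layer sizes are arbitrary, combining the two directions by concatenation is an assumption (the Letter only says they are "combined"), and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, W, U, b):
    """One LSTM time step; W, U, b are dicts keyed by gate name i, f, o, g."""
    def affine(gate):
        return [
            sum(w * xi for w, xi in zip(W[gate][j], x))
            + sum(u * hi for u, hi in zip(U[gate][j], h))
            + b[gate][j]
            for j in range(len(h))
        ]
    i_gate = [sigmoid(z) for z in affine("i")]   # input gate
    f_gate = [sigmoid(z) for z in affine("f")]   # forget gate
    o_gate = [sigmoid(z) for z in affine("o")]   # output gate
    cand = [math.tanh(z) for z in affine("g")]   # candidate cell state
    c_new = [f * cj + i * g for f, cj, i, g in zip(f_gate, c, i_gate, cand)]
    h_new = [o * math.tanh(cj) for o, cj in zip(o_gate, c_new)]
    return h_new, c_new


def run_lstm(xs, params, n_h):
    """Run one LSTM over a sequence and return the hidden state at every step."""
    h, c = [0.0] * n_h, [0.0] * n_h
    hs = []
    for x in xs:
        h, c = lstm_step(x, h, c, *params)
        hs.append(h)
    return hs


def bi_lstm(xs, fwd_params, bwd_params, n_h):
    """Two independent LSTMs, one per direction, concatenated per time step."""
    h_fwd = run_lstm(xs, fwd_params, n_h)
    h_bwd = run_lstm(xs[::-1], bwd_params, n_h)[::-1]  # re-align to time order
    return [hf + hb for hf, hb in zip(h_fwd, h_bwd)]


def random_params(n_in, n_h, rng):
    """Random weights for illustration only."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    W = {g: mat(n_h, n_in) for g in "ifog"}
    U = {g: mat(n_h, n_h) for g in "ifog"}
    b = {g: [0.0] * n_h for g in "ifog"}
    return W, U, b
```

In this sketch the first half of each output vector depends only on the current and earlier symbols, while the second half depends on the current and later ones, which is what gives the layer access to both sides of a symbol when mitigating ISI.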

    In addition to performance, it is essential to comprehensively evaluate the complexity of the proposed method. Based on the characteristics of the TCN architecture, the complexity of the structure is outlined as
    $$C_{\mathrm{TCN}} = n_i n_f n_k (n_s - n_k + 1) + n_f n_f' n_k' (n_s - n_k' + 1) + \sum_{j=2}^{N} \left\{ n_f' n_f n_k \left[ n_s - 2^{j-1}(n_k - 1) \right] + n_f n_f' n_k' \left[ n_s - 2^{j-1}(n_k' - 1) \right] \right\}, \tag{3}$$
    where $n_i$ represents the number of features of the original data, $n_f$ represents the number of filters, and $n_k$ denotes the kernel size. For a TCN, each filter has a size of $(n_s - n_k + 1)$. The complexity of a TCN can be described by multiplying the number of features, the number of filters, the kernel size, and the output length $L_{\mathrm{out}}$ after the convolutional layer. We use causal padding, zero-padding at the front of the data, a dilation factor of 1, and a stride of 1. In Eq. (3), the number of features in the first convolutional layer corresponds to the number of features in the original data. In subsequent layers, the number of features is updated to the number of filters set in the previous convolutional layer. Each residual block contains two convolutional structures, each with different numbers of filters and kernel sizes. To distinguish between them, let $n_k$ and $n_f$ denote the parameters of the first layer, and $n_k'$ and $n_f'$ represent the parameters of the second layer. $N$ represents the number of residual blocks. Therefore, the complexity of the proposed TFM NN equalizer can be written as
    $$C_{\mathrm{TFM}} = n_i n_f n_k (n_s - n_k + 1) + n_f n_f' n_k' (n_s - n_k' + 1) + \sum_{j=2}^{N} \left\{ n_f' n_f n_k \left[ n_s - 2^{j-1}(n_k - 1) \right] + n_f n_f' n_k' \left[ n_s - 2^{j-1}(n_k' - 1) \right] \right\} + \left[ n_s - 2^{N-1}(n_k' - 1) \right] \cdot 2 n_h (4 n_f' + 4 n_h + 3 + n_o). \tag{4}$$
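The counting rule stated above (number of features × number of filters × kernel size × output length, accumulated over both convolutions of every residual block, plus a Bi-LSTM term) can be turned into a short calculator. This is a sketch under assumptions: the output length of block j is taken as n_s − 2^(j−1)·(kernel − 1), the Bi-LSTM is taken to run over the output of the last convolution with the last filter count as its input width, and n_h and n_o are read as the number of hidden units and outputs. The function names and the numbers in the example are hypothetical and are not meant to reproduce the figure reported in the Letter.

```python
def tcn_multiplications(n_i, n_s, n_f, n_k, n_f2, n_k2, n_blocks):
    """Multiplications in a TCN with `n_blocks` residual blocks of two convolutions.

    (n_f, n_k) describe the first convolution of each block and (n_f2, n_k2)
    the second; n_i is the input feature count and n_s the sequence length.
    """
    total = 0
    in_features = n_i
    for j in range(1, n_blocks + 1):
        dilation = 2 ** (j - 1)
        for filters, kernel in ((n_f, n_k), (n_f2, n_k2)):
            l_out = n_s - dilation * (kernel - 1)  # assumed output length
            total += in_features * filters * kernel * l_out
            in_features = filters  # next layer sees this layer's filters
    return total


def bilstm_multiplications(seq_len, n_in, n_h, n_o):
    """Bi-LSTM term: two directions, four gates each, over `seq_len` steps."""
    return seq_len * 2 * n_h * (4 * n_in + 4 * n_h + 3 + n_o)


def tfm_multiplications(n_i, n_s, n_f, n_k, n_f2, n_k2, n_blocks, n_h, n_o):
    """TCN term plus Bi-LSTM term evaluated on the last convolution's output."""
    seq_len = n_s - 2 ** (n_blocks - 1) * (n_k2 - 1)
    return (tcn_multiplications(n_i, n_s, n_f, n_k, n_f2, n_k2, n_blocks)
            + bilstm_multiplications(seq_len, n_f2, n_h, n_o))


# Hypothetical sizes chosen only to exercise the functions.
print(tcn_multiplications(4, 10, 2, 3, 2, 3, 2))        # 432
print(tfm_multiplications(4, 10, 2, 3, 2, 3, 2, 3, 4))  # 1404
```

A calculator of this kind makes it easy to see which term dominates: for short sequences and small filter counts, the Bi-LSTM term quickly outweighs the convolutional part.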

    In general, the number of floating-point operations per second (FLOPS) serves as a standard measure of hardware complexity. Based on the parameterization outlined in this Letter, the complexity of the proposed algorithm is calculated to be 4571 FLOPS. The complexity of FLOPS holds true for all NN approaches because all the computational effort for floating-point data is computed at the hardware level[15].

    3. Experimental Setup and Result

    Figure 3 depicts the experimental setup for probabilistic-shaping polarization-division multiplexed (PS-PDM) 1024/4096-QAM signals over 80 km of standard single-mode fiber (SSMF) transmission. Owing to the limitations of the available hardware, the proposed algorithm is experimentally verified with 4-GBd 1024/4096-QAM signals. However, it is notable that the experimental architecture and the proposed TFM NN are independent of the signal rate. At the transmitter, a pseudo-random binary sequence is used by the PS encoder to generate the amplitude sequence through bit-to-symbol mapping. Pilot symbols are added for the receiver DSP module. The data undergo resampling and pulse shaping using a root-raised cosine (RRC) finite impulse response (FIR) filter with a roll-off factor of 0.05. The processed data are fed into an 8-GSa/s, 14-bit arbitrary waveform generator (AWG) for digital-to-analog conversion (DAC). The four-channel electrical signals drive a Mach–Zehnder modulator (MZM) to create the optical 1024-QAM signal. After modulation, the signal is amplified by an erbium-doped fiber amplifier (EDFA). A variable optical attenuator (VOA) varies the input power to the 80-km SSMF, where approximately –4 dBm is optimum for the 4096-QAM signal. A second VOA adjusts the OSNR for bit error rate (BER) measurement, while the third VOA ensures a fixed optical power to the coherent receiver.

    Figure 3. Experimental setup of PS-PDM 1024/4096-QAM coherent optical transmission over the 80-km SSMF.

    At the receiver, the signal first passes through an optical bandpass filter (OBPF) to suppress amplified spontaneous emission (ASE) noise from the EDFAs. The signal is then sent to a coherent receiver along with the local oscillator (LO). The experiments utilize lasers with a linewidth of less than 1 kHz and a wavelength of 1550.112 nm. The coherent receiver comprises two 90-deg optical hybrids and four balanced photodetectors with a bandwidth of 33 GHz. Analog-to-digital conversion is performed by a 50-GSa/s real-time oscilloscope with a bandwidth of 23 GHz, and the acquired data are processed by an offline DSP module comprising data preprocessing, the TFM NN equalizer, and symbol decision for signal recovery. The data preprocessing at the receiver includes conventional DSP algorithms such as dispersion compensation, clock recovery, polarization demultiplexing, and carrier recovery.

    In this Letter, we assess the performance of PS 1024/4096-QAM signals using ideal rate-adaptive forward error correction (FEC) coding[16,17]. For probabilistically shaped signals with varying entropy values, the NGMI serves as a useful metric for channel measurement. Regardless of the modulation format, a specific soft-decision forward error correction (SD-FEC) scheme can be evaluated. NGMI is defined as[18]
    $$\mathrm{NGMI} = 1 - \frac{H - \mathrm{GMI}}{4m}, \tag{5}$$
    where $m$ is 5 for 1024-QAM and 6 for 4096-QAM, and $H$ represents the information entropy of the signal. The 20% and 25% low-density parity-check (LDPC) overheads correspond to NGMI thresholds of 0.881 and 0.778, respectively. Generally, error-free transmission can be achieved for QAM signals, provided that the NGMI exceeds these thresholds.
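As a worked illustration of the NGMI definition, the sketch below reads it as NGMI = 1 − (H − GMI)/(4m), with H and GMI in bits per dual-polarization symbol, and compares the result against the two LDPC thresholds quoted above. The GMI values in the example are made up for illustration and are not measurements from the Letter; the function and dictionary names are hypothetical.

```python
# NGMI thresholds quoted in the text, keyed by LDPC overhead in percent.
NGMI_THRESHOLDS = {20: 0.881, 25: 0.778}


def ngmi(entropy, gmi, m):
    """NGMI = 1 - (H - GMI) / (4 m), with m = 5 (1024-QAM) or 6 (4096-QAM)."""
    return 1.0 - (entropy - gmi) / (4 * m)


def meets_threshold(entropy, gmi, m, overhead_percent):
    """True if the NGMI reaches the threshold for the given LDPC overhead."""
    return ngmi(entropy, gmi, m) >= NGMI_THRESHOLDS[overhead_percent]


# Hypothetical GMI values, for illustration only.
print(meets_threshold(17.0, 15.0, 5, 20))  # True
print(meets_threshold(17.0, 14.0, 5, 20))  # False
```

In the first example the gap H − GMI is 2 bits out of 4m = 20, giving an NGMI of about 0.9, which clears the 0.881 threshold; a 3-bit gap gives about 0.85 and does not.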

    Figures 4(a) and 4(c) illustrate the measured NGMI performance with/without the TFM NN equalizer in back-to-back (BTB) transmission experiments. As the entropy increases, the NGMI performance rapidly degrades, as shown by the red curves in Figs. 4(a) and 4(c). With the proposed TFM NN equalizer, the NGMI performance improves and reaches the desired NGMI threshold, as shown by the blue curves in Figs. 4(a) and 4(c). The constellation diagrams with/without the TFM NN equalizer for 1024-QAM/4096-QAM are shown as insets in Figs. 4(a) and 4(c); the fuzzy and indistinct constellation diagrams become convergent and distinct with the TFM NN equalizer. The entropy values of 17 bit/symbol and 21.4 bit/symbol, used for 1024-QAM and 4096-QAM, respectively, are highlighted for further fiber transmission experiments.

    Figure 4. Measured NGMI performance. (a) NGMI versus H for PS-1024-QAM-BTB; (b) NGMI versus OSNR for PS-1024-QAM-80 km; (c) NGMI versus H for PS-4096-QAM-BTB; (d) NGMI versus OSNR for PS-4096-QAM-80 km.

    Figures 4(b) and 4(d) present the measured NGMI performance with the conventional VNLE and the proposed TFM NN in the 80-km SSMF transmission experiments. Similar performance improvements have been observed in Figs. 4(b) and 4(d). In Fig. 4(b), it is evident that at a raw spectral efficiency (SE) of 16.190 bit/s/Hz, the conventional VNLE structure struggles to reach the LDPC threshold, even at an OSNR as high as 36 dB. Specifically, the VNLE algorithm stabilizes around an NGMI convergence value of approximately 0.8. In contrast, the TFM NN equalizer demonstrates a marked improvement, converging closer to an NGMI value of 0.9. The above trend is similarly observed in the 4096-QAM signals. Finally, for the 1024-QAM signal, at the NGMI threshold with 20% LDPC, the raw SE of 16.190 bit/s/Hz has been achieved. For the 4096-QAM signal, at the NGMI threshold with 25% LDPC, the raw SE of 21.188 bit/s/Hz is achieved.

    4. Conclusion

    In conclusion, we propose a temporal feature-based memory NN for PS-PDM ultrahigh-order QAM coherent optical transmission and experimentally validate it for PS-PDM 1024/4096-QAM signals over 80-km SSMF transmission. The proposed TFM NN equalizer exploits the temporal convolutional network as a feature extraction layer to significantly improve the performance of the bidirectional LSTM network, which enables the transmission of PS-PDM 1024/4096-QAM signals over 80-km SSMF and achieves raw spectral efficiencies of 16.190 and 21.188 bit/s/Hz, respectively. Compared to conventional NNs with additional linear equalizers, the proposed TFM NN equalizer utilizes a simple structure to effectively compensate for nonlinear distortions of ultrahigh-order QAM signals, outperforming conventional Volterra equalizers. The proposed approach offers a cost-effective solution for generating and transmitting ultrahigh-order signals in future commercial high-capacity coherent optical transmission systems.

    References

    [1] M. Terayama, S. Okamoto, K. Kasai et al. 4096 QAM (72 Gbit/s) single-carrier coherent optical transmission with a potential SE of 15.8 bit/s/Hz in all-Raman amplified 160 km fiber link. 2018 Optical Fiber Communications Conference and Exposition (OFC), 1(2018).

    [2] S. L. I. Olsson, J. Cho, S. Chandrasekhar et al. Probabilistically shaped PDM 4096-QAM transmission over up to 200 km of fiber using standard intradyne detection. Opt. Express, 26, 4522(2018).

    [3] E. Ip. Nonlinear compensation using backpropagation for polarization-multiplexed transmission. J. Lightwave Technol., 28, 939(2010).

    [4] L. Liu, L. Li, Y. Huang et al. Intrachannel nonlinearity compensation by inverse Volterra series transfer function. J. Lightwave Technol., 30, 310(2012).

    [5] J. S. Tavares, L. M. Pessoa, H. M. Salgado. Nonlinear compensation assessment in few-mode fibers via phase-conjugated twin waves. J. Lightwave Technol., 35, 4072(2017).

    [6] A. S. Kashi, Q. Zhuge, J. C. Cartledge et al. Nonlinear signal-to-noise ratio estimation in coherent optical fiber transmission systems using artificial neural networks. J. Lightwave Technol., 36, 5424(2018).

    [7] P. J. Freire, D. Abode, J. E. Prilepsky et al. Transfer learning for neural networks-based equalizers in coherent optical systems. J. Lightwave Technol., 39, 6733(2021).

    [8] P. J. Freire, Y. Osadchuk, B. Spinnler et al. Performance versus complexity study of neural network equalizers in coherent optical systems. J. Lightwave Technol., 39, 6085(2021).

    [9] X. Liu, Y. Wang, X. Wang et al. Bi-directional gated recurrent unit neural network based nonlinear equalizer for coherent optical communication system. Opt. Express, 29, 5923(2021).

    [10] H. Ming, X. Chen, X. Fang et al. Ultralow complexity long short-term memory network for fiber nonlinearity mitigation in coherent optical communication systems. J. Lightwave Technol., 40, 2427(2022).

    [11] S. Deligiannidis, A. Bogris, C. Mesaritakis et al. Compensation of fiber nonlinearities in digital coherent systems leveraging long short-term memory neural networks. J. Lightwave Technol., 38, 5991(2020).

    [12] S. Bai, J. Kolter, V. Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling(2018).

    [13] C. Cortes, M. Mohri, A. Rostamizadeh. L2 regularization for learning kernels(2012).

    [14] X. Dai, X. Li, M. Luo et al. LSTM networks enabled nonlinear equalization in 50-Gb/s PAM-4 transmission links. Appl. Opt., 58, 6079(2019).

    [15] B. Sang, W. Zhou, Y. Tan et al. Low complexity neural network equalization based on multi-symbol output technique for 200+ Gbps IM/DD short reach optical system. J. Lightwave Technol., 40, 2890(2022).

    [16] E. E. Ebrahim, B. B. Yousif. Performance evaluation and enhancement of the modified OOK based IM/DD techniques for hybrid fiber/FSO communication over WDM-PON systems. Opt. Quantum Electron., 52, 385(2020).

    [17] E. E. Ebrahim, B. B. Yousif. Performance enhancement of M-ary pulse-position modulation for a wavelength division multiplexing free-space optical systems impaired by interchannel crosstalk, pointing error, and ASE noise. Opt. Commun., 475, 126219(2020).

    [18] X. Shi, M. Gao, X. Huang et al. Optimized pilot structure for PS-PDM ultrahigh-order QAM coherent optical transmission. Opt. Lett., 49, 1579(2024).
