• Laser & Optoelectronics Progress
  • Vol. 60, Issue 12, 1210006 (2023)
Qingming Yi, Wenting Zhang, Min Shi, Jialin Shen, and Aiwen Luo*
Author Affiliations
  • College of Information Science and Technology, Jinan University, Guangzhou 510632, Guangdong, China
    DOI: 10.3788/LOP220914
    Qingming Yi, Wenting Zhang, Min Shi, Jialin Shen, Aiwen Luo. Semantic Segmentation for Road Scene Based on Multiscale Feature Fusion[J]. Laser & Optoelectronics Progress, 2023, 60(12): 1210006
    Fig. 1. Overall structure of MIFNet composed of three key modules marked by dashed boxes
    Fig. 2. Comparison of different bottleneck modules. (a) Non-bottleneck-1D; (b) DAB module; (c) SS-nbt module
    Fig. 3. Structure of LEF-B
    Fig. 4. Heat maps after different processing in fusion module. (a) Input image; (b) heat map without Laplace operator; (c) heat map with Laplace operator
    Fig. 5. Semantic segmentation results on Cityscapes dataset
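    | Network | Speed /(frame·s⁻¹), Original | Speed /(frame·s⁻¹), LFE-B | Parameters /M, Original | Parameters /M, LFE-B | mIoU /%, Original | mIoU /%, LFE-B | GFLOPs, Original | GFLOPs, LFE-B |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | DABNet [5] | 106.00 | 104.41 | 0.76 | 0.69 | 69.1 | 70.46 | 11.18 | 9.61 |
    | ERFNet [7] | 58.57 | 41.01 | 2.07 | 0.75 | 70.0 | 71.49 | 26.86 | 9.84 |
    | LEDNet [8] | 58.94 | 72.01 | 0.95 | 0.60 | 70.6 | 70.00 | 11.51 | 7.65 |
    | ESNet [19] | 51.39 | 51.72 | 1.66 | 1.38 | 70.7 | 71.12 | 24.35 | 14.29 |
    | MIFNet (proposed) | | 73.68 | | 0.82 | | 72.50 | | 12.03 |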
    Table 1. Results of different networks with LFE-B
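    | Configuration | Speed /(frame·s⁻¹) | Parameters /M | mIoU /% |
    | --- | --- | --- | --- |
    | None | 74.97 | 0.82 | 71.71 |
    | 3×3 | 71.80 | 0.83 | 71.65 |
    | Prewitt | 71.39 | 0.82 | 71.79 |
    | Sobel | 72.74 | 0.82 | 72.16 |
    | Laplace | 73.68 | 0.82 | 72.50 |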
    Table 2. Results of different configurations of ESF
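The ESF configurations in Table 2 swap in different fixed edge operators. As a rough sketch of what such operators compute (not the paper's implementation), the following applies Laplace, Sobel, and Prewitt kernels to a small step-edge image. The kernel values are the common textbook forms and are an assumption here, as are the helper name `filter3x3` and the toy images.

```python
# Textbook 3x3 edge kernels (assumed; the exact kernels used in ESF are not given here).
LAPLACE = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]


def filter3x3(image, kernel):
    """Cross-correlate a 2D list with a 3x3 kernel (valid region only, no padding)."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 5x5 test images: a vertical step edge and a constant region.
step = [[0, 0, 10, 10, 10] for _ in range(5)]
flat = [[7] * 5 for _ in range(5)]

print(filter3x3(step, LAPLACE)[0])    # [10, -10, 0]
print(filter3x3(step, SOBEL_X)[0])    # [40, 40, 0]
print(filter3x3(step, PREWITT_X)[0])  # [30, 30, 0]
```

On the step image the Laplace kernel responds with opposite signs on the two sides of the edge, while Sobel and Prewitt respond with a single sign. All three give zero on a constant region because their weights sum to zero.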
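    | Decoder | Speed /(frame·s⁻¹) | Parameters /M | mIoU /% |
    | --- | --- | --- | --- |
    | None | 88.20 | 0.77 | 71.10 |
    | ERFD [7] | 52.91 | 1.03 | 72.10 |
    | PAD [5] | 75.23 | 0.77 | 71.59 |
    | APN [8] | 67.29 | 0.78 | 69.51 |
    | MAFD (proposed) | 73.68 | 0.82 | 72.50 |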
    Table 3. Results of different decoders on MIFNet
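    | Network | Pretrain | Speed /(frame·s⁻¹) | Parameters /M | mIoU (test) /% | GFLOPs |
    | --- | --- | --- | --- | --- | --- |
    | ENet [6] | No | 41.70 | 0.36 | 58.3 | 4.35 |
    | ESPNet [22] | No | 146.00 | 0.36 | 60.3 | 3.50 |
    | CGNet [23] | No | 44.70 | 0.50 | 65.6 | 7.00 |
    | ContextNet [10] | No | 176.60 | 0.88 | 65.5 | 1.78 |
    | EDANet [24] | No | 105.50 | 0.68 | 67.3 | 9.00 |
    | ERFNet [7] | No | 58.57 | 2.07 | 68.0 | 26.90 |
    | FastSCNN [9] | No | 198.41 | 1.10 | 62.8 | 1.76 |
    | LEDNet [8] | No | 58.94 | 0.95 | 69.2 | 11.50 |
    | DABNet [5] | No | 106.20 | 0.64 | 71.2 | 10.50 |
    | ESNet [19] | No | 51.39 | 1.66 | 70.7 | 24.40 |
    | LRNNet_C [14] | No | 71.00 | 0.68 | 72.2 | 8.58 |
    | BiSeNetV1_X [20]* | ImageNet | 105.80* | 5.80 | 68.4 | 14.90 |
    | BiSeNetV1_R [20]* | ImageNet | 65.50* | 49.00 | 74.7 | 55.30 |
    | BiSeNetV2 [21] | No | 156.00 | | 72.6 | 21.15 |
    | BiSeNetV2_L [21] | No | 47.30 | | 75.3 | 118.51 |
    | MIFNet (proposed) | No | 73.68 | 0.82 | 73.1 | 12.03 |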
    Table 4. Performance comparison of different network models on Cityscapes test set
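The mIoU columns in these tables report mean intersection over union. A minimal sketch of the usual definition follows: per-class IoU = TP / (TP + FP + FN), averaged over the classes that occur. The function name `mean_iou`, the flat label lists, and the ignore label 255 are illustrative assumptions, not details taken from the paper.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred and target are flat sequences of integer class indices of equal length.
    Pixels whose ground-truth label equals ignore_label are skipped.
    Predictions are assumed to be valid class indices in [0, num_classes).
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # false negative for the true class
            union[p] += 1  # false positive for the predicted class
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical six-pixel example with three classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))  # 0.7222
```

In this toy case the per-class IoUs are 1/2, 2/3, and 1, so the mean is about 0.72. The tables report the same quantity as a percentage.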
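    | Network | Input size /pixel | Speed /(frame·s⁻¹) | Parameters /M | mIoU (test) /% | GFLOPs |
    | --- | --- | --- | --- | --- | --- |
    | ENet [6] | 360×480 | 61.00 | 0.36 | 51.3 | 1.44 |
    | ERFNet [7] | 360×480 | 64.30 | 2.07 | 67.1 | 8.80 |
    | DABNet [5] | 360×480 | 117.00 | 0.64 | 64.6 | 3.20 |
    | LEDNet [8] | 360×480 | 58.94 | 0.95 | 66.6 | 11.50 |
    | EKENet [13] | 360×480 | 38.00 | 1.20 | 67.5 | |
    | ESPNet [22] | 360×480 | 132.00 | 0.36 | 55.6 | 1.10 |
    | EDANet [24] | 360×480 | 163.00 | 0.68 | 66.4 | 2.90 |
    | CGNet [23] | 360×480 | 112.00 | 0.50 | 65.6 | 65.60 |
    | LRNNet_C [14] | 360×480 | 76.50 | 0.68 | 69.2 | |
    | BiSeNetV1_X [20]* | 720×960 | 175.00* | 49.00 | 65.6 | 8.70 |
    | BiSeNetV1_R [20]* | 720×960 | 116.30* | 5.80 | 68.7 | 32.40 |
    | BiSeNetV2 [21] | 720×960 | 124.50 | | 72.4 | 21.15 |
    | BiSeNetV2_L [21] | 720×960 | 32.70 | | 73.2 | 118.51 |
    | MIFNet (proposed) | 720×960 | 55.02 | 0.81 | 71.1 | 15.86 |
    | MIFNet (proposed) | 360×480 | 85.16 | 0.81 | 67.7 | 3.90 |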
    Table 5. Performance comparison of different network models on CamVid test set