Three-Dimensional Pedestrian Detection by Fusing Image Semantics and Point Cloud Spatial Visibility Features

Lu Xiong; Zhenwen Deng; Wei Tian; Zhiang Wang

doi:10.3788/LOP220712

Laser & Optoelectronics Progress
Vol. 60, Issue 2, 0228011 (2023)

Lu Xiong, Zhenwen Deng, Wei Tian^*, and Zhiang Wang

Author Affiliations

School of Automotive Studies, Tongji University, Shanghai 201804, China

Lu Xiong, Zhenwen Deng, Wei Tian, Zhiang Wang. Three-Dimensional Pedestrian Detection by Fusing Image Semantics and Point Cloud Spatial Visibility Features[J]. Laser & Optoelectronics Progress, 2023, 60(2): 0228011

Fig. 1. Schematic of ray traversing grid

Fig. 2. Logic diagram of 2D Raycasting algorithm

Fig. 3. Overall framework of fusion network

Fig. 4. Semantic segmentation and point feature enhancement

Fig. 5. Geometric feature and semantic feature encoding

Fig. 6. Spatial visibility feature encoding

Fig. 7. Feature fusion and detection heads

Fig. 8. Visibility feature visualization. (a) BEV of point cloud; (b) single layer feature

Fig. 9. Comparison of pedestrian detection results (example 1). (a), (c) Benchmark results; (b), (d) results obtained by the proposed method

Fig. 10. Comparison of pedestrian detection results (example 2). (a), (c) Benchmark results; (b), (d) results obtained by the proposed method

Visibility code [U, O, F]	3D pedestrian AP /% (Easy)	3D pedestrian AP /% (Moderate)	3D pedestrian AP /% (Hard)	mAP /%
[0, 1, -1]	68.76	62.49	57.74	63.00
[0.5, 0.7, 0.4]	70.84	64.57	59.77	65.06

Table 1. Performance comparison of different visibility description methods on KITTI dataset

Number of channels	3D pedestrian AP /% (0.5 m)	3D pedestrian AP /% (1.0 m)	3D pedestrian AP /% (2.0 m)	3D pedestrian AP /% (4.0 m)	mAP /%
1	62.24	64.36	66.36	68.78	65.44
32	69.68	71.87	73.73	75.97	72.81

Table 2. Performance comparison of different density along height direction on nuScenes dataset

Number of frames	3D pedestrian AP /% (0.5 m)	3D pedestrian AP /% (1.0 m)	3D pedestrian AP /% (2.0 m)	3D pedestrian AP /% (4.0 m)	mAP /%
1	37.46	38.22	39.29	40.40	38.84
10	68.36	70.63	72.30	74.59	71.47

Table 3. Performance comparison of different number of frames on nuScenes dataset

Method	3D pedestrian AP /% (Easy)	3D pedestrian AP /% (Moderate)	3D pedestrian AP /% (Hard)	mAP /%
PP (Baseline)	70.16	63.40	57.49	63.68
PP+Vis.	71.86	64.49	58.72	65.02
PP+Img.	72.08	65.56	60.34	65.99
PP+Vis.+Img.	71.84	65.65	60.81	66.10

Table 4. Performance comparison of different optimized methods on KITTI validation set
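
The mAP column in Table 4 is numerically consistent with a plain arithmetic mean of the Easy, Moderate, and Hard AP values. The page does not state this definition, so it is treated here as an assumption. The short Python sketch below recomputes the column from the table values and reports each variant's gain over the baseline; the names `table4_ap` and `mean_ap` are hypothetical and chosen only for illustration.

```python
# AP values (%) copied from Table 4, ordered as (Easy, Moderate, Hard).
table4_ap = {
    "PP (Baseline)": (70.16, 63.40, 57.49),
    "PP+Vis.": (71.86, 64.49, 58.72),
    "PP+Img.": (72.08, 65.56, 60.34),
    "PP+Vis.+Img.": (71.84, 65.65, 60.81),
}


def mean_ap(aps):
    """Arithmetic mean of per-difficulty APs, rounded to two decimals.

    Assumption: mAP is the unweighted mean over difficulty levels.
    """
    return round(sum(aps) / len(aps), 2)


# Recompute the mAP column and the gain of each variant over the baseline.
map_by_method = {name: mean_ap(aps) for name, aps in table4_ap.items()}
baseline_map = map_by_method["PP (Baseline)"]
gain_over_baseline = {
    name: round(value - baseline_map, 2) for name, value in map_by_method.items()
}

for name, value in map_by_method.items():
    print(f"{name}: mAP = {value:.2f} ({gain_over_baseline[name]:+.2f} vs. baseline)")
```

Under this assumption the recomputed values match the tabulated mAP column, and the same check reproduces the mAP columns of Tables 1 to 3 (averaging over the four distance columns for the nuScenes tables).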

Method	3D pedestrian AP /% (Easy)	3D pedestrian AP /% (Moderate)	3D pedestrian AP /% (Hard)	Speed /Hz
VoxelNet [7]	39.48	33.69	31.51	4.4
AVOD [3]	36.10	27.86	25.76	10
SECOND [18]	51.07	42.56	37.29	20
F-PointNet [4]	50.53	42.15	38.08	5.9
PointPainting [6]	50.32	40.97	37.87	2.5
PointRCNN [19]	47.98	39.37	36.01	10
Proposed method	51.07	41.36	37.83	30

Table 5. Performance comparison of different methods on KITTI test set
