Fig. 1. CSPTNet network structure
Fig. 2. Comparison of bottleneck structures
Fig. 3. Multi-head self attention structure
Fig. 4. Fitting result of the ground-truth box and the predicted box
Fig. 5. Schematic of the CIoU loss
Fig. 6. NFOD dataset images and annotations
Fig. 7. Target instance scale distribution
Fig. 8. Visualization results of mean average precision
Fig. 9. Test result visualization
Fig. 10. Visualization of feature maps
Parameter | Value |
---|---|
Sensor specification | Advanced CMOS image sensor, 1/2.7 inch |
Pixel size | 3 μm × 3 μm |
Minimum illumination | 0.051 lx |
Frame rate | 30 frame/s |
Output resolution | 1280 × 720 |

Table 1. Parameters of LRCP20680_1080P camera
Detection layer | Before clustering | After clustering |
---|---|---|
20×20 | (10, 13), (16, 30), (33, 23) | (6, 8), (10, 15), (12, 24) |
40×40 | (30, 61), (62, 45), (59, 119) | (16, 18), (22, 27), (33, 16) |
80×80 | (116, 90), (156, 198), (373, 326) | (37, 77), (42, 35), (66, 68) |

Table 2. Initial anchor box sizes of the detection layers
Model | GIoU | K-means | CIoU | Transformer BottleNeck | Weight/MB | Speed/(frame·s⁻¹) | mAP/% |
---|---|---|---|---|---|---|---|
YOLOv5+GIoU | √ | | | | 14.4 | 41.8 | 82.9 |
YOLOv5+K-means+GIoU | √ | √ | | | 14.4 | 43.2 | 83.6 |
YOLOv5+K-means+CIoU | | √ | √ | | 14.4 | 42.5 | 84.3 |
YOLOv5+K-means+CIoU+Transformer BottleNeck | | √ | √ | √ | 14.4 | 38.0 | 88.1 |

Table 3. Results of the ablation experiments
Model | Speed/(frame·s⁻¹) | Weight/MB | mAP (%) | Plier (%) | Screwdriver (%) | Strapping_tape (%) | Nail (%) | Sheetmetal (%) | Spanner (%) | Branch (%) | Nut (%) | Block_rubber (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CSPTNet-1H | 41.5 | 14.4 | 87.2 | 91.8 | 80.7 | 83.8 | 81.6 | 90.8 | 92.0 | 75.6 | 89.3 | 87.2 |
CSPTNet-2H | 39.4 | 14.4 | 87.2 | 81.7 | 91.0 | 85.7 | 78.5 | 87.2 | 90.0 | 74.4 | 98.0 | 98.1 |
CSPTNet-4H | 38.0 | 14.4 | 88.1 | 86.9 | 93.3 | 77.7 | 74.0 | 95.6 | 94.4 | 83.5 | 96.1 | 91.5 |
CSPTNet-8H | 28.5 | 14.4 | 87.1 | 82.5 | 86.0 | 92.3 | 83.6 | 94.0 | 90.1 | 66.2 | 89.9 | 87.1 |
CSPTNet-16H | 20.6 | 14.4 | 84.1 | 79.6 | 86.1 | 81.2 | 80.4 | 91.1 | 94.5 | 76.6 | 82.5 | 84.7 |

Table 4. Effect of the number of subspaces (heads) in the self-attention branch
Model | Speed/(frame·s⁻¹) | Weight/MB | mAP (%) | Plier (%) | Screwdriver (%) | Strapping_tape (%) | Nail (%) | Sheetmetal (%) | Spanner (%) | Branch (%) | Nut (%) | Block_rubber (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SE | 40.6 | 15 | 76.5 | 77.1 | 62.8 | 70.5 | 77.5 | 90.8 | 85.4 | 72.2 | 57.5 | 94.5 |
CoordAtt | 39.0 | 14.5 | 77.6 | 70.3 | 74.5 | 81.6 | 79.4 | 89.7 | 78.6 | 56.6 | 76.2 | 90.6 |
CBAM | 42.0 | 14.5 | 79.9 | 67.1 | 82.4 | 80.4 | 77.4 | 92.5 | 86.3 | 65.8 | 75.5 | 91.3 |
ChannelAtt | 45.0 | 14.5 | 80.7 | 66.7 | 75.7 | 81.4 | 84.1 | 88.5 | 90.0 | 71.7 | 78.9 | 89.0 |
ECA | 42.7 | 14.8 | 83.7 | 69.1 | 85.7 | 83.7 | 81.4 | 89.8 | 87.8 | 72.6 | 84.5 | 98.6 |
SAM | 41.8 | 14.5 | 85.4 | 85.9 | 86.1 | 82.5 | 82.9 | 85.8 | 95.1 | 76.4 | 82.4 | 91.5 |
MHSA | 38.0 | 14.4 | 88.1 | 86.9 | 93.3 | 77.7 | 74.0 | 95.6 | 94.4 | 83.5 | 96.1 | 91.5 |

Table 5. Comparison of attention mechanisms
Model | Speed/(frame·s⁻¹) | Weight/MB | mAP (%) | Plier (%) | Screwdriver (%) | Strapping_tape (%) | Nail (%) | Sheetmetal (%) | Spanner (%) | Branch (%) | Nut (%) | Block_rubber (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
YOLOv5-ST | 41.2 | 14.7 | 82.4 | 66.4 | 82.3 | 77.0 | 76.1 | 87.4 | 90.7 | 82.9 | 86.6 | 92.0 |
YOLOv5-Ghost | 47.5 | 13.2 | 83.5 | 75.3 | 85.8 | 82.5 | 80.5 | 90.4 | 91.1 | 83.7 | 72.7 | 89.8 |
YOLOv5-CSP | 43.9 | 14.6 | 85.0 | 90.3 | 86.7 | 87.0 | 78.1 | 88.9 | 89.1 | 70.3 | 85.4 | 88.9 |
CSPTNet | 38.0 | 14.4 | 88.1 | 86.9 | 93.3 | 77.7 | 74.0 | 95.6 | 94.4 | 83.5 | 96.1 | 91.5 |

Table 6. Comparison of bottleneck modules
Model | Speed/(frame·s⁻¹) | Weight/MB | mAP (%) | Plier (%) | Screwdriver (%) | Strapping_tape (%) | Nail (%) | Sheetmetal (%) | Spanner (%) | Branch (%) | Nut (%) | Block_rubber (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
YOLOv3-tiny | 49.7 | 17.4 | 30.3 | 40.5 | 9.0 | 26.4 | 22.2 | 59.3 | 42.4 | 0 | 13.5 | 59.1 |
VarifocalNet | 14.9 | 261.4 | 52.8 | 70.7 | 56.7 | 69.5 | 2.8 | 42.8 | 78.2 | 75.3 | 1.4 | 77.4 |
Faster R-CNN | 19.9 | 330.6 | 65.6 | 88.9 | 72.0 | 87.3 | 20.7 | 53.7 | 85.4 | 80.6 | 21.7 | 80.5 |
Sparse R-CNN | 17.2 | 1300 | 73.0 | 85.3 | 65.7 | 93.8 | 47.1 | 72.8 | 79.1 | 79.3 | 59.2 | 74.5 |
TOOD | 16.6 | 255.8 | 75.1 | 84.0 | 81.5 | 90.0 | 49.8 | 62.3 | 90.9 | 80.9 | 60.0 | 81.8 |
YOLOX | 14.8 | 71.9 | 78.69 | 92.6 | 81.5 | 97.5 | 56.1 | 87.8 | 98.0 | 83.3 | 23.8 | 87.6 |
YOLOv3 | 39.5 | 19.4 | 82.9 | 59.9 | 81.3 | 88.6 | 71.1 | 94.5 | 87.7 | 75.3 | 96.5 | 91.5 |
YOLOv5 | 41.8 | 14.4 | 82.9 | 77.8 | 72.4 | 82.5 | 76.6 | 88.5 | 89.8 | 82.3 | 76.2 | 99.5 |
Ours | 38.0 | 14.4 | 88.1 | 86.9 | 93.3 | 77.7 | 74.0 | 95.6 | 94.4 | 83.5 | 96.1 | 91.5 |

Table 7. Comparison of detection models
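
As a reading aid for the per-class tables, the mAP column appears to be the unweighted mean of the nine per-class AP columns. This is an assumption inferred from the numbers, not something the captions state. The minimal sketch below checks it on the "Ours" row of Table 7; the names `OURS_AP` and `mean_average_precision` are illustrative and do not come from the source.

```python
# Per-class AP values (%) copied from the "Ours" row of Table 7.
OURS_AP = {
    "Plier": 86.9,
    "Screwdriver": 93.3,
    "Strapping_tape": 77.7,
    "Nail": 74.0,
    "Sheetmetal": 95.6,
    "Spanner": 94.4,
    "Branch": 83.5,
    "Nut": 96.1,
    "Block_rubber": 91.5,
}


def mean_average_precision(per_class_ap):
    """Return the unweighted mean of per-class AP values (assumed mAP definition)."""
    if not per_class_ap:
        raise ValueError("per_class_ap must contain at least one class")
    return sum(per_class_ap.values()) / len(per_class_ap)


if __name__ == "__main__":
    # Compare against the 88.1 reported in the mAP column for this row.
    print(f"mAP = {mean_average_precision(OURS_AP):.1f} %")
```

Because the per-class values in the tables are already rounded to one decimal place, other rows may differ from the reported mAP by about 0.1 when recomputed this way.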