• Optics and Precision Engineering
  • Vol. 32, Issue 24, 3603 (2024)
Qiqi KOU1, Weichen WANG2, Chenggong HAN2, Chen LÜ2, Deqiang CHENG2, and Yucheng JI3,*
Author Affiliations
  • 1School of Computer Science and Technology, China University of Mining and Technology, Xuzhou226, China
  • 2School of Information and Control Engineering, China University of Mining and Technology, Xuzhou1116, China
  • 3Department Big Data Center, Ministry of Emergency Management, Beijing10001, China
    DOI: 10.37188/OPE.20243224.3603
    Qiqi KOU, Weichen WANG, Chenggong HAN, Chen LÜ, Deqiang CHENG, Yucheng JI. Multi-frame self-supervised monocular depth estimation with multi-scale feature enhancement[J]. Optics and Precision Engineering, 2024, 32(24): 3603
    References

    [1] X R WU, Q W XUE. 3D vehicle detection for unmanned driving system based on lidar. Opt. Precision Eng., 30, 489-497(2022). (in Chinese)

    [2] X G SHI, Z H XUE, H H LI et al. Review of augmented reality display technology. Chinese Optics, 14, 1146-1161(2021). (in Chinese)

    [3] H B YAN, F Q XU, L E HUANG et al. Review of multi-view stereo reconstruction methods based on deep learning. Opt. Precision Eng., 31, 2444-2464(2023). (in Chinese)

    [4] R GARG, G CARNEIRO et al. Unsupervised CNN for single view depth estimation: geometry to the rescue, 740-756(2016).

    [5] T H ZHOU, M BROWN, N SNAVELY et al. Unsupervised learning of depth and ego-motion from video, 6612-6619(2017).

    [6] C GODARD, O MAC AODHA, G J BROSTOW. Unsupervised monocular depth estimation with left-right consistency, 6602-6611(2017).

    [7] J W BIAN, Z C LI, N WANG et al. Unsupervised scale-consistent depth and ego-motion learning from monocular video, 1-12(2019).

    [8] C GODARD, O MAC AODHA, M FIRMAN et al. Digging into self-supervised monocular depth estimation, 3827-3837(2019).

    [9] Y YAO, Z X LUO, S W LI et al. MVSNet: depth inference for unstructured multi-view stereo, 785-801(2018).

    [10] F WIMBAUER, N YANG, L VON STUMBERG et al. MonoRec: semi-supervised dense reconstruction in dynamic environments from a single moving camera, 6108-6118(2021).

    [11] T SCHÖPS, J L SCHÖNBERGER, S GALLIANI et al. A multi-view stereo benchmark with high-resolution images and multi-camera videos, 2538-2547(2017).

    [12] A KNAPITSCH, J PARK, Q Y ZHOU et al. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Transactions on Graphics, 36, 1-13(2017).

    [13] J WATSON, O MAC AODHA, V PRISACARIU et al. The temporal opportunist: self-supervised multi-frame monocular depth, 1164-1174(2021).

    [14] Z Y FENG, L YANG, L L JING et al. Disentangling object motion and occlusion for unsupervised multi-frame monocular depth, 228-244(2022).

    [15] S W SHAO, Z C PEI, W H CHEN et al. SMUDLP: self-teaching multi-frame unsupervised endoscopic depth estimation with learnable patchmatch. arXiv:2205.15034(2022). http://arxiv.org/abs/2205.15034

    [16] X F WANG, Z ZHU, G HUANG et al. Crafting monocular cues and velocity guidance for self-supervised multi-frame depth learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37, 2689-2697(2023).

    [17] R LI, D GONG, W YIN et al. Learning to fuse monocular and multi-view cues for multi-frame depth estimation in dynamic scenes, 21539-21548(2023).

    [18] M H GUO, C Z LU, Z N LIU et al. Visual attention network. Computational Visual Media, 9, 733-752(2023).

    [19] W Z LIU, H LU, H T FU et al. Learning to upsample by learning to sample, 6004-6014(2023).

    [20] K M HE, X Y ZHANG, S Q REN et al. Deep residual learning for image recognition, 770-778(2016).

    [21] A DOSOVITSKIY, P FISCHER et al. FlowNet: learning optical flow with convolutional networks, 2758-2766(2015).

    [22] W H WANG, E Z XIE, X LI et al. PVT v2: improved baselines with pyramid vision transformer. Computational Visual Media, 8, 415-424(2022).

    [23] J X YAN, H ZHAO, P H BU et al. Channel-wise attention-based network for self-supervised monocular depth estimation, 464-473(2021).

    [24] D EIGEN, R FERGUS. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture, 2650-2658(2015).

    [25] D EIGEN, C PUHRSCH, R FERGUS. Depth map prediction from a single image using a multi-scale deep network(2014).

    [26] A JOHNSTON, G CARNEIRO. Self-supervised monocular trained depth estimation using self-attention and discrete disparity volume, 4755-4764(2020).

    [27] V GUIZILINI, R AMBRUS, S PILLAI et al. 3D packing for self-supervised monocular depth estimation, 2482-2491(2020).

    [28] J XIANG, Y WANG, L F AN et al. Visual attention-based self-supervised absolute depth estimation using geometric priors in autonomous driving. IEEE Robotics and Automation Letters, 7, 11998-12005(2022).

    [29] Z K SURI. Pose constraints for consistent self-supervised monocular depth and ego-motion, 340-353(2023).

    [30] H BOULAHBAL, A VOICILA, A COMPORT. STDepthFormer: predicting spatio-temporal depth from video with a self-supervised transformer model. arXiv:2303.01196(2023). http://arxiv.org/abs/2303.01196

    [31] V PATIL, W VAN GANSBEKE, D X DAI et al. Don’t forget the past: recurrent depth estimation from monocular video. IEEE Robotics and Automation Letters, 5, 6813-6820(2020).

    [32] K SAUNDERS, G VOGIATZIS, L J MANSO. Self-supervised monocular depth estimation: let's talk about the weather, 8873-8883(2023).

    [33] J R WANG, G ZHANG, Z Y WU et al. Self-supervised joint learning framework of depth estimation via implicit cues. arXiv:2006.09876(2020). http://arxiv.org/abs/2006.09876

    [34] C SHU, K YU, Z X DUAN et al. Feature-metric loss for self-supervised learning of depth and egomotion, 572-588(2020).

    [35] H H LI, A GORDON, H ZHAO et al. Unsupervised monocular depth learning in dynamic scenes, 1908-1917(2021).
