• Laser & Optoelectronics Progress
  • Vol. 56, Issue 8, 081501 (2019)
Jianzhong Yuan1, Wujie Zhou1,2,*, Ting Pan1, and Pengli Gu1
Author Affiliations
  • 1 School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou, Zhejiang 310023, China
  • 2 College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, Zhejiang 310027, China
    DOI: 10.3788/LOP56.081501
    Jianzhong Yuan, Wujie Zhou, Ting Pan, Pengli Gu. Road Scene Depth Estimation Based on Deep Convolutional Neural Networks[J]. Laser & Optoelectronics Progress, 2019, 56(8): 081501
    References

    [1] Wang F, Chen C, Huang J X. A review of research on driverless vehicles[J]. China Water Transport, 16, 126-128(2016).

    [2] Silver D, van Hasselt H, Hessel M et al. The Predictron: End-to-end learning and planning[EB/OL]. (2017-07-20)[2018-09-30]. https://arxiv.org/abs/1612.08810.

    [3] Scharstein D, Szeliski R, Zabih R. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. [C]∥Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001), December 9-10, 2001, Kauai, HI, USA. New York: IEEE, 131-140(2001).

    [4] Flynn J, Neulander I, Philbin J. et al. Deep stereo: Learning to predict new views from the world's imagery. [C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 5515-5524(2016).

    [5] Saxena A, Chung S H, Ng A Y. Learning depth from single monocular images[S.l.: s.n.], 18, 1161-1168(2005).

    [6] Saxena A, Sun M, Ng A Y. Make3D: learning 3D scene structure from a single still image[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 824-840(2009). http://dl.acm.org/citation.cfm?id=1525780

    [7] Hoiem D, Efros A A, Hebert M. Recovering surface layout from an image[J]. International Journal of Computer Vision, 75, 151-172(2007). http://link.springer.com/article/10.1007/s11263-006-0031-y

    [8] Ladicky L, Shi J B, Pollefeys M. Pulling things out of perspective. [C]∥2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 89-96(2014).

    [9] Choi S, Min D B, Ham B. et al. Depth analogy: data-driven approach for single image depth estimation using gradient samples[J]. IEEE Transactions on Image Processing, 24, 5953-5966(2015). http://ieeexplore.ieee.org/document/7308054/

    [10] Konrad J, Wang M, Ishwar P. et al. Learning-based, automatic 2D-to-3D image and video conversion[J]. IEEE Transactions on Image Processing, 22, 3485-3496(2013). http://www.ncbi.nlm.nih.gov/pubmed/23799697

    [11] Baig M H, Torresani L. Coupled depth learning. [C]∥2016 IEEE Winter Conference on Applications of Computer Vision (WACV), March 7-10, 2016, Lake Placid, NY, USA. New York: IEEE, 1-10(2016).

    [12] Shi J P, Tao X, Xu L. et al. Break Ames room illusion[J]. ACM Transactions on Graphics, 34, 1-11(2015).

    [13] Ranftl R, Vineet V, Chen Q F. et al. Dense monocular depth estimation in complex dynamic scenes. [C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 4058-4066(2016).

    [14] Furukawa R, Sagawa R, Kawasaki H. Depth estimation using structured light flow: Analysis of projected pattern flow on an Object's surface. [C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE, 4650-4658(2017).

    [15] Häne C, Ladicky L, Pollefeys M. Direction matters: Depth estimation with a surface normal classifier. [C]∥2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA. New York: IEEE, 381-389(2015).

    [16] You X G, Li Q, Tao D C. et al. Local metric learning for exemplar-based object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 24, 1265-1276(2014). http://ieeexplore.ieee.org/document/6739098/

    [17] Zhuo W, Salzmann M, He X M. et al. Indoor scene structure analysis for single image depth estimation. [C]∥2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA. New York: IEEE, 614-622(2015).

    [18] Liu M M, Salzmann M, He X M. Discrete-continuous depth estimation from a single image. [C]∥2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 716-723(2014).

    [19] Karsch K, Liu C, Kang S B. Depth transfer: depth extraction from video using non-parametric sampling[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36, 2144-2158(2014). http://www.ncbi.nlm.nih.gov/pubmed/26353057

    [20] Oliva A, Torralba A. Modeling the shape of the scene: A holistic representation of the spatial envelope[J]. International Journal of Computer Vision, 42, 145-175(2001). http://www.tandfonline.com/servlet/linkout?suffix=cit0015&dbid=16&doi=10.1080%2F0952813X.2015.1020572&key=10.1023%2FA%3A1011139631724

    [21] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2018-09-30]. https://arxiv.org/abs/1409.1556.

    [22] He K M, Zhang X Y, Ren S Q. et al. Deep residual learning for image recognition. [C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 770-778(2016).

    [23] Xu L, Zhao H T, Sun S Y. Monocular infrared image depth estimation based on deep convolutional neural networks[J]. Acta Optica Sinica, 36, 0715002(2016).

    [24] Eigen D, Puhrsch C, Fergus R. Depth map prediction from a single image using a multi-scale deep network[EB/OL]. (2014-06-09)[2018-09-30]. https://arxiv.org/pdf/1406.2283v1.pdf.

    [25] Eigen D, Fergus R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. [C]∥2015 IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, Santiago, Chile. New York: IEEE, 2650-2658(2015).

    [26] Xie J Y, Girshick R, Farhadi A. Deep3D: fully automatic 2D-to-3D video conversion with deep convolutional neural networks[M]. 842-857(2016).

    [27] Li S M, Lei G Q, Fan R. Depth map super-resolution reconstruction based on convolutional neural networks[J]. Acta Optica Sinica, 37, 1210002(2017).

    [28] Wu S C, Zhao H T, Sun S Y. Depth estimation from monocular infrared video based on Bi-recursive convolutional neural network[J]. Acta Optica Sinica, 37, 1215003(2017).

    [29] Xu R, Zhang J G, Huang K Q. Image super-resolution using two-channel convolutional neural networks[J]. Journal of Image and Graphics, 21, 556-564(2016).

    [30] Liu F Y, Shen C H, Lin G S. et al. Learning depth from single monocular images using deep convolutional neural fields[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 2024-2039(2016). http://dl.acm.org/citation.cfm?id=3026801.3026841

    [31] Li B, Shen C H, Dai Y C et al. Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs. [C]∥2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA. New York: IEEE, 1119-1127(2015).

    [32] Wang P, Shen X H, Lin Z. et al. Towards unified depth and semantic prediction from a single image. [C]∥2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA. New York: IEEE, 2800-2809(2015).

    [33] Li B, Dai Y C, He M Y. Monocular depth estimation with hierarchical fusion of dilated CNNs and soft-weighted-sum inference[J]. Pattern Recognition, 83, 328-339(2018). http://www.sciencedirect.com/science/article/pii/S0031320318302097

    [34] Li J, Klein R, Yao A. A two-streamed network for estimating fine-scaled depth maps from single RGB images. [C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE, 3392-3400(2017).

    [35] Lee J H, Heo M, Kim K R. et al. Single-image depth estimation based on Fourier domain analysis. [C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE, 330-339(2018).

    [36] Ummenhofer B, Zhou H Z, Uhrig J. et al. DeMoN: depth and motion network for learning monocular stereo. [C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 5622-5631(2017).

    [37] Fu H, Gong M M, Wang C H. et al. Deep ordinal regression network for monocular depth estimation. [C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE, 2002-2011(2018).

    [38] Jégou S, Drozdzal M, Vazquez D. et al. The one hundred layers tiramisu: fully convolutional DenseNets for semantic segmentation. [C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 1175-1183(2017).

    [39] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 60, 84-90(2017). http://dl.acm.org/citation.cfm?id=2999257

    [40] Kingma D P, Ba J. Adam: A method for stochastic optimization[EB/OL]. (2017-01-30)[2018-09-30]. https://arxiv.org/abs/1412.6980.

    [41] Laina I, Rupprecht C, Belagiannis V. et al. Deeper depth prediction with fully convolutional residual networks. [C]∥2016 Fourth International Conference on 3D Vision (3DV), October 25-28, 2016, Stanford, CA, USA. New York: IEEE, 239-248(2016).

    [42] Yin X C, Wang X W, Du X G. et al. Scale recovery for monocular visual odometry using depth estimated with deep convolutional neural fields. [C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE, 5871-5879(2017).

    [43] Dimitrievski M, Goossens B, Veelaert P. et al. High resolution depth reconstruction from monocular images and sparse point clouds using deep convolutional neural network[J]. Proceedings of SPIE, 10410, 104100H(2017). http://spie.org/Publications/Proceedings/Paper/10.1117/12.2273959

    [44] Mancini M, Costante G, Valigi P. et al. Toward domain independence for learning-based monocular depth estimation[J]. IEEE Robotics and Automation Letters, 2, 1778-1785(2017). http://ieeexplore.ieee.org/document/7829276/