Monodepthv2 (ICCV 2019)
A PaddlePaddle implementation of the paper Digging Into Self-Supervised Monocular Depth Estimation [ICCV 2019].
Abstract
Per-pixel ground-truth depth data is challenging to acquire at scale. To overcome this limitation, self-supervised learning has emerged as a promising alternative for training models to perform monocular depth estimation. In this paper, we propose a set of improvements, which together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods. Research on self-supervised monocular training usually explores increasingly complex architectures, loss functions, and image formation models, all of which have recently helped to close the gap with fully-supervised methods. We show that a surprisingly simple model, and associated design choices, lead to superior predictions. In particular, we propose (i) a minimum reprojection loss, designed to robustly handle occlusions, (ii) a full-resolution multi-scale sampling method that reduces visual artifacts, and (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions. We demonstrate the effectiveness of each component in isolation, and show high quality, state-of-the-art results on the KITTI benchmark.
Training
KITTI Datasets Pretraining
Run the script ./configs/monodepthv2/mdp.sh to pre-train on the KITTI dataset. Please update --data_path in the bash file to your training data path and set weights_init to the directory containing the backbone weights, e.g., /root/paddlejob/shenzhelun/PaddleMono-master/weights/backbone_weight/resnet18-pytorch.
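For quick reference, a pre-training run looks roughly like the sketch below; the actual contents of mdp.sh are not reproduced here, so the edited values are illustrative placeholders.

    # Inside ./configs/monodepthv2/mdp.sh, point the options at your setup
    # (placeholder paths):
    #   --data_path    /path/to/kitti_data                        # KITTI training data root
    #   weights_init   /path/to/backbone_weight/resnet18-pytorch  # ResNet-18 backbone weights
    # Then launch pre-training at the default 640x192 resolution:
    bash ./configs/monodepthv2/mdp.sh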
Finetuning
After training at 640x192 resolution, increase the resolution to 1024x320 for fine-tuning. Run the script ./configs/monodepthv2/mdp.sh to jointly fine-tune the pre-trained model on the KITTI dataset. Please update --data_path and --load_weights_folder to your training data path and pre-trained weights folder, respectively.
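Analogously, a fine-tuning run might look like the following sketch; the paths are placeholders and the exact script layout may differ.

    # Inside ./configs/monodepthv2/mdp.sh, point the options at your data and
    # the pre-trained 640x192 checkpoint (placeholder paths):
    #   --data_path            /path/to/kitti_data
    #   --load_weights_folder  weights/weights_best_640x192/
    # Then launch fine-tuning at the higher 1024x320 resolution:
    bash ./configs/monodepthv2/mdp.sh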
Evaluation
Run the script ./configs/monodepthv2/mdp.sh to evaluate the model.
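Evaluation uses the same entry point; assuming the same --data_path and --load_weights_folder options apply here, a run might look like:

    # Set --data_path to the KITTI data root and --load_weights_folder to the
    # checkpoint folder you want to evaluate (placeholder value):
    #   --load_weights_folder  weights/weights_best_640x192/
    bash ./configs/monodepthv2/mdp.sh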
Models
Pretraining Model
You can use this checkpoint to reproduce the result of Monodepth2_640x192.
Finetuning Model
You can use this checkpoint to reproduce the result of Monodepth2_1024x320.
Backbone Weights
You can use this checkpoint to load the backbone weights of resnet18.
Please put the pretraining model weights and the backbone weights in the same directory, and specify load_weights_folder as the directory path of the pretraining model weights, e.g., weights/weights_best_640x192/, when running mdp.sh:
|-- weights/weights_best_640x192
    |-- resnet18_pretrain.h5
    |-- encoder.pdparams
    |-- depth.pdparams
    |-- pose_encoder.pdparams
    |-- pose.pdparams
If you want to put the backbone weights in a different directory, please additionally specify weights_init as the directory path of the backbone weights, e.g., /root/paddlejob/shenzhelun/PaddleMono-master/weights/backbone_weight/resnet18-pytorch.
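Putting the two options together, a run that keeps the backbone weights in a separate directory might be configured as sketched below (paths are placeholders).

    # Pre-trained model weights (encoder/depth/pose_encoder/pose .pdparams):
    #   load_weights_folder  weights/weights_best_640x192/
    # Backbone weights kept in a separate directory:
    #   weights_init         /root/paddlejob/shenzhelun/PaddleMono-master/weights/backbone_weight/resnet18-pytorch
    bash ./configs/monodepthv2/mdp.sh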
Citation
If you find this code useful in your research, please cite:
@inproceedings{godard2019digging,
  title={Digging into self-supervised monocular depth estimation},
  author={Godard, Cl{\'e}ment and Mac Aodha, Oisin and Firman, Michael and Brostow, Gabriel J},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={3828--3838},
  year={2019}
}