Self-Supervised Monocular Depth Hints(ICCV 2019)
A paddle implementation of the paper Self-Supervised Monocular Depth Hints.
[ICCV 2019]
Abstract
Monocular depth estimators can be trained with various forms of self-supervision from binocular-stereo data to circumvent the need for high-quality laser-scans or other ground-truth data. The disadvantage, however, is that the photometric reprojection losses used with self-supervised learning typically have multiple local minima. These plausible-looking alternatives to ground-truth can restrict what a regression network learns, causing it to predict depth maps of limited quality. As one prominent example, depth discontinuities around thin structures are often incorrectly estimated by current state-of-the-art methods. Here, we study the problem of ambiguous reprojections in depth-prediction from stereo-based self-supervision, and introduce Depth Hints to alleviate their effects. Depth Hints are complementary depth-suggestions obtained from simple off-the-shelf stereo algorithms. These hints enhance an existing photometric loss function, and are used to guide a network to learn better weights. They require no additional data, and are assumed to be right only sometimes. We show that using our Depth Hints gives a substantial boost when training several leading self-supervised-from-stereo models, not just our own. Further, combined with other good practices, we produce state-of-the-art depth predictions on the KITTI benchmark.
Training
The code for Depth Hints builds upon Monodepth2.
To train using depth hints:
- Clone this repository
- Run
python precompute_depth_hints.py --data_path <your_KITTI_path>
, optionally setting --save_path
(will default to <data_path>/depth_hints) and --filenames
(will default to training and validation images for the eigen split). This will create the "fused" depth hints referenced in the paper. This process takes approximately 7 hours on a Tesla V100 GPU.
- Add the flag
--use_depth_hints
to your usual monodepth2 training command, optionally also setting --depth_hint_path
(will default to <data_path>/depth_hints). See below for a full command.
KITTI Datasets Pretraining
Run the script ./configs/depth_hints/depth_hints.sh
to pre-train on KITTI datsets. Please update --data_path
in the bash file as your training data path and specify weights_init
as the directory path of backbone weights, i.e., /root/paddlejob/shenzhelun/PaddleMono-master/weights/backbone_weight/resnet18-pytorch
.
Finetuning
After training on 640x192 resolution, increase the resolution to 1024x320 for fine-tuning.
Run the script ./configs/depth_hints/depth_hints.sh
to jointly finetune the pre-train model on KITTI dataset.
Please update --data_path
and --load_weights_folder
as your training data path and pretrained weights folder.
Evaluation
run the script ./configs/depth_hints/depth_hints.sh
to evaluate the model.
Models
Pretraining Model
You can use this checkpoint to reproduce the result of depth_hints_640x192.
Finetuneing Model
You can use this checkpoint to reproduce the result of depth_hints_1024x320.
backbone weights
You can use this checkpoint to load the backbone weights of resnet18.
Please put pretraining model weights and backbone weights in the same directory and specify load_weights_folder
as the directory path of pretraining model weights, i.e., weights/weights_best_640x192/
when running the depth_hints.sh
.
|-- weights/weights_best_640x192
|-- resnet18_pretrain.h5
|-- encoder.pdparams
|-- depth.pdparams
|-- pose_encoder.pdparams
|-- pose.pdparams
If you want to put the backbone weights on the other directory, please further specify weights_init
as the directory path of backbone weights, i.e., /root/paddlejob/shenzhelun/PaddleMono-master/weights/backbone_weight/resnet18-pytorch
Citation
If you find this code useful in your research, please cite:
@inproceedings{watson-2019-depth-hints,
title = {Self-Supervised Monocular Depth Hints},
author = {Jamie Watson and
Michael Firman and
Gabriel J. Brostow and
Daniyar Turmukhambetov},
booktitle = {The International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}