This is the reference PyTorch implementation for training and testing depth estimation models using the method described in
> **Digging into Self-Supervised Monocular Depth Prediction**
>
> [Clément Godard](http://www0.cs.ucl.ac.uk/staff/C.Godard/), [Oisin Mac Aodha](http://vision.caltech.edu/~macaodha/), [Michael Firman](http://www.michaelfirman.co.uk) and [Gabriel J. Brostow](http://www0.cs.ucl.ac.uk/staff/g.brostow/)
>
> [ICCV 2019 (arXiv pdf)](https://arxiv.org/abs/1806.01260)
<p align="center">
<img src="assets/teaser.gif" alt="example input output gif" width="600" />
</p>
This code is for non-commercial use; please see the [license file](LICENSE) for terms.
If you find our work useful in your research please consider citing our paper:
```
@inproceedings{monodepth2,
  title     = {Digging into Self-Supervised Monocular Depth Prediction},
  author    = {Cl{\'{e}}ment Godard and
               Oisin {Mac Aodha} and
               Michael Firman and
               Gabriel J. Brostow},
  booktitle = {The International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2019}
}
```
## ⚙️ Setup
Assuming a fresh [Anaconda](https://www.anaconda.com/download/) distribution, you can install the dependencies with:
```shell
conda install pytorch=0.4.1 torchvision=0.2.1 -c pytorch
pip install tensorboardX==1.4
conda install opencv=3.3.1 # just needed for evaluation
```
We ran our experiments with PyTorch 0.4.1, CUDA 9.1, Python 3.6.6 and Ubuntu 18.04.
We have also successfully trained models with PyTorch 1.0, and our code is compatible with Python 2.7. You may have issues installing OpenCV version 3.3.1 if you use Python 3.7; in that case we recommend creating a virtual environment with Python 3.6.6, e.g. `conda create -n monodepth2 python=3.6.6 anaconda`.
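After installation, a quick sanity check (plain PyTorch/torchvision calls, not a script from this repo) confirms the installed versions and whether a CUDA device is visible:
```shell
python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"
```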
We recommend using a [conda environment](https://conda.io/docs/user-guide/tasks/manage-environments.html) to avoid dependency conflicts.
We also recommend using `pillow-simd` instead of `pillow` for faster image preprocessing in the dataloaders.
## 🖼️ Prediction for a single image
You can predict scaled disparity for a single image with:
```shell
python test_simple.py --image_path assets/test_image.jpg --model_name mono+stereo_640x192
```
or, if you are using a stereo-trained model, you can estimate metric depth with
```shell
python test_simple.py --image_path assets/test_image.jpg --model_name mono+stereo_640x192 --pred_metric_depth
```
On its first run either of these commands will download the `mono+stereo_640x192` pretrained model (99MB) into the `models/` folder.
We provide the following options for `--model_name`:
| `--model_name`          | Training modality | ImageNet pretrained? | Model resolution | KITTI abs. rel. error | delta < 1.25 |
|-------------------------|-------------------|----------------------|------------------|-----------------------|--------------|
| [`mono_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_640x192.zip) | Mono | Yes | 640 x 192 | 0.115 | 0.877 |
| [`stereo_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_640x192.zip) | Stereo | Yes | 640 x 192 | 0.109 | 0.864 |
| [`mono+stereo_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_640x192.zip) | Mono + Stereo | Yes | 640 x 192 | 0.106 | 0.874 |
| [`mono_1024x320`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_1024x320.zip) | Mono | Yes | 1024 x 320 | 0.115 | 0.879 |
| [`stereo_1024x320`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_1024x320.zip) | Stereo | Yes | 1024 x 320 | 0.107 | 0.874 |
| [`mono+stereo_1024x320`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_1024x320.zip) | Mono + Stereo | Yes | 1024 x 320 | 0.106 | 0.876 |
| [`mono_no_pt_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_no_pt_640x192.zip) | Mono | No | 640 x 192 | 0.132 | 0.845 |
| [`stereo_no_pt_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_no_pt_640x192.zip) | Stereo | No | 640 x 192 | 0.130 | 0.831 |
| [`mono+stereo_no_pt_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_no_pt_640x192.zip) | Mono + Stereo | No | 640 x 192 | 0.127 | 0.836 |
You can also download models trained on the odometry split with [monocular](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_odom_640x192.zip) and [mono+stereo](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_odom_640x192.zip) training modalities.
Finally, we provide ResNet-50 depth estimation models trained [with ImageNet pretrained weights](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_resnet50_640x192.zip) and [from scratch](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_resnet50_no_pt_640x192.zip).
Make sure to set `--num_layers 50` if using these.
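If you would rather call the networks from Python than go through `test_simple.py`, the following is a minimal sketch modelled on `depth_prediction_example.ipynb`; it assumes the `ResnetEncoder` / `DepthDecoder` interfaces in `networks/` and that one of the pretrained zips above has been extracted to `models/mono+stereo_640x192`:
```python
# Minimal sketch of programmatic disparity prediction; the paths and the chosen
# model name are illustrative -- see test_simple.py for the full pipeline.
import torch
import PIL.Image as pil
from torchvision import transforms

import networks

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_path = "models/mono+stereo_640x192"  # extracted pretrained model

# Load the encoder (use num_layers=50 for the ResNet-50 models above)
encoder = networks.ResnetEncoder(18, False)
loaded_dict_enc = torch.load(model_path + "/encoder.pth", map_location=device)
feed_height, feed_width = loaded_dict_enc["height"], loaded_dict_enc["width"]
encoder.load_state_dict({k: v for k, v in loaded_dict_enc.items()
                         if k in encoder.state_dict()})
encoder.to(device).eval()

# Load the depth decoder
depth_decoder = networks.DepthDecoder(num_ch_enc=encoder.num_ch_enc, scales=range(4))
depth_decoder.load_state_dict(torch.load(model_path + "/depth.pth", map_location=device))
depth_decoder.to(device).eval()

# Resize the input to the training resolution and predict sigmoid disparity
image = pil.open("assets/test_image.jpg").convert("RGB")
image = image.resize((feed_width, feed_height), pil.LANCZOS)
inputs = transforms.ToTensor()(image).unsqueeze(0).to(device)

with torch.no_grad():
    features = encoder(inputs)
    disp = depth_decoder(features)[("disp", 0)]  # (1, 1, feed_height, feed_width)
```
To convert the sigmoid disparity into a depth map, see `disp_to_depth` in `layers.py`.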
## 💾 KITTI training data
You can download the entire [raw KITTI dataset](http://www.cvlibs.net/datasets/kitti/raw_data.php) by running:
```shell
wget -i splits/kitti_archives_to_download.txt -P kitti_data/
```
Then unzip with
```shell
cd kitti_data
unzip "*.zip"
cd ..
```
**Warning:** it weighs about **175GB**, so make sure you have enough space to unzip too!
Our default settings expect that you have converted the png images to jpeg with this command, **which also deletes the raw KITTI `.png` files**:
```shell
find kitti_data/ -name '*.png' | parallel 'convert -quality 92 -sampling-factor 2x2,1x1,1x1 {.}.png {.}.jpg && rm {}'
```
**or** you can skip this conversion step and train from raw png files by adding the flag `--png` when training, at the expense of slower load times.
The above conversion command creates images which match our experiments, where KITTI `.png` images were converted to `.jpg` on Ubuntu 16.04 with default chroma subsampling `2x2,1x1,1x1`.
We found that Ubuntu 18.04 defaults to `2x2,2x2,2x2`, which gives different results, hence the explicit parameter in the conversion command.
You can also place the KITTI dataset wherever you like and point towards it with the `--data_path` flag during training and evaluation.
**Splits**
The train/test/validation splits are defined in the `splits/` folder.
By default, the code will train a depth model using [Zhou's subset](https://github.com/tinghuiz/SfMLearner) of the standard Eigen split of KITTI, which is designed for monocular training.
You can also train a model using the new [benchmark split](http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction) or the [odometry split](http://www.cvlibs.net/datasets/kitti/eval_odometry.php) by setting the `--split` flag.
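For example, to train on the benchmark split (the model name below is just illustrative, and `benchmark` is assumed to be among the split names accepted by `options.py`):
```shell
python train.py --model_name mono_benchmark --split benchmark
```
Training on the odometry split additionally needs the KITTI odometry data and the matching dataset loader, so check the `--dataset` options in `options.py` before using `--split odom`.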
**Custom dataset**
You can train on a custom monocular or stereo dataset by writing a new dataloader class which inherits from `MonoDataset`; see the `KITTIDataset` class in `datasets/kitti_dataset.py` for an example.
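As a rough illustration, such a loader might look like the skeleton below. The class name, intrinsics, resolution and file layout are all hypothetical; the overridden hooks mirror those in `KITTIDataset`, so check `datasets/mono_dataset.py` for the exact interface.
```python
# Hypothetical skeleton for a custom monocular dataset, intended to live in
# datasets/ next to kitti_dataset.py; every concrete value here (intrinsics,
# image size, file naming) is a placeholder.
import os
import numpy as np
import PIL.Image as pil

from .mono_dataset import MonoDataset


class MyDataset(MonoDataset):
    def __init__(self, *args, **kwargs):
        super(MyDataset, self).__init__(*args, **kwargs)
        # Camera intrinsics normalised by image width/height, as in KITTIDataset
        self.K = np.array([[0.58, 0, 0.5, 0],
                           [0, 1.92, 0.5, 0],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=np.float32)
        self.full_res_shape = (1280, 384)  # raw image (width, height)

    def check_depth(self):
        # Return True only if ground-truth depth exists and get_depth() is implemented
        return False

    def get_color(self, folder, frame_index, side, do_flip):
        # Load one frame; self.loader and self.img_ext are provided by MonoDataset
        path = os.path.join(self.data_path, folder,
                            "{:010d}{}".format(frame_index, self.img_ext))
        color = self.loader(path)
        if do_flip:
            color = color.transpose(pil.FLIP_LEFT_RIGHT)
        return color
```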
## ⏳ Training
By default models and tensorboard event files are saved to `~/tmp/<model_name>`.
This can be changed with the `--log_dir` flag.
**Monocular training:**
```shell
python train.py --model_name mono_model
```
**Stereo training:**
Our code defaults to using Zhou's subsampled Eigen training data. For stereo-only training we have to specify that we want to use the full Eigen training set; see the paper for details.
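A stereo-only training command along these lines should work; this is a sketch assuming the `--frame_ids`, `--use_stereo` and `--split` flags defined in `options.py`:
```shell
python train.py --model_name stereo_model \
  --frame_ids 0 --use_stereo --split eigen_full
```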