This is the reference PyTorch implementation for training and testing depth estimation models using the method described in
> **Digging into Self-Supervised Monocular Depth Prediction**
>
> [Clément Godard](http://www0.cs.ucl.ac.uk/staff/C.Godard/), [Oisin Mac Aodha](http://vision.caltech.edu/~macaodha/), [Michael Firman](http://www.michaelfirman.co.uk) and [Gabriel J. Brostow](http://www0.cs.ucl.ac.uk/staff/g.brostow/)
>
> [ICCV 2019 (arXiv pdf)](https://arxiv.org/abs/1806.01260)
<p align="center">
<img src="assets/teaser.gif" alt="example input output gif" width="600" />
</p>
This code is for non-commercial use; please see the [license file](LICENSE) for terms.
If you find our work useful in your research please consider citing our paper:
```
@inproceedings{monodepth2,
  title     = {Digging into Self-Supervised Monocular Depth Prediction},
  author    = {Cl{\'{e}}ment Godard and
               Oisin {Mac Aodha} and
               Michael Firman and
               Gabriel J. Brostow},
  booktitle = {The International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2019}
}
```
## ⚙️ Setup
Assuming a fresh [Anaconda](https://www.anaconda.com/download/) distribution, you can install the dependencies with:
```shell
conda install pytorch=0.4.1 torchvision=0.2.1 -c pytorch
pip install tensorboardX==1.4
conda install opencv=3.3.1 # just needed for evaluation
```
We ran our experiments with PyTorch 0.4.1, CUDA 9.1, Python 3.6.6 and Ubuntu 18.04.
We have also successfully trained models with PyTorch 1.0, and our code is compatible with Python 2.7. You may have issues installing OpenCV version 3.3.1 if you use Python 3.7; in that case we recommend creating a virtual environment with Python 3.6.6, e.g. `conda create -n monodepth2 python=3.6.6 anaconda`.
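After installation, a quick sanity check (plain PyTorch/torchvision calls, not a script from this repo) confirms the installed versions and whether a CUDA device is visible:
```shell
python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"
```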
We recommend using a [conda environment](https://conda.io/docs/user-guide/tasks/manage-environments.html) to avoid dependency conflicts.
We also recommend using `pillow-simd` instead of `pillow` for faster image preprocessing in the dataloaders.
## 🖼️ Prediction for a single image
You can predict scaled disparity for a single image with:
```shell
python test_simple.py --image_path assets/test_image.jpg --model_name mono+stereo_640x192
```
or, if you are using a stereo-trained model, you can estimate metric depth with
```shell
python test_simple.py --image_path assets/test_image.jpg --model_name mono+stereo_640x192 --pred_metric_depth
```
On its first run either of these commands will download the `mono+stereo_640x192` pretrained model (99MB) into the `models/` folder.
We provide the following options for `--model_name`:
| `--model_name`          | Training modality | ImageNet pretrained? | Model resolution | KITTI abs. rel. error | delta < 1.25 |
|-------------------------|-------------------|----------------------|------------------|-----------------------|--------------|
| [`mono_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_640x192.zip) | Mono | Yes | 640 x 192 | 0.115 | 0.877 |
| [`stereo_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_640x192.zip) | Stereo | Yes | 640 x 192 | 0.109 | 0.864 |
| [`mono+stereo_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_640x192.zip) | Mono + Stereo | Yes | 640 x 192 | 0.106 | 0.874 |
| [`mono_1024x320`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_1024x320.zip) | Mono | Yes | 1024 x 320 | 0.115 | 0.879 |
| [`stereo_1024x320`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_1024x320.zip) | Stereo | Yes | 1024 x 320 | 0.107 | 0.874 |
| [`mono+stereo_1024x320`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_1024x320.zip) | Mono + Stereo | Yes | 1024 x 320 | 0.106 | 0.876 |
| [`mono_no_pt_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_no_pt_640x192.zip) | Mono | No | 640 x 192 | 0.132 | 0.845 |
| [`stereo_no_pt_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_no_pt_640x192.zip) | Stereo | No | 640 x 192 | 0.130 | 0.831 |
| [`mono+stereo_no_pt_640x192`](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_no_pt_640x192.zip) | Mono + Stereo | No | 640 x 192 | 0.127 | 0.836 |
You can also download models trained on the odometry split with [monocular](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_odom_640x192.zip) and [mono+stereo](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_odom_640x192.zip) training modalities.
Finally, we provide ResNet-50 depth estimation models trained [with ImageNet pretrained weights](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_resnet50_640x192.zip) and [from scratch](https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_resnet50_no_pt_640x192.zip).
Make sure to set `--num_layers 50` if using these.
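If you would rather call the networks from Python than go through `test_simple.py`, the following is a minimal sketch modelled on `depth_prediction_example.ipynb`; it assumes the `ResnetEncoder` / `DepthDecoder` interfaces in `networks/` and that one of the pretrained zips above has been extracted to `models/mono+stereo_640x192`:
```python
# Minimal sketch of programmatic disparity prediction; the paths and the chosen
# model name are illustrative -- see test_simple.py for the full pipeline.
import torch
import PIL.Image as pil
from torchvision import transforms

import networks

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_path = "models/mono+stereo_640x192"  # extracted pretrained model

# Load the encoder (use num_layers=50 for the ResNet-50 models above)
encoder = networks.ResnetEncoder(18, False)
loaded_dict_enc = torch.load(model_path + "/encoder.pth", map_location=device)
feed_height, feed_width = loaded_dict_enc["height"], loaded_dict_enc["width"]
encoder.load_state_dict({k: v for k, v in loaded_dict_enc.items()
                         if k in encoder.state_dict()})
encoder.to(device).eval()

# Load the depth decoder
depth_decoder = networks.DepthDecoder(num_ch_enc=encoder.num_ch_enc, scales=range(4))
depth_decoder.load_state_dict(torch.load(model_path + "/depth.pth", map_location=device))
depth_decoder.to(device).eval()

# Resize the input to the training resolution and predict sigmoid disparity
image = pil.open("assets/test_image.jpg").convert("RGB")
image = image.resize((feed_width, feed_height), pil.LANCZOS)
inputs = transforms.ToTensor()(image).unsqueeze(0).to(device)

with torch.no_grad():
    features = encoder(inputs)
    disp = depth_decoder(features)[("disp", 0)]  # (1, 1, feed_height, feed_width)
```
To convert the sigmoid disparity into a depth map, see `disp_to_depth` in `layers.py`.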
## 💾 KITTI training data
You can download the entire [raw KITTI dataset](http://www.cvlibs.net/datasets/kitti/raw_data.php) by running:
```shell
wget -i splits/kitti_archives_to_download.txt -P kitti_data/
```
Then unzip with
```shell
cd kitti_data
unzip "*.zip"
cd ..
```
**Warning:** it weighs about **175GB**, so make sure you have enough space to unzip too!
Our default settings expect that you have converted the png images to jpeg with this command, **which also deletes the raw KITTI `.png` files**:
```shell
find kitti_data/ -name '*.png' | parallel 'convert -quality 92 -sampling-factor 2x2,1x1,1x1 {.}.png {.}.jpg && rm {}'
```
**or** you can skip this conversion step and train from raw png files by adding the flag `--png` when training, at the expense of slower load times.
The above conversion command creates images which match our experiments, where KITTI `.png` images were converted to `.jpg` on Ubuntu 16.04 with default chroma subsampling `2x2,1x1,1x1`.
We found that Ubuntu 18.04 defaults to `2x2,2x2,2x2`, which gives different results, hence the explicit parameter in the conversion command.
You can also place the KITTI dataset wherever you like and point towards it with the `--data_path` flag during training and evaluation.
**Splits**
The train/test/validation splits are defined in the `splits/` folder.
By default, the code will train a depth model using [Zhou's subset](https://github.com/tinghuiz/SfMLearner) of the standard Eigen split of KITTI, which is designed for monocular training.
You can also train a model using the new [benchmark split](http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction) or the [odometry split](http://www.cvlibs.net/datasets/kitti/eval_odometry.php) by setting the `--split` flag.
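For example, to train on the benchmark split (the model name below is just illustrative, and `benchmark` is assumed to be among the split names accepted by `options.py`):
```shell
python train.py --model_name mono_benchmark --split benchmark
```
Training on the odometry split additionally needs the KITTI odometry data and the matching dataset loader, so check the `--dataset` options in `options.py` before using `--split odom`.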
**Custom dataset**
You can train on a custom monocular or stereo dataset by writing a new dataloader class which inherits from `MonoDataset`; see the `KITTIDataset` class in `datasets/kitti_dataset.py` for an example.
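As a rough illustration, such a loader might look like the skeleton below. The class name, intrinsics, resolution and file layout are all hypothetical; the overridden hooks mirror those in `KITTIDataset`, so check `datasets/mono_dataset.py` for the exact interface.
```python
# Hypothetical skeleton for a custom monocular dataset, intended to live in
# datasets/ next to kitti_dataset.py; every concrete value here (intrinsics,
# image size, file naming) is a placeholder.
import os
import numpy as np
import PIL.Image as pil

from .mono_dataset import MonoDataset


class MyDataset(MonoDataset):
    def __init__(self, *args, **kwargs):
        super(MyDataset, self).__init__(*args, **kwargs)
        # Camera intrinsics normalised by image width/height, as in KITTIDataset
        self.K = np.array([[0.58, 0, 0.5, 0],
                           [0, 1.92, 0.5, 0],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=np.float32)
        self.full_res_shape = (1280, 384)  # raw image (width, height)

    def check_depth(self):
        # Return True only if ground-truth depth exists and get_depth() is implemented
        return False

    def get_color(self, folder, frame_index, side, do_flip):
        # Load one frame; self.loader and self.img_ext are provided by MonoDataset
        path = os.path.join(self.data_path, folder,
                            "{:010d}{}".format(frame_index, self.img_ext))
        color = self.loader(path)
        if do_flip:
            color = color.transpose(pil.FLIP_LEFT_RIGHT)
        return color
```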
## ⏳ Training
By default models and tensorboard event files are saved to `~/tmp/<model_name>`.
This can be changed with the `--log_dir` flag.
**Monocular training:**
```shell
python train.py --model_name mono_model
```
**Stereo training:**
Our code defaults to using Zhou's subsampled Eigen training data. For stereo-only training we have to specify that we want to use the full Eigen training set; see the paper for details.
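A stereo-only training command along these lines should work; this is a sketch assuming the `--frame_ids`, `--use_stereo` and `--split` flags defined in `options.py`:
```shell
python train.py --model_name stereo_model \
  --frame_ids 0 --use_stereo --split eigen_full
```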