This is an unofficial PyTorch implementation of DeepLab v2 [1] with a ResNet-101 backbone.
`torch.hub` is supported.

Train set | Eval set | Code | Weight | CRF? | Pixel Accuracy | Mean Accuracy | Mean IoU | FreqW IoU |
---|---|---|---|---|---|---|---|---|
10k train † | 10k val † | Official [2] | | | 65.1 | 45.5 | 34.4 | 50.4 |
10k train † | 10k val † | This repo | Download | | 65.8 | 45.7 | 34.8 | 51.2 |
10k train † | 10k val † | This repo | Download | ✓ | 67.1 | 46.4 | 35.6 | 52.5 |
164k train | 164k val | This repo | Download ‡ | | 66.8 | 51.2 | 39.1 | 51.5 |
164k train | 164k val | This repo | Download ‡ | ✓ | 67.6 | 51.5 | 39.7 | 52.3 |

† Images and labels are pre-warped to square-shape 513x513
‡ Note for SPADE followers: The provided COCO-Stuff 164k weight has been kept intact since 2019/02/23.

Train set | Eval set | Code | Weight | CRF? | Pixel Accuracy | Mean Accuracy | Mean IoU | FreqW IoU |
---|---|---|---|---|---|---|---|---|
trainaug | val | Official [3] | | | - | - | 76.35 | - |
trainaug | val | Official [3] | | ✓ | - | - | 77.69 | - |
trainaug | val | This repo | Download | | 94.64 | 86.50 | 76.65 | 90.41 |
trainaug | val | This repo | Download | ✓ | 95.04 | 86.64 | 77.93 | 91.06 |

Required Python packages are listed in the Anaconda configuration file `configs/conda_env.yaml`.
Please modify the listed `cudatoolkit=10.2` and `python=3.6` as needed and run the following commands.
# Set up with Anaconda
conda env create -f configs/conda_env.yaml
conda activate deeplab-pytorch
Caffemodels pre-trained on COCO and PASCAL VOC datasets are released by the DeepLab authors.
In accordance with the papers [1,2], this repository uses the COCO-trained parameters as initial weights.
$ bash scripts/setup_caffemodels.sh
# Generate "deeplabv1_resnet101-coco.pth" from "init.caffemodel"
$ python convert.py --dataset coco
# Generate "deeplabv2_resnet101_msc-vocaug.pth" from "train2_iter_20000.caffemodel"
$ python convert.py --dataset voc12
To train DeepLab v2 on PASCAL VOC 2012:
python main.py train \
--config-path configs/voc12.yaml
To evaluate the performance on a validation set:
python main.py test \
--config-path configs/voc12.yaml \
--model-path data/models/voc12/deeplabv2_resnet101_msc/train_aug/checkpoint_final.pth
Note: This command saves the predicted logit maps (`.npy`) and the scores (`.json`).
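The four scores reported in the tables (Pixel Accuracy, Mean Accuracy, Mean IoU, FreqW IoU) are commonly derived from a confusion matrix. The following is a minimal sketch under that assumption, not the repository's own evaluation code; the function name `segmentation_scores` and the toy matrix are hypothetical, and classes that never occur in the ground truth are assumed to be excluded from the averages.

```python
def segmentation_scores(conf):
    """Compute segmentation scores from a confusion matrix.

    conf[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = [conf[i][i] for i in range(n)]
    gt = [sum(conf[i]) for i in range(n)]  # pixels per ground-truth class
    pred = [sum(conf[i][j] for i in range(n)) for j in range(n)]  # pixels per predicted class
    valid = [i for i in range(n) if gt[i] > 0]  # classes present in the ground truth

    pixel_acc = sum(correct) / total
    mean_acc = sum(correct[i] / gt[i] for i in valid) / len(valid)
    # IoU = intersection / union, per class
    iou = {i: correct[i] / (gt[i] + pred[i] - correct[i]) for i in valid}
    mean_iou = sum(iou.values()) / len(valid)
    # Frequency-weighted IoU: each class IoU weighted by its ground-truth share
    freqw_iou = sum(gt[i] / total * iou[i] for i in valid)

    return {
        "Pixel Accuracy": pixel_acc,
        "Mean Accuracy": mean_acc,
        "Mean IoU": mean_iou,
        "FreqW IoU": freqw_iou,
    }


# Toy two-class example (hypothetical counts)
scores = segmentation_scores([[3, 1], [1, 5]])
```

In this toy example 8 of the 10 pixels lie on the diagonal, so the pixel accuracy is 0.8, while the mean IoU is lower because every misclassified pixel counts against the union of two classes.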
To re-evaluate with CRF post-processing:
python main.py crf \
--config-path configs/voc12.yaml
Running the above commands in sequence is equivalent to `bash scripts/train_eval.sh`.
To monitor the training loss, run the following command in a separate terminal.
tensorboard --logdir data/logs
Please specify the appropriate configuration files for the other datasets.
Dataset | Config file | #Iterations | Classes |
---|---|---|---|
PASCAL VOC 2012 | `configs/voc12.yaml` | 20,000 | 20 foreground + 1 background |
COCO-Stuff 10k | `configs/cocostuff10k.yaml` | 20,000 | 182 thing/stuff |
COCO-Stuff 164k | `configs/cocostuff164k.yaml` | 100,000 | 182 thing/stuff |
Note: Although the label indices range from 0 to 181 in COCO-Stuff 10k/164k, only 171 classes are supervised.
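Common settings:
- GPU: restrict the GPUs visible to the process with `CUDA_VISIBLE_DEVICES=`.
- Gradient accumulation: gradients of small batches are accumulated before each weight update (`batch_size * iter_size = 10`). GPU memory usage is approx. 11.2 GB with the default setting (tested on a single Titan X). You can reduce it with a smaller `batch_size`.
- Learning rate: the learning rate is multiplied by `(1-iter/iter_max)**power` at every 10 iterations.
- Monitoring: the average loss (`average_loss` in Caffe) can be monitored in TensorBoard.

[Figure: processed images and labels in COCO-Stuff 164k]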
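Two of the common settings above, the polynomial learning-rate factor `(1-iter/iter_max)**power` applied every 10 iterations and the accumulation of gradients over `iter_size` small batches, can be illustrated in plain Python. This is a sketch under assumptions rather than the repository's training loop: the helper names, the default `power=0.9`, and the toy least-squares model are hypothetical, and dividing each small-batch gradient by `iter_size` is one common way to make the accumulated update match a single large batch.

```python
def poly_lr(init_lr, iteration, iter_max, power=0.9, step_size=10):
    """Polynomial decay, refreshed only every `step_size` iterations."""
    last_update = (iteration // step_size) * step_size
    return init_lr * (1 - last_update / iter_max) ** power


def grad_mse(w, batch):
    """Gradient w.r.t. w of the mean squared error of y = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, samples, batch_size, iter_size):
    """Accumulate gradients of `iter_size` small batches of `batch_size` samples."""
    grad = 0.0
    for k in range(iter_size):
        chunk = samples[k * batch_size:(k + 1) * batch_size]
        grad += grad_mse(w, chunk) / iter_size  # scale so the sum is an average
    return grad


# Toy data: 10 samples processed as 2 small batches of 5 (batch_size * iter_size = 10)
samples = [(float(x), 3.0 * x + 1.0) for x in range(10)]
full_batch = grad_mse(0.5, samples)
accumulated = accumulated_grad(0.5, samples, batch_size=5, iter_size=2)
```

With equally sized small batches, `accumulated` agrees with `full_batch` up to floating-point error, which is why accumulation trades extra iterations for lower GPU memory usage without changing the update.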
You can use the pre-trained models, the converted models, or your models.
To process a single image:
python demo.py single \
--config-path configs/voc12.yaml \
--model-path deeplabv2_resnet101_msc-vocaug-20000.pth \
--image-path image.jpg
To run on a webcam:
python demo.py live \
--config-path configs/voc12.yaml \
--model-path deeplabv2_resnet101_msc-vocaug-20000.pth
To run CRF post-processing, add `--crf`. To run on a CPU, add `--cpu`.
Model setup with two lines
import torch.hub
model = torch.hub.load("kazuto1011/deeplab-pytorch", "deeplabv2_resnet101", pretrained='cocostuff164k', n_classes=182)
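- While the official Caffe code uses an interpolation layer (`Interp` layer) for downsampling a label for only the 0.5x input, this codebase downsamples labels for both the 0.5x and 0.75x inputs with nearest interpolation (`PIL.Image.resize`, related issue).
- Interpolation is performed with `align_corners=False`.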
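To see why nearest interpolation suits label maps, note that it only copies existing class indices, so it never produces blended values that correspond to no class. The sketch below shows the idea in plain Python; `downsample_nearest` is a hypothetical helper, the label values are made up, and the exact pixels sampled may differ from what `PIL.Image.resize` selects.

```python
def downsample_nearest(label, scale):
    """Downsample a 2D label map by picking the source pixel nearest to
    each output pixel centre (no averaging of class indices)."""
    h, w = len(label), len(label[0])
    new_h, new_w = max(1, int(h * scale)), max(1, int(w * scale))
    out = []
    for i in range(new_h):
        src_i = min(h - 1, int((i + 0.5) * h / new_h))
        row = []
        for j in range(new_w):
            src_j = min(w - 1, int((j + 0.5) * w / new_w))
            row.append(label[src_i][src_j])
        out.append(row)
    return out


# Toy label map: classes 0 and 15, plus 255 as a hypothetical "ignore" index
label = [
    [0, 0, 15, 15],
    [0, 0, 15, 15],
    [255, 255, 15, 15],
    [255, 255, 15, 15],
]
half = downsample_nearest(label, 0.5)            # 2x2 map for the 0.5x input
three_quarters = downsample_nearest(label, 0.75)  # 3x3 map for the 0.75x input
```

Every value in `half` and `three_quarters` is one of the original indices, whereas averaging neighbouring pixels could yield values such as 7 or 135 that belong to no class.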
This codebase only supports DeepLab v2 training, which freezes batch normalization layers, although the v3/v3+ protocols require training them. If you also want to train those parameters on multiple GPUs in your projects, please install the extra library below.
pip install torch-encoding
Batch normalization layers in a model are automatically switched in `libs/models/resnet.py`.
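import torch.nn as nn

try:
    # Use synchronized batch norm if torch-encoding is installed
    from encoding.nn import SyncBatchNorm
    _BATCH_NORM = SyncBatchNorm
except Exception:
    # Otherwise fall back to the standard PyTorch layer
    _BATCH_NORM = nn.BatchNorm2d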
[1] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE TPAMI, 2018. Project / Code / arXiv paper
[2] H. Caesar, J. Uijlings, V. Ferrari. COCO-Stuff: Thing and Stuff Classes in Context. In CVPR, 2018. Project / arXiv paper
[3] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, A. Zisserman. The PASCAL Visual Object Classes (VOC) Challenge. IJCV, 2010. Project / Paper