The main code structure of this project comes from Huawei's DeepLabV3 implementation; we modified parts of the code for the competition. For the detailed procedure, see the document 《DeepLabV3网络基线版本自验报告》 (DeepLabV3 Network Baseline Version Self-Verification Report).
DeepLab is a series of image semantic segmentation models, and DeepLabV3 improves significantly over previous versions. It has two key points: multi-grid atrous convolution lets it better segment objects at multiple scales, and an augmented ASPP module adds image-level features to capture long-range information.
This repository provides scripts and a recipe to train the DeepLabV3 model and achieve state-of-the-art performance.
Refer to this paper for network details.
Chen L C, Papandreou G, Schroff F, et al. Rethinking atrous convolution for semantic image segmentation[J]. arXiv preprint arXiv:1706.05587, 2017.
ResNet-101 is used as the backbone, with atrous convolution for dense feature extraction.
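To make the idea of atrous (dilated) convolution concrete, here is a minimal 1-D sketch in plain Python. It is not the network code from this repository; the function name `atrous_conv1d` and the toy signal are illustrative assumptions. It only shows how a larger rate widens the receptive field without adding kernel weights.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid-mode 1-D atrous (dilated) correlation.

    Kernel taps are applied `rate` samples apart, so a kernel of size k
    covers (k - 1) * rate + 1 input samples with the same k weights.
    """
    span = (len(kernel) - 1) * rate + 1  # input samples seen by one output
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + j * rate] for j, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]
print(atrous_conv1d(signal, kernel, rate=1))  # [-2, -2, -2]
print(atrous_conv1d(signal, kernel, rate=2))  # [-4]
```

With rate=2 the same three weights span five input samples, which is the property the backbone relies on to keep feature maps dense.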
Pascal VOC datasets and Semantic Boundaries Dataset
Download segmentation dataset.
Prepare the training data list file. The list file saves the relative path to image and annotation pairs. Lines are like:
JPEGImages/00001.jpg SegmentationClassGray/00001.png
JPEGImages/00002.jpg SegmentationClassGray/00002.png
JPEGImages/00003.jpg SegmentationClassGray/00003.png
JPEGImages/00004.jpg SegmentationClassGray/00004.png
......
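The list file can be generated with a few lines of Python. The sketch below assumes the directory names shown in the example lines (`JPEGImages`, `SegmentationClassGray`) and that each annotation shares the image's base name; the helper `write_data_list` is hypothetical and not part of this repository.

```python
from pathlib import Path


def write_data_list(data_root, lst_path,
                    img_dir="JPEGImages", ann_dir="SegmentationClassGray"):
    """Write 'image annotation' pairs, relative to data_root, one per line."""
    root = Path(data_root)
    lines = []
    for img in sorted((root / img_dir).glob("*.jpg")):
        ann = root / ann_dir / (img.stem + ".png")
        if ann.exists():  # skip images that have no annotation
            lines.append(f"{img_dir}/{img.name} {ann_dir}/{ann.name}")
    Path(lst_path).write_text("".join(line + "\n" for line in lines))
    return lines
```

For example, `write_data_list("/PATH/TO/DATA", "/PATH/TO/DATA_lst.txt")` would produce a file that can be passed as `--data_lst`.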
Configure and run build_data.sh to convert dataset to mindrecords. Arguments in scripts/build_data.sh:
--data_root root path of training data
--data_lst list of training data(prepared above)
--dst_path where mindrecords are saved
--num_shards number of shards of the mindrecords
--shuffle shuffle or not
The mixed precision training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data types, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’.
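The s8 training scripts pass `--loss_scale=2048`. The sketch below illustrates the general idea behind loss scaling in mixed precision, not MindSpore's internal implementation: a very small gradient underflows to zero in FP16, but survives if it is scaled up before the cast and scaled back down in higher precision. The gradient value `1e-8` is a made-up example.

```python
import struct


def to_fp16(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


LOSS_SCALE = 2048.0  # same value as --loss_scale in the s8 scripts
grad = 1e-8          # hypothetical tiny gradient

# Cast directly: the value is below the smallest FP16 subnormal and is lost.
unscaled = to_fp16(grad)

# Scale, cast to FP16, then undo the scale in higher precision.
recovered = to_fp16(grad * LOSS_SCALE) / LOSS_SCALE

print(unscaled)   # 0.0
print(recovered)  # close to 1e-8
```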
Hardware(Ascend)
Framework
For more information, please check the resources below:
Install python packages in requirements.txt
Generate config json file for 8pcs training
# From the root of this project
cd src/tools/
python3 get_multicards_json.py 10.111.*.*
# 10.111.*.* is the computer's ip address.
After installing MindSpore via the official website, you can start training and evaluation as follows:
Following the original DeepLabV3 paper, we reproduce two training experiments on the vocaug (also known as trainaug) dataset and evaluate on the VOC val dataset.
For single-device training, configure the parameters and run the training script:
run_standalone_train.sh
For 8-device training, the training scripts are:
run_distribute_train_s16_r1.sh
run_distribute_train_s8_r1.sh
run_distribute_train_s8_r2.sh
For evaluation, the evaluation scripts are:
run_eval_s16.sh
run_eval_s8.sh
run_eval_s8_multiscale.sh
run_eval_s8_multiscale_flip.sh
.
└──deeplabv3
  ├── README.md
  ├── scripts
    ├── build_data.sh                       # convert raw data to mindrecord dataset
    ├── run_distribute_train_s16_r1.sh      # launch ascend distributed training(8 pcs) with vocaug dataset in s16 structure
    ├── run_distribute_train_s8_r1.sh       # launch ascend distributed training(8 pcs) with vocaug dataset in s8 structure
    ├── run_distribute_train_s8_r2.sh       # launch ascend distributed training(8 pcs) with voctrain dataset in s8 structure
    ├── run_eval_s16.sh                     # launch ascend evaluation in s16 structure
    ├── run_eval_s8.sh                      # launch ascend evaluation in s8 structure
    ├── run_eval_s8_multiscale.sh           # launch ascend evaluation with multiscale in s8 structure
    ├── run_eval_s8_multiscale_flip.sh      # launch ascend evaluation with multiscale and flip in s8 structure
    ├── run_standalone_train.sh             # launch ascend standalone training(1 pc)
  ├── src
    ├── data
      ├── dataset.py                        # mindrecord data generator
      ├── build_seg_data.py                 # data preprocessing
    ├── loss
      ├── loss.py                           # loss definition for deeplabv3
    ├── nets
      ├── deeplab_v3
        ├── deeplab_v3.py                   # DeepLabV3 network structure
      ├── net_factory.py                    # set S16 and S8 structures
    ├── tools
      ├── get_multicards_json.py            # get rank table file
    └── utils
      └── learning_rates.py                 # generate learning rate
  ├── eval.py                               # eval net
  ├── train.py                              # train net
  └── requirements.txt                      # requirements file
Default configuration
"data_file":"/PATH/TO/MINDRECORD_NAME" # dataset path
"train_epochs":300 # total epochs
"batch_size":32 # batch size of input tensor
"crop_size":513 # crop size
"base_lr":0.08 # initial learning rate
"lr_type":cos # decay mode for generating learning rate
"min_scale":0.5 # minimum scale of data augmentation
"max_scale":2.0 # maximum scale of data augmentation
"ignore_label":255 # ignore label
"num_classes":21 # number of classes
"model":deeplab_v3_s16 # select model
"ckpt_pre_trained":"/PATH/TO/PRETRAIN_MODEL" # path to load pretrain checkpoint
"is_distributed": # distributed training, it will be True if the parameter is set
"save_steps":410 # steps interval for saving
"keep_checkpoint_max":200 # max checkpoint for saving
Following the original DeepLabV3 paper, we reproduce two training experiments on the vocaug (also known as trainaug) dataset and evaluate on the VOC val dataset.
For single-device training, configure the parameters; the training script is as follows:
# run_standalone_train.sh
python ${train_code_path}/train.py --data_file=/PATH/TO/MINDRECORD_NAME \
--train_dir=${train_path}/ckpt \
--train_epochs=200 \
--batch_size=32 \
--crop_size=513 \
--base_lr=0.015 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=deeplab_v3_s16 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--save_steps=1500 \
--keep_checkpoint_max=200 >log 2>&1 &
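The scripts select a cosine schedule with `--lr_type=cos`. As a rough sketch of what a cosine decay looks like, assuming the common half-cosine form without warmup (the actual `src/utils/learning_rates.py` may differ), with a hypothetical total step count:

```python
import math


def cosine_lr(base_lr, total_steps):
    """Half-cosine decay from base_lr at step 0 towards 0 at total_steps."""
    return [
        base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
        for step in range(total_steps)
    ]


# base_lr matches the standalone script; 1000 steps is an illustrative value.
lrs = cosine_lr(base_lr=0.015, total_steps=1000)
print(lrs[0], lrs[500], lrs[-1])
```

The rate starts at `base_lr`, passes through half of it at the midpoint, and approaches zero at the end of training.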
For 8-device training, the training scripts are as follows:
# run_distribute_train_s16_r1.sh
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=300 \
--batch_size=32 \
--crop_size=513 \
--base_lr=0.08 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=deeplab_v3_s16 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=410 \
--keep_checkpoint_max=200 >log 2>&1 &
done
# run_distribute_train_s8_r1.sh
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=800 \
--batch_size=16 \
--crop_size=513 \
--base_lr=0.02 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=deeplab_v3_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=820 \
--keep_checkpoint_max=200 >log 2>&1 &
done
# run_distribute_train_s8_r2.sh
for((i=0;i<=$RANK_SIZE-1;i++));
do
export RANK_ID=${i}
export DEVICE_ID=$((i + RANK_START_ID))
echo 'start rank='${i}', device id='${DEVICE_ID}'...'
mkdir ${train_path}/device${DEVICE_ID}
cd ${train_path}/device${DEVICE_ID} || exit
python ${train_code_path}/train.py --train_dir=${train_path}/ckpt \
--data_file=/PATH/TO/MINDRECORD_NAME \
--train_epochs=300 \
--batch_size=16 \
--crop_size=513 \
--base_lr=0.008 \
--lr_type=cos \
--min_scale=0.5 \
--max_scale=2.0 \
--ignore_label=255 \
--num_classes=21 \
--model=deeplab_v3_s8 \
--loss_scale=2048 \
--ckpt_pre_trained=/PATH/TO/PRETRAIN_MODEL \
--is_distributed \
--save_steps=110 \
--keep_checkpoint_max=200 >log 2>&1 &
done
# distribute training result(8p), s16_r1
epoch: 1 step: 41, loss is 0.8319108
Epoch time: 213856.477, per step time: 5216.012
epoch: 2 step: 41, loss is 0.46052963
Epoch time: 21233.183, per step time: 517.883
epoch: 3 step: 41, loss is 0.45012417
Epoch time: 21231.951, per step time: 517.852
epoch: 4 step: 41, loss is 0.30687785
Epoch time: 21199.911, per step time: 517.071
epoch: 5 step: 41, loss is 0.22769661
Epoch time: 21240.281, per step time: 518.056
epoch: 6 step: 41, loss is 0.25470978
...
# distribute training result(8p), s8_r1
epoch: 1 step: 82, loss is 0.024167
Epoch time: 322663.456, per step time: 3934.920
epoch: 2 step: 82, loss is 0.019832281
Epoch time: 43107.238, per step time: 525.698
epoch: 3 step: 82, loss is 0.021008959
Epoch time: 43109.519, per step time: 525.726
epoch: 4 step: 82, loss is 0.01912349
Epoch time: 43177.287, per step time: 526.552
epoch: 5 step: 82, loss is 0.022886964
Epoch time: 43095.915, per step time: 525.560
epoch: 6 step: 82, loss is 0.018708453
Epoch time: 43107.458, per step time: 525.701
...
# distribute training result(8p), s8_r2
epoch: 1 step: 11, loss is 0.00554624
Epoch time: 199412.913, per step time: 18128.447
epoch: 2 step: 11, loss is 0.007181881
Epoch time: 6119.375, per step time: 556.307
epoch: 3 step: 11, loss is 0.004980865
Epoch time: 5996.978, per step time: 545.180
epoch: 4 step: 11, loss is 0.0047651967
Epoch time: 5987.412, per step time: 544.310
epoch: 5 step: 11, loss is 0.006262637
Epoch time: 5956.682, per step time: 541.517
epoch: 6 step: 11, loss is 0.0060750707
Epoch time: 5962.164, per step time: 542.015
...
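The step counts per epoch in the three logs above (41, 82 and 11) can be checked with simple arithmetic. The sketch below assumes that the last incomplete batch is dropped and that the datasets contain 10582 images (trainaug) and 1464 images (VOC2012 train); these sizes are the commonly quoted ones and are assumptions here, not values read from this repository.

```python
def steps_per_epoch(num_samples, batch_size, num_devices):
    """Steps per device per epoch when the incomplete last batch is dropped."""
    return num_samples // (batch_size * num_devices)


VOCAUG_SIZE = 10582    # assumed size of trainaug
VOCTRAIN_SIZE = 1464   # assumed size of VOC2012 train

s16_r1 = steps_per_epoch(VOCAUG_SIZE, batch_size=32, num_devices=8)
s8_r1 = steps_per_epoch(VOCAUG_SIZE, batch_size=16, num_devices=8)
s8_r2 = steps_per_epoch(VOCTRAIN_SIZE, batch_size=16, num_devices=8)
print(s16_r1, s8_r1, s8_r2)  # 41 82 11

# The --save_steps values in the scripts then correspond to 10 epochs each.
print(410 // s16_r1, 820 // s8_r1, 110 // s8_r2)  # 10 10 10
```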
Set the checkpoint with --ckpt_path and configure the dataset path, then run the script. The mIOU is written to eval_path/eval_log.
./run_eval_s16.sh # test s16
./run_eval_s8.sh # test s8
./run_eval_s8_multiscale.sh # test s8 + multiscale
./run_eval_s8_multiscale_flip.sh # test s8 + multiscale + flip
An example evaluation script is as follows:
python ${train_code_path}/eval.py --data_root=/PATH/TO/DATA \
--data_lst=/PATH/TO/DATA_lst.txt \
--batch_size=16 \
--crop_size=513 \
--ignore_label=255 \
--num_classes=21 \
--model=deeplab_v3_s8 \
--scales=0.5 \
--scales=0.75 \
--scales=1.0 \
--scales=1.25 \
--scales=1.75 \
--flip \
--freeze_bn \
--ckpt_path=/PATH/TO/PRETRAIN_MODEL >${eval_path}/eval_log 2>&1 &
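For readers unfamiliar with the metric, the following is a minimal sketch of how mean IoU can be computed from a confusion matrix while skipping pixels marked with the ignore label. It is not the code in `eval.py`; the function `mean_iou` and the choice to average only over classes that actually occur are assumptions made for illustration.

```python
def mean_iou(preds, labels, num_classes, ignore_label=255):
    """Mean intersection-over-union over flat lists of class ids."""
    # conf[t][p] counts pixels of true class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, labels):
        if t == ignore_label:
            continue  # ignored pixels do not contribute
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom:  # skip classes absent from both prediction and label
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


preds = [0, 0, 1, 1, 1]
labels = [0, 1, 1, 1, 255]
print(mean_iou(preds, labels, num_classes=2))  # (1/2 + 2/3) / 2
```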
Our results were obtained by running the applicable training script. To achieve the same results, follow the steps in the Quick Start Guide.
Network | OS=16 | OS=8 | MS | Flip | mIOU | mIOU in paper |
---|---|---|---|---|---|---|
deeplab_v3 | √ | | | | 77.37 | 77.21 |
deeplab_v3 | | √ | | | 78.84 | 78.51 |
deeplab_v3 | | √ | √ | | 79.70 | 79.45 |
deeplab_v3 | | √ | √ | √ | 79.89 | 79.77 |
Note: Here OS is the output stride, and MS is multiscale.
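The output stride is the ratio between the input resolution and the resolution of the final backbone feature map. As a small worked example, assuming "same"-style padding so that each stride-2 stage rounds the size up (an assumption about the padding, not something stated in this README), a 513 x 513 crop gives:

```python
import math


def feature_map_size(crop_size, output_stride):
    """Spatial size of the backbone output under 'same'-style padding."""
    return math.ceil(crop_size / output_stride)


print(feature_map_size(513, 16))  # 33
print(feature_map_size(513, 8))   # 65
```

The OS=8 feature map has roughly four times as many positions as the OS=16 one, which is consistent with the s8 runs using a smaller batch size.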
Parameters | Ascend 910 |
---|---|
Model Version | DeepLabV3 |
Resource | Ascend 910 |
Uploaded Date | 09/04/2020 (month/day/year) |
MindSpore Version | 0.7.0-alpha |
Dataset | PASCAL VOC2012 + SBD |
Training Parameters | epoch = 300, batch_size = 32 (s16_r1); epoch = 800, batch_size = 16 (s8_r1); epoch = 300, batch_size = 16 (s8_r2) |
Optimizer | Momentum |
Loss Function | Softmax Cross Entropy |
Outputs | probability |
Loss | 0.0065883575 |
Speed | 60 ms/step (1pc, s16); 480 ms/step (8pcs, s16); 244 ms/step (8pcs, s8) |
Total time | 8pcs: 706 mins |
Parameters (M) | 58.2 |
Checkpoint for Fine tuning | 443M (.ckpt file) |
Model for inference | 223M (.air file) |
Scripts | Link |
Parameters | Ascend |
---|---|
Model Version | DeepLabV3 V1 |
Resource | Ascend 910 |
Uploaded Date | 09/04/2020 (month/day/year) |
MindSpore Version | 0.7.0-alpha |
Dataset | VOC datasets |
batch_size | 32 (s16); 16 (s8) |
outputs | probability |
Accuracy | 8pcs: s16: 77.37%; s8: 78.84%; s8_multiscale: 79.70%; s8_Flip: 79.89% |
Model for inference | 443M (.ckpt file) |
In dataset.py, we set the seed inside the "create_dataset" function. We also set a random seed in train.py.
Please check the official homepage.