NTS-Net, short for Navigator-Teacher-Scrutinizer Network, consists of a Navigator agent, a Teacher agent, and a Scrutinizer agent. In consideration of the intrinsic consistency between the informativeness of a region and its probability of being the ground-truth class, NTS-Net uses a novel training paradigm that enables the Navigator to detect the most informative regions under guidance from the Teacher. The Scrutinizer then scrutinizes the regions proposed by the Navigator and makes predictions.
Paper: Z. Yang, T. Luo, D. Wang, Z. Hu, J. Gao, and L. Wang, Learning to navigate for fine-grained classification, in Proceedings of the European Conference on Computer Vision (ECCV), 2018.
NTS-Net consists of a Navigator agent, a Teacher agent and a Scrutinizer agent. The Navigator navigates the model to focus on the most informative regions: for each region in the image, the Navigator predicts how informative the region is, and these predictions are used to propose the most informative regions. The Teacher evaluates the regions proposed by the Navigator and provides feedback: for each proposed region, the Teacher evaluates its probability of belonging to the ground-truth class; these confidence evaluations guide the Navigator to propose more informative regions via a novel ordering-consistent loss function. The Scrutinizer scrutinizes the regions proposed by the Navigator and makes fine-grained classifications: each proposed region is enlarged to the same size and the Scrutinizer extracts features from it; the features of the regions and of the whole image are jointly processed to make fine-grained classifications.
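The Teacher's guidance to the Navigator is the ordering-consistent loss mentioned above: Navigator scores should rank regions in the same order as the Teacher's confidences. The actual implementation lives in src/network.py; the following is only a minimal plain-Python sketch of the idea (the function name and margin value are assumptions):

```python
def ranking_loss(nav_scores, teacher_confidences, margin=1.0):
    """Pairwise hinge loss that pushes Navigator scores toward the same
    ordering as Teacher confidences (higher confidence -> higher score)."""
    loss = 0.0
    n = len(nav_scores)
    for i in range(n):
        for j in range(n):
            if teacher_confidences[i] > teacher_confidences[j]:
                # penalize pairs whose Navigator scores are mis-ordered
                loss += max(0.0, margin - (nav_scores[i] - nav_scores[j]))
    return loss
```

When the Navigator's ordering already agrees with the Teacher's by at least the margin, the loss is zero; mis-ordered pairs contribute a positive penalty.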
Note that you can run the scripts with the dataset used in the original paper, or with other datasets widely used in this domain or with this network architecture. The following sections describe how to run the scripts using the dataset below.
Dataset used: Caltech-UCSD Birds-200-2011
Please download the dataset CUB_200_2011.tgz and extract it, then put all training images into a directory named "train" and all testing images into a directory named "test".
The directory structure is as follows:
├─resnet50.ckpt
└─cub_200_2011
├─train
└─test
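The CUB_200_2011 archive ships with images.txt and train_test_split.txt, which map each image id to its path and to the official train/test split. The repository's data_prepare.py handles this step; the following is only a minimal sketch of the idea (the function name is an assumption):

```python
import os
import shutil

def split_cub(cub_root, out_root):
    """Copy CUB images into train/ and test/ using the official split.
    Assumes cub_root contains images/, images.txt and train_test_split.txt,
    the standard files inside CUB_200_2011.tgz."""
    # images.txt lines look like: "<id> <relative/path.jpg>"
    with open(os.path.join(cub_root, "images.txt")) as f:
        id_to_path = dict(line.split() for line in f)
    # train_test_split.txt lines look like: "<id> <1 if train else 0>"
    with open(os.path.join(cub_root, "train_test_split.txt")) as f:
        id_to_train = dict(line.split() for line in f)
    for img_id, rel_path in id_to_path.items():
        subset = "train" if id_to_train[img_id] == "1" else "test"
        dst = os.path.join(out_root, subset, os.path.basename(rel_path))
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy(os.path.join(cub_root, "images", rel_path), dst)
```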
.
└─ntsnet
├─README.md # README
├─scripts # shell script
├─run_standalone_train.sh # training in standalone mode (1 pc)
├─run_distribute_train.sh # training in parallel mode (8 pcs)
└─run_eval.sh # evaluation
├─src
├─config.py # network configuration
├─dataset.py # dataset utils
├─lr_generator.py # learning rate generator
├─network.py # network define for ntsnet
└─resnet.py # ResNet backbone
├─mindspore_hub_conf.py # mindspore hub interface
├─export.py # script to export MINDIR model
├─eval.py # evaluation scripts
└─train.py # training scripts
# distributed training
Usage: bash run_distribute_train.sh [RANK_TABLE_FILE] [DATA_URL] [TRAIN_URL]
# standalone training
Usage: bash run_standalone_train.sh [DATA_URL] [TRAIN_URL]
"img_width": 448, # width of the input images
"img_height": 448, # height of the input images
# anchor
"size": [48, 96, 192], #anchor base size
"scale": [1, 2 ** (1. / 3.), 2 ** (2. / 3.)], #anchor base scale
"aspect_ratio": [0.667, 1, 1.5], #anchor base aspect_ratio
"stride": [32, 64, 128], #anchor base stride
# resnet
"resnet_block": [3, 4, 6, 3], # block number in each layer
"resnet_in_channels": [64, 256, 512, 1024], # in channel size for each layer
"resnet_out_channels": [256, 512, 1024, 2048], # out channel size for each layer
# LR
"base_lr": 0.001, # base learning rate
"base_step": 58633, # bsae step in lr generator
"total_epoch": 200, # total epoch in lr generator
"warmup_step": 4, # warmp up step in lr generator
"sgd_momentum": 0.9, # momentum in optimizer
# train
"batch_size": 8, # 16 for gpu
"weight_decay": 1e-4,
"epoch_size": 200, # total epoch size
"save_checkpoint": True, # whether save checkpoint or not
"save_checkpoint_epochs": 1, # save checkpoint interval
"num_classes": 200,
"lr_scheduler": "cosine", # lr_scheduler, support cosine or step
"optimizer": "momentum"
Parameters for both training and evaluation can be set in config.py, including the learning rate, output filename, and network hyperparameters.

Run run_standalone_train_ascend.sh for non-distributed training of the NTS-Net model on Ascend:

# standalone training on Ascend
bash run_standalone_train_ascend.sh [DATA_URL] [TRAIN_URL] [DEVICE_ID(optional)]
Run run_standalone_train_gpu.sh for non-distributed training of the NTS-Net model on GPU:

# standalone training on GPU
bash run_standalone_train_gpu.sh [DATA_URL] [TRAIN_URL] [DEVICE_ID(optional)]
Run run_distribute_train_ascend.sh for distributed training of the NTS-Net model on Ascend:

bash run_distribute_train_ascend.sh [RANK_TABLE_FILE] [DATA_URL] [TRAIN_URL]
Run run_distribute_train_gpu.sh for distributed training of the NTS-Net model on GPU:

bash run_distribute_train_gpu.sh [DEVICE_NUM] [VISIBLE_DEVICES(0,1,2,3,4,5,6,7)] [DATA_URL] [TRAIN_URL]
Training results will be stored in the train_url path. There you will find checkpoint files, together with a loss.log containing output like the following.
# distribute training result(8p)
epoch: 1 step: 750 ,loss: 30.88018
epoch: 2 step: 750 ,loss: 26.73352
epoch: 3 step: 750 ,loss: 22.76208
epoch: 4 step: 750 ,loss: 20.52259
epoch: 5 step: 750 ,loss: 19.34843
epoch: 6 step: 750 ,loss: 17.74093
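To track convergence, you can pull (epoch, loss) pairs out of loss.log. A small sketch, assuming the exact line format shown above (the helper name is an assumption):

```python
import re

def parse_loss_log(lines):
    """Extract (epoch, loss) pairs from loss.log lines shaped like
    'epoch: 1 step: 750 ,loss: 30.88018'."""
    pat = re.compile(r"epoch:\s*(\d+)\s+step:\s*\d+\s*,loss:\s*([\d.]+)")
    out = []
    for line in lines:
        m = pat.search(line)
        if m:
            out.append((int(m.group(1)), float(m.group(2))))
    return out
```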
Run run_eval_ascend.sh for evaluation on Ascend:

# infer on Ascend
sh run_eval_ascend.sh [DATA_URL] [TRAIN_URL] [CKPT_FILENAME] [DEVICE_ID(optional)]
Run run_eval_gpu.sh for evaluation on GPU:

# infer on GPU
sh run_eval_gpu.sh [DATA_URL] [TRAIN_URL] [CKPT_FILENAME] [DEVICE_ID(optional)]
Inference results will be stored in the train_url path, where you will find output like the following in eval.log.
ckpt file name: ntsnet-112_750.ckpt
accuracy: 0.876
python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format [EXPORT_FORMAT]
EXPORT_FORMAT should be "MINDIR".
Parameters | Ascend | GPU |
---|---|---|
Model Version | V1 | V1 |
Resource | Ascend 910; CPU 2.60GHz, 192 cores; Memory 755G | Tesla A100; CPU 2.3GHz, 40 cores; Memory 377G |
Uploaded Date | 16/04/2021 (day/month/year) | 05/10/2021 (day/month/year) |
MindSpore Version | 1.1.1 | 1.5.0rc1 |
Dataset | cub200-2011 | cub200-2011 |
Training Parameters | epoch=200, batch_size = 8 | epoch=200, batch_size = 16 |
Optimizer | SGD | Momentum |
Loss Function | Softmax Cross Entropy | Softmax Cross Entropy |
Output | predict class | predict class |
Loss | 10.9852 | 12.195317 |
Speed | 1pc: 130 ms/step; 8pcs: 138 ms/step | 1pc: 480 ms/step |
Total time | 8pcs: 5.93 hours | |
Accuracy | 87.6% | 87.5% |
Checkpoint for Fine tuning | 333.07M (.ckpt file) | 222.03M (.ckpt file) |
Scripts | ntsnet script | ntsnet script |
We use a fixed random seed in train.py and eval.py for weight initialization.
Please check the official homepage. First refer to the ModelZoo FAQ for answers to common questions.
Fine-grained classification is challenging because discriminative features are hard to find: it is not easy to locate the subtle features that fully characterize an object. To address this problem, we propose a novel self-supervised mechanism that localizes informative regions effectively without using bounding-box/part annotations. Our model, NTS-Net, called the Navigator-Teacher-Scrutinizer Network, consists of a Navigator agent, a Teacher agent and a Scrutinizer agent. In consideration of the intrinsic consistency between the informativeness of a region and its probability of being the ground-truth class, we design a novel training paradigm that enables the Navigator to detect the most informative regions under the guidance of the Teacher. The Scrutinizer then carefully examines the regions proposed by the Navigator and makes predictions. Our model can be viewed as a multi-agent cooperation in which the agents benefit from each other and make progress together.