NTS-Net (Navigator-Teacher-Scrutinizer Network) consists of a Navigator agent, a Teacher agent and a Scrutinizer agent. In consideration of the intrinsic consistency between the informativeness of a region and its probability of belonging to the ground-truth class, NTS-Net designs a novel training paradigm, which enables the Navigator to detect the most informative regions under the guidance of the Teacher. After that, the Scrutinizer scrutinizes the regions proposed by the Navigator and makes predictions.
Paper: Z. Yang, T. Luo, D. Wang, Z. Hu, J. Gao, and L. Wang, "Learning to Navigate for Fine-Grained Classification," in Proceedings of the European Conference on Computer Vision (ECCV), 2018.
NTS-Net consists of a Navigator agent, a Teacher agent and a Scrutinizer agent. The Navigator navigates the model to focus on the most informative regions: for each region in the image, the Navigator predicts how informative the region is, and the predictions are used to propose the most informative regions. The Teacher evaluates the regions proposed by the Navigator and provides feedback: for each proposed region, the Teacher evaluates the probability that it belongs to the ground-truth class; these confidence evaluations guide the Navigator to propose more informative regions via a novel ordering-consistent loss function. The Scrutinizer scrutinizes the regions proposed by the Navigator and makes fine-grained classifications: each proposed region is enlarged to the same size and the Scrutinizer extracts features from it; the features of the regions and of the whole image are jointly processed to make the fine-grained classification.
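The ordering-consistent loss can be read as a pairwise ranking constraint: whenever the Teacher ranks region i above region j, the Navigator's score for i should exceed its score for j. Below is a minimal plain-Python sketch of that idea (the function name, margin value and O(n²) pair loop are illustrative; the repository's actual implementation in src/network.py may differ):

```python
def ranking_loss(nav_scores, teacher_confidences, margin=1.0):
    """Pairwise hinge loss that pushes Navigator scores to share the
    ordering of Teacher confidences. A sketch of the paper's idea,
    not the repository's exact implementation."""
    n = len(nav_scores)
    loss, num_pairs = 0.0, 0
    for i in range(n):
        for j in range(n):
            if teacher_confidences[i] > teacher_confidences[j]:
                # Region i is more likely the ground-truth class, so its
                # Navigator score should exceed region j's by `margin`.
                loss += max(0.0, margin - (nav_scores[i] - nav_scores[j]))
                num_pairs += 1
    return loss / max(num_pairs, 1)

# Scores ordered like the confidences give zero loss; inverted order does not.
print(ranking_loss([2.0, 1.0, 0.0], [0.9, 0.5, 0.1]))  # 0.0
print(ranking_loss([0.0, 1.0, 2.0], [0.9, 0.5, 0.1]))  # > 0
```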
Note that you can run the scripts with the dataset mentioned in the original paper or with one widely used in this domain/network architecture. In the following sections, we introduce how to run the scripts with the dataset below.
Dataset used: Caltech-UCSD Birds-200-2011
Please download the dataset archive CUB_200_2011.tgz and extract it, then put all training images into a directory named "train" and all testing images into a directory named "test" (a helper sketch for this split is shown after the directory structure below).
The directory structure is as follows:
.
└─cub_200_2011
├─train
└─test
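As a hedged convenience, the snippet below performs this split assuming the standard CUB_200_2011 archive layout (an images/ directory plus images.txt and train_test_split.txt); the source and target paths are illustrative.

```python
import os
import shutil

# Split the standard CUB_200_2011 archive into flat train/ and test/
# directories, following images.txt and train_test_split.txt.
root = "CUB_200_2011"  # extracted archive
with open(os.path.join(root, "images.txt")) as f:
    id_to_path = dict(line.split() for line in f)      # image id -> relative path
with open(os.path.join(root, "train_test_split.txt")) as f:
    id_to_is_train = dict(line.split() for line in f)  # image id -> "1"/"0"

for split in ("train", "test"):
    os.makedirs(os.path.join("cub_200_2011", split), exist_ok=True)

for img_id, rel_path in id_to_path.items():
    split = "train" if id_to_is_train[img_id] == "1" else "test"
    shutil.copy(os.path.join(root, "images", rel_path),
                os.path.join("cub_200_2011", split, os.path.basename(rel_path)))
```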
.
└─ntsnet
  ├─README.md                   # README
  ├─scripts                     # shell scripts
  │ ├─run_standalone_train.sh   # training in standalone mode (1 pc)
  │ ├─run_distribute_train.sh   # training in parallel mode (8 pcs)
  │ └─run_eval.sh               # evaluation
  ├─src
  │ ├─config.py                 # network configuration
  │ ├─dataset.py                # dataset utils
  │ ├─lr_generator.py           # learning rate generator
  │ ├─network.py                # network definition for ntsnet
  │ └─resnet.py                 # resnet backbone
  ├─mindspore_hub_conf.py       # mindspore hub interface
  ├─export.py                   # script to export MINDIR model
  ├─eval.py                     # evaluation script
  └─train.py                    # training script
# distributed training
Usage: bash run_distribute_train.sh [RANK_TABLE_FILE] [DATA_URL] [TRAIN_URL]
# standalone training
Usage: bash run_standalone_train.sh [DATA_URL] [TRAIN_URL]
"img_width": 448, # width of the input images
"img_height": 448, # height of the input images
# anchor
"size": [48, 96, 192], #anchor base size
"scale": [1, 2 ** (1. / 3.), 2 ** (2. / 3.)], #anchor base scale
"aspect_ratio": [0.667, 1, 1.5], #anchor base aspect_ratio
"stride": [32, 64, 128], #anchor base stride
# resnet
"resnet_block": [3, 4, 6, 3], # block number in each layer
"resnet_in_channels": [64, 256, 512, 1024], # in channel size for each layer
"resnet_out_channels": [256, 512, 1024, 2048], # out channel size for each layer
# LR
"base_lr": 0.001, # base learning rate
"base_step": 58633, # bsae step in lr generator
"total_epoch": 200, # total epoch in lr generator
"warmup_step": 4, # warmp up step in lr generator
"sgd_momentum": 0.9, # momentum in optimizer
# train
"batch_size": 8,
"weight_decay": 1e-4,
"epoch_size": 200, # total epoch size
"save_checkpoint": True, # whether save checkpoint or not
"save_checkpoint_epochs": 1, # save checkpoint interval
"num_classes": 200
Parameters for both training and evaluation can be set in config.py, including the learning rate, output filename and network hyperparameters. See the dataset section above for more information about the dataset.

Run run_standalone_train.sh for non-distributed training of the NTS-Net model.

# standalone training
bash run_standalone_train.sh [DATA_URL] [TRAIN_URL]

Run run_distribute_train.sh for distributed training of the NTS-Net model.

# distributed training
bash run_distribute_train.sh [RANK_TABLE_FILE] [DATA_URL] [TRAIN_URL]
The training result will be stored in the train_url path. You can find the checkpoint file together with results like the following in loss.log.
# distribute training result(8p)
epoch: 1 step: 750 ,loss: 30.88018
epoch: 2 step: 750 ,loss: 26.73352
epoch: 3 step: 750 ,loss: 22.76208
epoch: 4 step: 750 ,loss: 20.52259
epoch: 5 step: 750 ,loss: 19.34843
epoch: 6 step: 750 ,loss: 17.74093
Run run_eval.sh for evaluation.

# infer
bash run_eval.sh [DATA_URL] [TRAIN_URL] [CKPT_FILENAME]
The inference result will be stored in the train_url path, where you can find results like the following in eval.log.
ckpt file name: ntsnet-112_750.ckpt
accuracy: 0.876
python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format [EXPORT_FORMAT]

EXPORT_FORMAT should be "MINDIR".
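For reference, here is a minimal sketch of what an export step like this typically does in MindSpore. NtsNet is a hypothetical stand-in for the network class defined in src/network.py (the actual class name may differ), and the 448x448 input shape follows config.py.

```python
import numpy as np
from mindspore import Tensor, context, export, load_checkpoint, load_param_into_net
from src.network import NtsNet  # hypothetical class name; see src/network.py

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")

# Rebuild the network and restore the trained weights.
net = NtsNet()
load_param_into_net(net, load_checkpoint("ntsnet-112_750.ckpt"))

# Trace the graph with a dummy 448x448 input and write a MINDIR file.
dummy_input = Tensor(np.zeros([1, 3, 448, 448], np.float32))
export(net, dummy_input, file_name="ntsnet", file_format="MINDIR")
```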
| Parameters | Ascend |
| --- | --- |
| Model Version | V1 |
| Resource | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB |
| Uploaded Date | 16/04/2021 (day/month/year) |
| MindSpore Version | 1.1.1 |
| Dataset | CUB_200_2011 |
| Training Parameters | epoch=200, batch_size=8 |
| Optimizer | SGD |
| Loss Function | Softmax Cross Entropy |
| Output | predicted class |
| Loss | 10.9852 |
| Speed | 1 pc: 130 ms/step; 8 pcs: 138 ms/step |
| Total Time | 8 pcs: 5.93 hours |
| Parameters (M) | 87.6 |
| Checkpoint for Fine Tuning | 333.07 MB (.ckpt file) |
| Scripts | ntsnet script |
We set a random seed in train.py and eval.py for weight initialization.
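A minimal sketch of such a fixed-seed setup (the seed value here is illustrative, not necessarily the one used by the scripts):

```python
import numpy as np
from mindspore.common import set_seed

set_seed(1)        # fix MindSpore's global seed for weight initialization
np.random.seed(1)  # fix numpy-based shuffling/augmentation randomness
```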
Please check the official homepage.
NTS-Net localizes the key regions for fine-grained image classification through a multi-agent cooperation mechanism: it derives region informativeness through an FPN, and a ranking loss over informativeness and confidence guides the localization of the key anchors.