Feature Selective Anchor-Free Module for Single-Shot Object Detection
FSAF is an anchor-free method published at CVPR 2019 (https://arxiv.org/pdf/1903.00621.pdf).
It is in fact equivalent to an anchor-based method with only one anchor at each feature map position in each FPN level, and this is how we implemented it.
Only the anchor-free branch is released, because it is more compatible with the current framework and has a lower computational cost.
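To make the "one anchor per position per FPN level" view concrete, here is a minimal NumPy sketch. It is not the released implementation: the function name `single_anchor_grid`, the anchor size (`scale * stride`), the default `scale=4.0`, and the strides in the usage note are all assumptions chosen for illustration.

```python
import numpy as np


def single_anchor_grid(feat_h, feat_w, stride, scale=4.0):
    """Return one square anchor per feature map position as (x1, y1, x2, y2).

    Hypothetical helper: the anchor side length is assumed to be
    scale * stride, and anchors are centred on each cell of the level.
    """
    size = scale * stride
    # Cell centres in input-image coordinates.
    cx = (np.arange(feat_w) + 0.5) * stride
    cy = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)  # both have shape (feat_h, feat_w)
    centres = np.stack([cx, cy], axis=-1).reshape(-1, 2)
    half = size / 2.0
    # Exactly feat_h * feat_w anchors: one per position.
    return np.concatenate([centres - half, centres + half], axis=1)
```

Calling this once per level, for example with assumed strides of 8, 16, 32, 64 and 128, yields exactly one box per location on every level, which is the sense in which the anchor-free branch can be expressed with anchor-based machinery.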
In the original paper, feature map locations between the central 0.2 and 0.5 regions of a ground-truth (gt) box are tagged as ignored. However,
we found empirically that a hard threshold (0.2-0.2), which leaves no ignored region, gives a further performance gain (see the table below).
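The difference between the two ignore ranges can be sketched as follows. This is an illustrative NumPy example, not the released code: the helper names `shrink_box` and `label_locations` are hypothetical, the box is assumed to be given in feature-map coordinates, and each scale is assumed to shrink the box width and height around its centre.

```python
import numpy as np


def shrink_box(box, scale):
    """Shrink an (x1, y1, x2, y2) box around its centre by `scale`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


def label_locations(height, width, box, pos_scale=0.2, ignore_scale=0.5):
    """Label each location: 1 = positive, -1 = ignored, 0 = negative."""
    # Use cell centres as the coordinates of each feature map location.
    ys, xs = np.meshgrid(
        np.arange(height) + 0.5, np.arange(width) + 0.5, indexing="ij"
    )

    def inside(b):
        return (xs >= b[0]) & (xs < b[2]) & (ys >= b[1]) & (ys < b[3])

    labels = np.zeros((height, width), dtype=np.int64)
    # Mark the larger region as ignored first, then overwrite its core
    # with positives, so only the band between the two stays ignored.
    labels[inside(shrink_box(box, ignore_scale))] = -1
    labels[inside(shrink_box(box, pos_scale))] = 1
    return labels
```

With `pos_scale=0.2, ignore_scale=0.5` a band of ignored locations surrounds the positives. With `pos_scale=0.2, ignore_scale=0.2` the band is empty, so every location outside the central region counts as a negative.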
Main Results
Results on R50/R101/X101-FPN
| Backbone | ignore range | ms-train | Lr schd | Train Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
|---|---|---|---|---|---|---|---|---|
| R-50 | 0.2-0.5 | N | 1x | 3.15 | 0.43 | 12.3 | 36.0 (35.9) | model / log |
| R-50 | 0.2-0.2 | N | 1x | 3.15 | 0.43 | 13.0 | 37.4 | model / log |
| R-101 | 0.2-0.2 | N | 1x | 5.08 | 0.58 | 10.8 | 39.3 (37.9) | model / log |
| X-101 | 0.2-0.2 | N | 1x | 9.38 | 1.23 | 5.6 | 42.4 (41.0) | model / log |
Notes:
- 1x means the model is trained for 12 epochs.
- AP values in the brackets represent those reported in the original paper.
- All results are obtained with a single model and single-scale test.
- The X-101 backbone is ResNeXt-101-64x4d.
- All pretrained backbones use the PyTorch style.
- All models are trained on 8 Titan XP GPUs and tested on a single GPU.
Citations
BibTeX reference is as follows.
@inproceedings{zhu2019feature,
title={Feature Selective Anchor-Free Module for Single-Shot Object Detection},
author={Zhu, Chenchen and He, Yihui and Savvides, Marios},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={840--849},
year={2019}
}