Vision Transformer Detection

Introduction

Object detection is a central downstream task used to
test if pre-trained network parameters confer benefits, such
as improved accuracy or training speed. The complexity
of object detection methods can make this benchmarking
non-trivial when new architectures, such as Vision Transformer (ViT) models, arrive.

Model Zoo

Model	Backbone	Pretrained	Scheduler	Images/GPU	Box AP	Mask AP	Config	Download
Cascade RCNN	ViT-base	CAE	1x	1	52.7	-	config	model
Cascade RCNN	ViT-large	CAE	1x	1	55.7	-	config	model
PP-YOLOE	ViT-base	CAE	36e	2	52.2	-	config	model
Mask RCNN	ViT-base	CAE	1x	1	50.6	44.9	config	model
Mask RCNN	ViT-large	CAE	1x	1	54.2	47.4	config	model

Notes:

Models are trained on the COCO train2017 dataset and evaluated on val2017; results are reported as `mAP(IoU=0.5:0.95)`.
Base models are trained on 8x 32G V100 GPUs, large models on 8x 80G A100 GPUs.
The Cascade RCNN experiments are based on PaddlePaddle 2.2.2.
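
To make the `mAP(IoU=0.5:0.95)` notation concrete, the sketch below shows the ten IoU thresholds (0.50 to 0.95 in steps of 0.05) that the COCO-style metric averages over, and at which of them a single predicted box would count as a match. It is only an illustration, not the evaluator used to produce the numbers above; the helper names `box_iou` and `matched_thresholds` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The thresholds implied by "IoU=0.5:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, gt):
    """Return the thresholds at which `pred` counts as a match for `gt`."""
    iou = box_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    pred = (2, 0, 12, 10)  # shifted 2 units to the right
    print(box_iou(pred, gt))
    print(matched_thresholds(pred, gt))
```

A full evaluation additionally ranks predictions by score, computes average precision per class at each threshold, and then averages over thresholds and classes; this sketch only covers the threshold-matching step.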

Citations

@article{chen2022context,
  title={Context autoencoder for self-supervised representation learning},
  author={Chen, Xiaokang and Ding, Mingyu and Wang, Xiaodi and Xin, Ying and Mo, Shentong and Wang, Yunhao and Han, Shumin and Luo, Ping and Zeng, Gang and Wang, Jingdong},
  journal={arXiv preprint arXiv:2202.03026},
  year={2022}
}

@article{DBLP:journals/corr/abs-2111-11429,
  author    = {Yanghao Li and
               Saining Xie and
               Xinlei Chen and
               Piotr Doll{\'{a}}r and
               Kaiming He and
               Ross B. Girshick},
  title     = {Benchmarking Detection Transfer Learning with Vision Transformers},
  journal   = {CoRR},
  volume    = {abs/2111.11429},
  year      = {2021},
  url       = {https://arxiv.org/abs/2111.11429},
  eprinttype = {arXiv},
  eprint    = {2111.11429},
  timestamp = {Fri, 26 Nov 2021 13:48:43 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2111-11429.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@article{Cai_2019,
   title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
   ISSN={1939-3539},
   url={http://dx.doi.org/10.1109/tpami.2019.2956516},
   DOI={10.1109/tpami.2019.2956516},
   journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
   publisher={Institute of Electrical and Electronics Engineers (IEEE)},
   author={Cai, Zhaowei and Vasconcelos, Nuno},
   year={2019},
   pages={1--1}
}
