Thanks to the following for hardware support:
And a big thanks to all GitHub sponsors who helped with some of my costs before I joined Hugging Face.
--amp-impl apex, bfloat16 supported via --amp-dtype bfloat16
maxxvit series, incl first ConvNeXt block based coatnext and maxxvit experiments:
coatnext_nano_rw_224 - 82.0 @ 224 (G) -- (uses ConvNeXt conv block, no BatchNorm)
maxxvit_rmlp_nano_rw_256 - 83.0 @ 256, 83.7 @ 320 (G) (uses ConvNeXt conv block, no BN)
maxvit_rmlp_small_rw_224 - 84.5 @ 224, 85.1 @ 320 (G)
maxxvit_rmlp_small_rw_256 - 84.6 @ 256, 84.9 @ 288 (G) -- could be trained better, hparams need tuning (uses ConvNeXt block, no BN)
coatnet_rmlp_2_rw_224 - 84.6 @ 224, 85 @ 320 (T)
timm docs home now exists, look for more here in the future
maxxvit series incl a pico (7.5M params, 1.9 GMACs), two tiny variants:
maxvit_rmlp_pico_rw_256 - 80.5 @ 256, 81.3 @ 320 (T)
maxvit_tiny_rw_224 - 83.5 @ 224 (G)
maxvit_rmlp_tiny_rw_256 - 84.2 @ 256, 84.8 @ 320 (T)
maxvit_rmlp_nano_rw_256 - 83.0 @ 256, 83.6 @ 320 (T)
timm original models
maxxvit.py model def, contains numerous experiments outside scope of original papers
coatnet_nano_rw_224 - 81.7 @ 224 (T)
coatnet_rmlp_nano_rw_224 - 82.0 @ 224, 82.8 @ 320 (T)
coatnet_0_rw_224 - 82.4 (T) -- NOTE timm '0' coatnets have 2 more 3rd stage blocks
coatnet_bn_0_rw_224 - 82.4 (T)
maxvit_nano_rw_256 - 82.9 @ 256 (T)
coatnet_rmlp_1_rw_224 - 83.4 @ 224, 84 @ 320 (T)
coatnet_1_rw_224 - 83.6 @ 224 (G)
(T) = TPU trained with bits_and_tpu branch training code, (G) = GPU trained
timm re-write for license purposes)
convnext_atto - 75.7 @ 224, 77.0 @ 288
convnext_atto_ols - 75.9 @ 224, 77.2 @ 288
convnext_femto - 77.5 @ 224, 78.7 @ 288
convnext_femto_ols - 77.9 @ 224, 78.9 @ 288
convnext_pico - 79.5 @ 224, 80.4 @ 288
convnext_pico_ols - 79.5 @ 224, 80.5 @ 288
convnext_nano_ols - 80.9 @ 224, 81.6 @ 288
darknetaa53 - 79.8 @ 256, 80.5 @ 288
convnext_nano - 80.8 @ 224, 81.5 @ 288
cs3sedarknet_l - 81.2 @ 256, 81.8 @ 288
cs3darknet_x - 81.8 @ 256, 82.2 @ 288
cs3sedarknet_x - 82.2 @ 256, 82.7 @ 288
cs3edgenet_x - 82.2 @ 256, 82.7 @ 288
cs3se_edgenet_x - 82.8 @ 256, 83.5 @ 320
cs3* weights above all trained on TPU w/ bits_and_tpu branch. Thanks to TRC program!
More models, more fixes
ResNet defs added by request with 1 block repeats for both basic and bottleneck (resnet10 and resnet14)
CspNet refactored with dataclass config, simplified CrossStage3 (cs3) option. These are closer to YOLO-v5+ backbone defs.
srelpos (shared relative position) models trained, and a medium w/ class token.
small model. Better than original small, but not their new USI trained weights.
resnet10t - 66.5 @ 176, 68.3 @ 224
resnet14t - 71.3 @ 176, 72.3 @ 224
resnetaa50 - 80.6 @ 224, 81.6 @ 288
darknet53 - 80.0 @ 256, 80.5 @ 288
cs3darknet_m - 77.0 @ 256, 77.6 @ 288
cs3darknet_focus_m - 76.7 @ 256, 77.3 @ 288
cs3darknet_l - 80.4 @ 256, 80.9 @ 288
cs3darknet_focus_l - 80.3 @ 256, 80.9 @ 288
vit_srelpos_small_patch16_224 - 81.1 @ 224, 82.1 @ 320
vit_srelpos_medium_patch16_224 - 82.3 @ 224, 83.1 @ 320
vit_relpos_small_patch16_cls_224 - 82.6 @ 224, 83.6 @ 320
edgnext_small_rw - 79.6 @ 224, 80.4 @ 320
cs3, darknet, and vit_*relpos weights above all trained on TPU thanks to TRC program! Rest trained on overheating GPUs.
timm datasets/readers. See (https://github.com/rwightman/pytorch-image-models/pull/1274#issuecomment-1178303103)
F.layer_norm(x.permute(0, 2, 3, 1), ...).permute(0, 3, 1, 2) via LayerNorm2d in all cases.
LayerNormExp2d in models/layers/norm.py
timm Swin-V2-CR impl, will likely do a bit more to bring parts closer to official and decide whether to merge some aspects.
vit_relpos_small_patch16_224 - 81.5 @ 224, 82.5 @ 320 -- rel pos, layer scale, no class token, avg pool
vit_relpos_medium_patch16_rpn_224 - 82.3 @ 224, 83.1 @ 320 -- rel pos + res-post-norm, no class token, avg pool
vit_relpos_medium_patch16_224 - 82.5 @ 224, 83.3 @ 320 -- rel pos, layer scale, no class token, avg pool
vit_relpos_base_patch16_gapcls_224 - 82.8 @ 224, 83.9 @ 320 -- rel pos, layer scale, class token, avg pool (by mistake)
vision_transformer_relpos.py) and Residual Post-Norm branches (from Swin-V2) (vision_transformer*.py)
vit_relpos_base_patch32_plus_rpn_256 - 79.5 @ 256, 80.6 @ 320 -- rel pos + extended width + res-post-norm, no class token, avg pool
vit_relpos_base_patch16_224 - 82.5 @ 224, 83.6 @ 320 -- rel pos, layer scale, no class token, avg pool
vit_base_patch16_rpn_224 - 82.3 @ 224 -- rel pos + res-post-norm, no class token, avg pool
How to Train Your ViT)
vit_* models support removal of class token, use of global average pool, use of fc_norm (ala beit, mae).
timm models are now officially supported in fast.ai! Just in time for the new Practical Deep Learning course. timmdocs documentation link updated to timm.fast.ai.
seresnext101d_32x8d - 83.69 @ 224, 84.35 @ 288
seresnextaa101d_32x8d (anti-aliased w/ AvgPool2d) - 83.85 @ 224, 84.57 @ 288
ParallelBlock and LayerScale option to base vit models to support model configs in Three things everyone should know about ViT
convnext_tiny_hnf (head norm first) weights trained with (close to) A2 recipe, 82.2% top-1, could do better with more epochs.
norm_norm_norm. IMPORTANT this update for a coming 0.6.x release will likely de-stabilize the master branch for a while. Branch 0.5.x or a previous 0.5.x release can be used if stability is required.
regnety_040 - 82.3 @ 224, 82.96 @ 288
regnety_064 - 83.0 @ 224, 83.65 @ 288
regnety_080 - 83.17 @ 224, 83.86 @ 288
regnetv_040 - 82.44 @ 224, 83.18 @ 288 (timm pre-act)
regnetv_064 - 83.1 @ 224, 83.71 @ 288 (timm pre-act)
regnetz_040 - 83.67 @ 256, 84.25 @ 320
regnetz_040h - 83.77 @ 256, 84.5 @ 320 (w/ extra fc in head)
resnetv2_50d_gn - 80.8 @ 224, 81.96 @ 288 (pre-act GroupNorm)
resnetv2_50d_evos - 80.77 @ 224, 82.04 @ 288 (pre-act EvoNormS)
regnetz_c16_evos - 81.9 @ 256, 82.64 @ 320 (EvoNormS)
regnetz_d8_evos - 83.42 @ 256, 84.04 @ 320 (EvoNormS)
xception41p - 82 @ 299 (timm pre-act)
xception65 - 83.17 @ 299
xception65p - 83.14 @ 299 (timm pre-act)
resnext101_64x4d - 82.46 @ 224, 83.16 @ 288
seresnext101_32x8d - 83.57 @ 224, 84.270 @ 288
resnetrs200 - 83.85 @ 256, 84.44 @ 320
forward_head(x, pre_logits=False) fn added to all models to allow separate calls of forward_features + forward_head
forward_features, for consistency with CNN models, token selection or pooling now applied in forward_head
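The split described above can be illustrated with a toy, dependency-free sketch. ToyModel and its weights are hypothetical stand-ins, not timm code; the sketch only assumes what the notes state: forward_features returns un-pooled features, and forward_head applies pooling and then the classifier, with pre_logits=True stopping before the classifier.

```python
class ToyModel:
    """Hypothetical stand-in following the forward_features / forward_head split."""

    def __init__(self, weights, bias):
        self.weights = weights  # one weight row per output class
        self.bias = bias        # one bias value per output class

    def forward_features(self, x):
        # Return un-pooled features: one feature vector per token / position.
        return [[2.0 * v for v in token] for token in x]

    def forward_head(self, feats, pre_logits=False):
        # Pooling is applied here, in the head, not in forward_features.
        n = len(feats)
        pooled = [sum(col) / n for col in zip(*feats)]
        if pre_logits:
            return pooled  # pooled features, classifier skipped
        return [
            sum(w * p for w, p in zip(row, pooled)) + b
            for row, b in zip(self.weights, self.bias)
        ]

    def forward(self, x):
        # The full forward pass is just the two stages chained together.
        return self.forward_head(self.forward_features(x))


model = ToyModel(weights=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], bias=[0.0, 0.0, 0.5])
tokens = [[1.0, 2.0], [3.0, 4.0]]                    # 2 tokens, 2 channels
feats = model.forward_features(tokens)               # still 2 tokens, un-pooled
pooled = model.forward_head(feats, pre_logits=True)  # one pooled vector
logits = model.forward_head(feats)                   # one score per class
print(feats, pooled, logits)
```

Keeping pooling in the head means the same un-pooled features can feed either the classifier or a custom downstream module.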
timm on his blog yesterday. Well worth a read. Getting Started with PyTorch Image Models (timm): A Practitioner’s Guide
norm_norm_norm branch back to master (ver 0.6.x) in next week or so.
pip install git+https://github.com/rwightman/pytorch-image-models installs!
0.5.x releases and a 0.5.x branch will remain stable with a cherry pick or two until dust clears. Recommend sticking to pypi install for a bit if you want stable.
mnasnet_small - 65.6 top-1
mobilenetv2_050 - 65.9
lcnet_100/075/050 - 72.1 / 68.8 / 63.1
semnasnet_075 - 73
fbnetv3_b/d/g - 79.1 / 79.7 / 82.0
eca_halonext26ts - 79.5 @ 256
resnet50_gn (new) - 80.1 @ 224, 81.3 @ 288
resnet50 - 80.7 @ 224, 80.9 @ 288 (trained at 176, not replacing current a1 weights as default since these don't scale as well to higher res, weights)
resnext50_32x4d - 81.1 @ 224, 82.0 @ 288
sebotnet33ts_256 (new) - 81.2 @ 224
lamhalobotnet50ts_256 - 81.5 @ 256
halonet50ts - 81.7 @ 256
halo2botnet50ts_256 - 82.0 @ 256
resnet101 - 82.0 @ 224, 82.8 @ 288
resnetv2_101 (new) - 82.1 @ 224, 83.0 @ 288
resnet152 - 82.8 @ 224, 83.5 @ 288
regnetz_d8 (new) - 83.5 @ 256, 84.0 @ 320
regnetz_e8 (new) - 84.5 @ 256, 85.0 @ 320
vit_base_patch8_224 (85.8 top-1) & in21k variant weights added thanks Martins Bruveris
timm bits branch).
data, a bit more consistency, unit tests for all!

PyTorch Image Models (timm) is a collection of image models, layers, utilities, optimizers, schedulers, data-loaders / augmentations, and reference training / validation scripts that aim to pull together a wide variety of SOTA models with ability to reproduce ImageNet training results.
The work of many others is present here. I've tried to make sure all source material is acknowledged via links to github, arxiv papers, etc in the README, documentation, and code docstrings. Please let me know if I missed anything.
All model architecture families include variants with pretrained weights. Some specific model variants have no weights; this is NOT a bug. Help training new or better weights is always appreciated. Here are some example training hparams to get you started.
A full version of the list below with source links can be found in the documentation.
Several (less common) features that I often utilize in my projects are included. Many of their additions are the reason why I maintain my own set of models, instead of using others' via PIP:
get_classifier and reset_classifier
forward_features (see documentation)
create_model(name, features_only=True, out_indices=..., output_stride=...)
out_indices creation arg specifies which feature maps to return, these indices are 0 based and generally correspond to the C(i + 1) feature level.
output_stride creation arg controls output stride of the network by using dilated convolutions. Most networks are stride 32 by default. Not all networks support this.
.feature_info member
step, cosine w/ restarts, tanh w/ restarts, plateau
rmsprop_tf adapted from PyTorch RMSProp by myself. Reproduces much improved Tensorflow RMSProp behaviour.
radam by Liyuan Liu (https://arxiv.org/abs/1908.03265)
novograd by Masashi Kimura (https://arxiv.org/abs/1905.11286)
lookahead adapted from impl by Liam (https://arxiv.org/abs/1907.08610)
fused<name> optimizers by name with NVIDIA Apex installed
adamp and sgdp by Naver ClovAI (https://arxiv.org/abs/2006.08217)
adafactor adapted from FAIRSeq impl (https://arxiv.org/abs/1804.04235)
adahessian by David Samuel (https://arxiv.org/abs/2006.00719)

Model validation results can be found in the documentation and in the results tables
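As a rough illustration of the out_indices and output_stride creation args from the feature list above, here is a minimal sketch of how 0-based indices map to C(i + 1) feature levels. It assumes a typical five-stage backbone in which level C(i + 1) reduces the input resolution by 2 ** (i + 1), and that dilation caps the reduction at the requested output stride. The helper feature_levels is hypothetical and not part of timm; for a real model, the actual channel counts and reductions should be read from the .feature_info member after creation.

```python
def feature_levels(out_indices, output_stride=32, num_levels=5):
    """Return (index, level name, nominal reduction) for each requested feature map.

    Assumption: a 5-stage backbone where level C(i + 1) downsamples the input
    by 2 ** (i + 1). Real networks can deviate from this.
    """
    if output_stride not in (8, 16, 32):
        raise ValueError("this sketch only handles output_stride of 8, 16 or 32")
    levels = []
    for i in out_indices:
        if not 0 <= i < num_levels:
            raise IndexError(f"out index {i} outside 0..{num_levels - 1}")
        nominal = 2 ** (i + 1)
        # Dilated convolutions replace striding once the requested output
        # stride is reached, so deeper levels stop shrinking spatially.
        reduction = min(nominal, output_stride)
        levels.append((i, f"C{i + 1}", reduction))
    return levels


for idx, name, reduction in feature_levels((1, 2, 3, 4), output_stride=16):
    print(f"out index {idx} -> {name}, 1/{reduction} of input resolution")
```

With the default stride of 32, index 4 lands on C5 at 1/32 resolution; lowering output_stride keeps the deeper maps larger at the cost of more compute.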
My current documentation for timm covers the basics.
Hugging Face timm docs will be the documentation focus going forward and will eventually replace the github.io docs above.
Getting Started with PyTorch Image Models (timm): A Practitioner’s Guide by Chris Hughes is an extensive blog post covering many aspects of timm in detail.
timmdocs is quickly becoming a much more comprehensive set of documentation for timm. A big thanks to Aman Arora for his efforts creating timmdocs.
paperswithcode is a good resource for browsing the models within timm.
The root folder of the repository contains reference train, validation, and inference scripts that work with the included models and other features of this repository. They are adaptable for other datasets and use cases with a little hacking. See documentation for some basics and training hparams for some train examples that produce SOTA ImageNet results.
One of the greatest assets of PyTorch is the community and their contributions. A few of my favourite resources that pair well with the models and components here are listed below.
The code here is licensed Apache 2.0. I've taken care to make sure any third party code included or adapted has compatible (permissive) licenses such as MIT, BSD, etc. I've made an effort to avoid any GPL / LGPL conflicts. That said, it is your responsibility to ensure you comply with licenses here and conditions of any dependent licenses. Where applicable, I've linked the sources/references for various components in docstrings. If you think I've missed anything please create an issue.
So far all of the pretrained weights available here are pretrained on ImageNet with a select few that have some additional pretraining (see extra note below). ImageNet was released for non-commercial research purposes only (https://image-net.org/download). It's not clear what the implications of that are for the use of pretrained weights from that dataset. Any models I have trained with ImageNet are done for research purposes and one should assume that the original dataset license applies to the weights. It's best to seek legal advice if you intend to use the pretrained weights in a commercial product.
Several weights included or referenced here were pretrained with proprietary datasets that I do not have access to. These include the Facebook WSL, SSL, SWSL ResNe(Xt) and the Google Noisy Student EfficientNet models. The Facebook models have an explicit non-commercial license (CC-BY-NC 4.0, https://github.com/facebookresearch/semi-supervised-ImageNet1K-models, https://github.com/facebookresearch/WSL-Images). The Google models do not appear to have any restriction beyond the Apache 2.0 license (and ImageNet concerns). In either case, you should contact Facebook or Google with any questions.
@misc{rw2019timm,
author = {Ross Wightman},
title = {PyTorch Image Models},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
doi = {10.5281/zenodo.4414861},
howpublished = {\url{https://github.com/rwightman/pytorch-image-models}}
}