Description
# PyTorch Image Models

- [What's New](#whats-new)
- [Introduction](#introduction)
- [Models](#models)
- [Features](#features)
- [Results](#results)
- [Getting Started (Documentation)](#getting-started-documentation)
- [Train, Validation, Inference Scripts](#train-validation-inference-scripts)
- [Awesome PyTorch Resources](#awesome-pytorch-resources)
- [Licenses](#licenses)
- [Citing](#citing)

## What's New

## March 23, 2026
* Improve pickle checkpoint handling security. Default all loading to `weights_only=True`, add safe_global for ArgParse.
* Improve attention mask handling for core ViT/EVA models & layers. Resolve bool masks, pass `is_causal` through for SSL tasks.
* Fix class & register token use with ViT when no pos embed is enabled.
* Add Patch Representation Refinement (PRR) as a pooling option in ViT. Thanks Sina (https://github.com/sinahmr).
* Improve consistency of output projection / MLP dimensions for attention pooling layers.
* Hiera model F.SDPA optimization to allow Flash Attention kernel use.
* Caution added to SGDP optimizer.
* Release 1.0.26. First maintenance release since my departure from Hugging Face.

## Feb 23, 2026
* Add token distillation training support to distillation task wrappers
* Remove some torch.jit usage in prep for official deprecation
* Caution added to AdamP optimizer
* Call reset_parameters() even if meta-device init so that buffers get init w/ hacks like init_empty_weights
* Tweak Muon optimizer to work with DTensor/FSDP2 (clamp_ instead of clamp_min_, alternate NS branch for DTensor)
* Release 1.0.25

## Jan 21, 2026
* **Compat Break**: Fix oversight w/ QKV vs MLP bias in `ParallelScalingBlock` (& `DiffParallelScalingBlock`)
  * Does not impact any trained `timm` models but could impact downstream use.
## Jan 5 & 6, 2026
* Release 1.0.24
* Add new benchmark result csv files for inference timing on all models w/ RTX Pro 6000, 5090, and 4090 cards w/ PyTorch 2.9.1
* Fix moved module error in deprecated timm.models.layers import path that impacts legacy imports
* Release 1.0.23

## Dec 30, 2025
* Add better NAdaMuon trained `dpwee`, `dwee`, `dlittle` (differential) ViTs with a small boost over previous runs
  * https://huggingface.co/timm/vit_dlittle_patch16_reg1_gap_256.sbb_nadamuon_in1k (83.24% top-1)
  * https://huggingface.co/timm/vit_dwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.80% top-1)
  * https://huggingface.co/timm/vit_dpwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.67% top-1)
* Add a ~21M param `timm` variant of the CSATv2 model at 512x512 & 640x640
  * https://huggingface.co/timm/csatv2_21m.sw_r640_in1k (83.13% top-1)
  * https://huggingface.co/timm/csatv2_21m.sw_r512_in1k (82.58% top-1)
* Factor non-persistent param init out of `__init__` into a common method that can be externally called via `init_non_persistent_buffers()` after meta-device init.

## Dec 12, 2025
* Add CSATv2 model (thanks https://github.com/gusdlf93) -- a lightweight but high res model with DCT stem & spatial attention. https://huggingface.co/Hyunil/CSATv2
* Add AdaMuon and NAdaMuon optimizer support to existing `timm` Muon impl. Appears more competitive vs AdamW with familiar hparams for image tasks.
* End of year PR cleanup, merge aspects of several long open PRs
* Merge differential attention (`DiffAttention`), add corresponding `DiffParallelScalingBlock` (for ViT), train some wee ViTs
  * https://huggingface.co/timm/vit_dwee_patch16_reg1_gap_256.sbb_in1k
  * https://huggingface.co/timm/vit_dpwee_patch16_reg1_gap_256.sbb_in1k
* Add a few pooling modules, `LsePlus` and `SimPool`
* Cleanup, optimize `DropBlock2d` (also add support to ByobNet based models)
* Bump unit tests to PyTorch 2.9.1 + Python 3.13 on the upper end; lower bound remains PyTorch 1.13 + Python 3.10

## Dec 1, 2025
* Add lightweight task abstraction, add logits and feature distillation support to train script via new tasks.
* Remove old APEX AMP support

## Nov 4, 2025
* Fix LayerScale / LayerScale2d init bug (init values ignored), introduced in 1.0.21. Thanks https://github.com/Ilya-Fradlin
* Release 1.0.22

## Oct 31, 2025 🎃
* Update imagenet & OOD variant result csv files to include a few new models and verify correctness over several torch & timm versions
* EfficientNet-X and EfficientNet-H B5 model weights added as part of a hparam search for AdamW vs Muon (still iterating on Muon runs)

## Oct 16-20, 2025
* Add an impl of the Muon optimizer (based on https://github.com/KellerJordan/Muon) with customizations
  * extra flexibility and improved handling for conv weights and fallbacks for weight shapes not suited for orthogonalization
  * small speedup for NS iterations by reducing allocs and using fused (b)add(b)mm ops
  * by default uses AdamW (or NAdamW if `nesterov=True`) updates if Muon is not suitable for a parameter shape (or excluded via param group flag)
  * like the torch impl, select from several LR scale adjustment fns via `adjust_lr_fn`
  * select from several NS coefficient presets or specify your own vi
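The Newton–Schulz (NS) iterations referenced in the Muon notes above can be sketched as follows. This is a simplified stand-alone version using the quintic coefficients from Keller Jordan's reference implementation, not timm's exact (fused-op, DTensor-aware) code:

```python
import torch


def ns_orthogonalize(g: torch.Tensor, steps: int = 5, eps: float = 1e-7) -> torch.Tensor:
    """Approximately orthogonalize a 2D gradient via quintic Newton-Schulz."""
    a, b, c = 3.4445, -4.7750, 2.0315   # reference Muon quintic coefficients
    x = g / (g.norm() + eps)            # scale so spectral norm <= 1
    transposed = x.shape[0] > x.shape[1]
    if transposed:                      # iterate on the wide orientation
        x = x.T
    for _ in range(steps):
        gram = x @ x.T
        # singular values map s -> a*s + b*s^3 + c*s^5, driving them toward 1
        x = a * x + (b * gram + c * gram @ gram) @ x
    return x.T if transposed else x


grad = torch.randn(16, 32)
update = ns_orthogonalize(grad)
```

In the optimizer, an update like this replaces the raw (momentum-buffered) gradient for 2D weight matrices, while parameters whose shapes don't suit orthogonalization fall back to AdamW/NAdamW updates as described in the notes above.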
Release History
| Version | Changes | Urgency | Date |
|---|---|---|---|
| 1.0.26 | Imported from PyPI (1.0.26) | Low | 4/21/2026 |
| v1.0.26 | ## March 23, 2026 * Improve pickle checkpoint handling security. Default all loading to `weights_only=True`, add safe_global for ArgParse. * Improve attention mask handling for core ViT/EVA models & layers. Resolve bool masks, pass `is_causal` through for SSL tasks. * Fix class & register token uses with ViT and no pos embed enabled. * Add Patch Representation Refinement (PRR) as a pooling option in ViT. Thanks Sina (https://github.com/sinahmr). * Improve consistency of output projection / | Medium | 3/23/2026 |
| v1.0.25 | ## Feb 23, 2026 * Add token distillation training support to distillation task wrappers * Remove some torch.jit usage in prep for official deprecation * Caution added to AdamP optimizer * Call reset_parameters() even if meta-device init so that buffers get init w/ hacks like init_empty_weights * Tweak Muon optimizer to work with DTensor/FSDP2 (clamp_ instead of clamp_min_, alternate NS branch for DTensor) * Release 1.0.25 ## Jan 21, 2026 * **Compat Break**: Fix oversight w/ QKV vs MLP | Low | 2/23/2026 |
| v1.0.24 | ## Jan 5 & 6, 2026 * Patch Release 1.0.24 (fix for 1.0.23) * Add new benchmark result csv files for inference timing on all models w/ RTX Pro 6000, 5090, and 4090 cards w/ PyTorch 2.9.1 * Fix moved module error in deprecated timm.models.layers import path that impacts legacy imports * Release 1.0.23 ## Dec 30, 2025 * Add better NAdaMuon trained `dpwee`, `dwee`, `dlittle` (differential) ViTs with a small boost over previous runs * https://huggingface.co/timm/vit_dlittle_patch16_reg1_ga | Low | 1/7/2026 |
| v1.0.23 | ## Dec 30, 2025 * Add better NAdaMuon trained `dpwee`, `dwee`, `dlittle` (differential) ViTs with a small boost over previous runs * https://huggingface.co/timm/vit_dlittle_patch16_reg1_gap_256.sbb_nadamuon_in1k (83.24% top-1) * https://huggingface.co/timm/vit_dwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.80% top-1) * https://huggingface.co/timm/vit_dpwee_patch16_reg1_gap_256.sbb_nadamuon_in1k (81.67% top-1) * Add a ~21M param `timm` variant of the CSATv2 model at 512x512 & 640x640 | Low | 1/5/2026 |
| v1.0.22 | Patch release for priority LayerScale initialization regression in 1.0.21 ## What's Changed * Add some weights for efficientnet_x / efficientnet_h models by @rwightman in https://github.com/huggingface/pytorch-image-models/pull/2602 * Update result csvs by @rwightman in https://github.com/huggingface/pytorch-image-models/pull/2603 * Fix LayerScale ignoring init_values by @Ilya-Fradlin in https://github.com/huggingface/pytorch-image-models/pull/2605 ## New Contributors * @Ilya-Fradlin m | Low | 11/5/2025 |
| v1.0.21 | ## Oct 16-20, 2025 * Add an impl of the Muon optimizer (based on https://github.com/KellerJordan/Muon) with customizations * extra flexibility and improved handling for conv weights and fallbacks for weight shapes not suited for orthogonalization * small speedup for NS iterations by reducing allocs and using fused (b)add(b)mm ops * by default uses AdamW (or NAdamW if `nesterov=True`) updates if muon not suitable for parameter shape (or excluded via param group flag) * like torch imp | Low | 10/24/2025 |
| v1.0.20 | ## Sept 21, 2025 * Remap DINOv3 ViT weight tags from `lvd_1689m` -> `lvd1689m` to match (same for `sat_493m` -> `sat493m`) * Release 1.0.20 ## Sept 17, 2025 * DINOv3 (https://arxiv.org/abs/2508.10104) ConvNeXt and ViT models added. ConvNeXt models were mapped to existing `timm` model. ViT support done via the EVA base model w/ a new `RotaryEmbeddingDinoV3` to match the DINOv3 specific RoPE impl * HuggingFace Hub: https://huggingface.co/collections/timm/timm-dinov3-68cb08bb0bee365973d52a | Low | 9/21/2025 |
| v1.0.19 | Patch release for Python 3.9 compat break in 1.0.18 ## July 23, 2025 * Add `set_input_size()` method to EVA models, used by OpenCLIP 3.0.0 to allow resizing for timm based encoder models. * Release 1.0.18, needed for PE-Core S & T models in OpenCLIP 3.0.0 * Fix small typing issue that broke Python 3.9 compat. 1.0.19 patch release. ## July 21, 2025 * ROPE support added to NaFlexViT. All models covered by the EVA base (`eva.py`) including EVA, EVA02, Meta PE ViT, `timm` SBB ViT w/ ROPE, | Low | 7/24/2025 |
| v1.0.18 | ## July 23, 2025 * Add `set_input_size()` method to EVA models, used by OpenCLIP 3.0.0 to allow resizing for timm based encoder models. * Release 1.0.18, needed for PE-Core S & T models in OpenCLIP 3.0.0 ## July 21, 2025 * ROPE support added to NaFlexViT. All models covered by the EVA base (`eva.py`) including EVA, EVA02, Meta PE ViT, `timm` SBB ViT w/ ROPE, and Naver ROPE-ViT can be now loaded in NaFlexViT when `use_naflex=True` passed at model creation time * More Meta PE ViT encoders a | Low | 7/23/2025 |
| v1.0.17 | ## July 7, 2025 * MobileNet-v5 backbone tweaks for improved Google Gemma 3n behaviour (to pair with updated official weights) * Add stem bias (zero'd in updated weights, compat break with old weights) * GELU -> GELU (tanh approx). A minor change to be closer to JAX * Add two arguments to layer-decay support, a min scale clamp and 'no optimization' scale threshold * Add 'Fp32' LayerNorm, RMSNorm, SimpleNorm variants that can be enabled to force computation of norm in float32 * Some typi | Low | 7/10/2025 |
| v1.0.16 | ## June 26, 2025 * MobileNetV5 backbone (w/ encoder only variant) for [Gemma 3n](https://ai.google.dev/gemma/docs/gemma-3n#parameters) image encoder * Version 1.0.16 released ## June 23, 2025 * Add F.grid_sample based 2D and factorized pos embed resize to NaFlexViT. Faster when lots of different sizes (based on example by https://github.com/stas-sl). * Further speed up patch embed resample by replacing vmap with matmul (based on snippet by https://github.com/stas-sl). * Add 3 initial nat | Low | 6/26/2025 |
| v1.0.15 | ## Feb 21, 2025 * SigLIP 2 ViT image encoders added (https://huggingface.co/collections/timm/siglip-2-67b8e72ba08b09dd97aecaf9) * Variable resolution / aspect NaFlex versions are a WIP * Add 'SO150M2' ViT weights trained with SBB recipes, great results, better for ImageNet than previous attempt w/ less training. * `vit_so150m2_patch16_reg1_gap_448.sbb_e200_in12k_ft_in1k` - 88.1% top-1 * `vit_so150m2_patch16_reg1_gap_384.sbb_e200_in12k_ft_in1k` - 87.9% top-1 * `vit_so150m2_patch16_r | Low | 2/23/2025 |
| v1.0.14 | ## Jan 19, 2025 * Fix loading of LeViT safetensor weights, remove conversion code which should have been deactivated * Add 'SO150M' ViT weights trained with SBB recipes, decent results, but not optimal shape for ImageNet-12k/1k pretrain/ft * `vit_so150m_patch16_reg4_gap_256.sbb_e250_in12k_ft_in1k` - 86.7% top-1 * `vit_so150m_patch16_reg4_gap_384.sbb_e250_in12k_ft_in1k` - 87.4% top-1 * `vit_so150m_patch16_reg4_gap_256.sbb_e250_in12k` * Misc typing, typo, etc. cleanup * 1.0.14 release | Low | 1/19/2025 |
| v1.0.13 | ## Jan 9, 2025 * Add support to train and validate in pure `bfloat16` or `float16` * `wandb` project name arg added by https://github.com/caojiaolong, use arg.experiment for name * Fix old issue w/ checkpoint saving not working on filesystem w/o hard-link support (e.g. FUSE fs mounts) * 1.0.13 release ## Jan 6, 2025 * Add `torch.utils.checkpoint.checkpoint()` wrapper in `timm.models` that defaults `use_reentrant=False`, unless `TIMM_REENTRANT_CKPT=1` is set in env. ## Dec 31, 2024 * | Low | 1/9/2025 |
| v1.0.12 | ## Nov 28, 2024 * More optimizers * Add MARS optimizer (https://arxiv.org/abs/2411.10438, https://github.com/AGI-Arena/MARS) * Add LaProp optimizer (https://arxiv.org/abs/2002.04839, https://github.com/Z-T-WANG/LaProp-Optimizer) * Add masking from 'Cautious Optimizers' (https://arxiv.org/abs/2411.16085, https://github.com/kyleliang919/C-Optim) to Adafactor, Adafactor Big Vision, AdamW (legacy), Adopt, Lamb, LaProp, Lion, NadamW, RMSPropTF, SGDW * Cleanup some docstrings and type ann | Low | 12/3/2024 |
| v1.0.11 | Quick turnaround from 1.0.10 to fix an error impacting 3rd party packages that still import through a deprecated path that isn't tested. ## Oct 16, 2024 * Fix error on importing from deprecated path `timm.models.registry`, increased priority of existing deprecation warnings to be visible * Port weights of InternViT-300M (https://huggingface.co/OpenGVLab/InternViT-300M-448px) to `timm` as `vit_intern300m_patch14_448` ### Oct 14, 2024 * Pre-activation (ResNetV2) version of 18/18d/34/34d R | Low | 10/16/2024 |
| v1.0.10 | ### Oct 14, 2024 * Pre-activation (ResNetV2) version of 18/18d/34/34d ResNet model defs added by request (weights pending) * Release 1.0.10 ### Oct 11, 2024 * MambaOut (https://github.com/yuweihao/MambaOut) model & weights added. A cheeky take on SSM vision models w/o the SSM (essentially ConvNeXt w/ gating). A mix of original weights + custom variations & weights. |model |im | Low | 10/15/2024 |
| v1.0.9 | ### Aug 21, 2024 * Updated SBB ViT models trained on ImageNet-12k and fine-tuned on ImageNet-1k, challenging quite a number of much larger, slower models | model | top1 | top5 | param_count | img_size | | -------------------------------------------------- | ------ | ------ | ----------- | -------- | | [vit_mediumd_patch16_reg4_gap_384.sbb2_e200_in12k_ft_in1k](https://huggingface.co/timm/vit_mediumd_patch16_reg4_gap_384.sbb2_e200_in12k_ft_in1k) | 87.438 | 98.256 | 64.11 | 384 | | [vit_medi | Low | 8/23/2024 |
| v1.0.8 | ### July 28, 2024 * Add `mobilenet_edgetpu_v2_m` weights w/ `ra4` mnv4-small based recipe. 80.1% top-1 @ 224 and 80.7 @ 256. * Release 1.0.8 ### July 26, 2024 * More MobileNet-v4 weights, ImageNet-12k pretrain w/ fine-tunes, and anti-aliased ConvLarge models | model |top1 |top1_err|top5 |top5_err|param_count|img_size| |---------------------------------------------------------------------------- | Low | 7/29/2024 |
| v1.0.7 | ### June 12, 2024 * MobileNetV4 models and initial set of `timm` trained weights added: | model |top1 |top1_err|top5 |top5_err|param_count|img_size| |--------------------------------------------------------------------------------------------------|------|--------|------|--------|-----------|--------| | [mobilenetv4_hybrid_large.e600_r384_in1k](http://hf.co/timm/mobilenetv4_hybrid_large.e600_r384_i | Low | 6/19/2024 |
| v1.0.3 | ### May 14, 2024 * Support loading PaliGemma jax weights into SigLIP ViT models with average pooling. * Add Hiera models from Meta (https://github.com/facebookresearch/hiera). * Add `normalize=` flag for transforms, return non-normalized torch.Tensor with original dtype (for `chug`) * Version 1.0.3 release ### May 11, 2024 * `Searching for Better ViT Baselines (For the GPU Poor)` weights and vit variants released. Exploring model shapes between Tiny and Base. | model | top1 | top5 | pa | Low | 5/15/2024 |
| v0.9.16 | ### Feb 19, 2024 * Next-ViT models added. Adapted from https://github.com/bytedance/Next-ViT * HGNet and PP-HGNetV2 models added. Adapted from https://github.com/PaddlePaddle/PaddleClas by [SeeFun](https://github.com/seefun) * Removed setup.py, moved to pyproject.toml based build supported by PDM * Add updated model EMA impl using _for_each for less overhead * Support device args in train script for non GPU devices * Other misc fixes and small additions * Min supported Python version incr | Low | 2/19/2024 |
| v0.9.12 | ### Nov 23, 2023 * Added EfficientViT-Large models, thanks [SeeFun](https://github.com/seefun) * Fix Python 3.7 compat, will be dropping support for it soon * Other misc fixes * Release 0.9.12 | Low | 11/24/2023 |
| v0.9.11 | ### Nov 20, 2023 * Added significant flexibility for Hugging Face Hub based timm models via `model_args` config entry. `model_args` will be passed as kwargs through to models on creation. * See example at https://huggingface.co/gaunernst/vit_base_patch16_1024_128.audiomae_as2m_ft_as20k/blob/main/config.json * Usage: https://github.com/huggingface/pytorch-image-models/discussions/2035 * Updated imagenet eval and test set csv files with latest models * `vision_transformer.py` typing and | Low | 11/20/2023 |
| v0.9.10 | ### Nov 4, 2023 * Patch fix for 0.9.9 to fix FrozenBatchnorm2d import path for old torchvision (~2 years old) ### Nov 3, 2023 * [DFN (Data Filtering Networks)](https://huggingface.co/papers/2309.17425) and [MetaCLIP](https://huggingface.co/papers/2309.16671) ViT weights added * DINOv2 'register' ViT model weights added * Add `quickgelu` ViT variants for OpenAI, DFN, MetaCLIP weights that use it (less efficient) * Improved typing added to ResNet, MobileNet-v3 thanks to [Aryan](https://github.com/a | Low | 11/4/2023 |
| v0.9.9 | ### Nov 3, 2023 * [DFN (Data Filtering Networks)](https://huggingface.co/papers/2309.17425) and [MetaCLIP](https://huggingface.co/papers/2309.16671) ViT weights added * DINOv2 'register' ViT model weights added * Add `quickgelu` ViT variants for OpenAI, DFN, MetaCLIP weights that use it (less efficient) * Improved typing added to ResNet, MobileNet-v3 thanks to [Aryan](https://github.com/a-r-r-o-w) * ImageNet-12k fine-tuned (from LAION-2B CLIP) `convnext_xxlarge` * 0.9.9 release | Low | 11/3/2023 |
| v0.9.8 | ### Oct 20, 2023 * [SigLIP](https://huggingface.co/papers/2303.15343) image tower weights supported in `vision_transformer.py`. * Great potential for fine-tune and downstream feature use. * Experimental 'register' support in vit models as per [Vision Transformers Need Registers](https://huggingface.co/papers/2309.16588) * Updated RepViT with new weight release. Thanks [wangao](https://github.com/jameslahm) * Add patch resizing support (on pretrained weight load) to Swin models * 0.9.8 re | Low | 10/21/2023 |
| v0.9.7 | Small bug fix & extra model from [v0.9.6](https://github.com/huggingface/pytorch-image-models/releases/tag/v0.9.6) ### Sep 1, 2023 * TinyViT added by [SeeFun](https://github.com/seefun) * Fix EfficientViT (MIT) to use torch.autocast so it works back to PT 1.10 * 0.9.7 release | Low | 9/2/2023 |
| v0.9.6 | ### Aug 28, 2023 * Add dynamic img size support to models in `vision_transformer.py`, `vision_transformer_hybrid.py`, `deit.py`, and `eva.py` w/o breaking backward compat. * Add `dynamic_img_size=True` to args at model creation time to allow changing the grid size (interpolate abs and/or ROPE pos embed each forward pass). * Add `dynamic_img_pad=True` to allow image sizes that aren't divisible by patch size (pad bottom right to patch size each forward pass). * Enabling either dynamic mo | Low | 8/29/2023 |
| v0.9.5 | Minor updates and bug fixes. New ResNeXT w/ highest ImageNet eval I'm aware of in the ResNe(X)t family (`seresnextaa201d_32x8d.sw_in12k_ft_in1k_384`) ### Aug 3, 2023 * Add GluonCV weights for HRNet w18_small and w18_small_v2. Converted by [SeeFun](https://github.com/seefun) * Fix `selecsls*` model naming regression * Patch and position embedding for ViT/EVA works for bfloat16/float16 weights on load (or activations for on-the-fly resize) * v0.9.5 release prep ### July 27, 2023 * Added | Low | 8/3/2023 |
| v0.9.2 | * Fix _hub deprecation pass through import | Low | 5/14/2023 |
| v0.9.1 | The first non pre-release since Oct 2022 with a long list of changes from 0.6.x releases... ### May 12, 2023 * Fix Python 3.7 import error re Final[] typing annotation ### May 11, 2023 * `timm` 0.9 released, transition from 0.8.xdev releases ### May 10, 2023 * Hugging Face Hub downloading is now default, 1132 models on https://huggingface.co/timm, 1163 weights in `timm` * DINOv2 vit feature backbone weights added thanks to [Leng Yue](https://github.com/leng-yue) * FB MAE vit featur | Low | 5/12/2023 |
| v0.9.0 | First non pre-release in a loooong while, changelog from 0.6.x below... ### May 11, 2023 * `timm` 0.9 released, transition from 0.8.xdev releases ### May 10, 2023 * Hugging Face Hub downloading is now default, 1132 models on https://huggingface.co/timm, 1163 weights in `timm` * DINOv2 vit feature backbone weights added thanks to [Leng Yue](https://github.com/leng-yue) * FB MAE vit feature backbone weights added * OpenCLIP DataComp-XL L/14 feat backbone weights added * MetaFormer (p | Low | 5/12/2023 |
| v0.6.13 | Release from 0.6.x stable branch with fix for Python 3.11. NOTE original 0.6.13 release tag was against wrong branch. | Low | 4/16/2023 |
| v0.8.17dev0 | ### March 22, 2023 * More weights pushed to HF hub along with multi-weight support, including: `regnet.py`, `rexnet.py`, `byobnet.py`, `resnetv2.py`, `swin_transformer.py`, `swin_transformer_v2.py`, `swin_transformer_v2_cr.py` * Swin Transformer models support feature extraction (NCHW feat maps for `swinv2_cr_*`, and NHWC for all others) and spatial embedding outputs. * FocalNet (from https://github.com/microsoft/FocalNet) models and weights added with significant refactoring, feature extra | Low | 3/24/2023 |
| v0.8.13dev0 | ### Feb 20, 2023 * Add 320x320 `convnext_large_mlp.clip_laion2b_ft_320` and `convnext_large_mlp.clip_laion2b_ft_soup_320` CLIP image tower weights for features & fine-tune * 0.8.13dev0 pypi release for latest changes w/ move to huggingface org ### Feb 16, 2023 * `safetensor` checkpoint support added * Add ideas from 'Scaling Vision Transformers to 22 B. Params' (https://arxiv.org/abs/2302.05442) -- qk norm, RmsNorm, parallel block * Add F.scaled_dot_product_attention support (PyTorch 2. | Low | 2/20/2023 |
| v0.8.10dev0 | ### Feb 7, 2023 * New inference benchmark numbers added in [results](results/) folder. * Add convnext LAION CLIP trained weights and initial set of in1k fine-tunes * `convnext_base.clip_laion2b_augreg_ft_in1k` - 86.2% @ 256x256 * `convnext_base.clip_laiona_augreg_ft_in1k_384` - 86.5% @ 384x384 * `convnext_large_mlp.clip_laion2b_augreg_ft_in1k` - 87.3% @ 256x256 * `convnext_large_mlp.clip_laion2b_augreg_ft_in1k_384` - 87.9% @ 384x384 * Add DaViT models. Supports `features_only=Tr | Low | 2/7/2023 |
| v0.8.6dev0 | ### Jan 11, 2023 * Update ConvNeXt ImageNet-12k pretrain series w/ two new fine-tuned weights (and pre FT `.in12k` tags) * `convnext_nano.in12k_ft_in1k` - 82.3 @ 224, 82.9 @ 288 (previously released) * `convnext_tiny.in12k_ft_in1k` - 84.2 @ 224, 84.5 @ 288 * `convnext_small.in12k_ft_in1k` - 85.2 @ 224, 85.3 @ 288 ### Jan 6, 2023 * Finally got around to adding `--model-kwargs` and `--opt-kwargs` to scripts to pass through rare args directly to model classes from cmd line * `trai | Low | 1/12/2023 |
| v0.8.2dev0 | Part way through the conversion of models to multi-weight support (`model_arch.pretrain_tag`), module reorg for future building, and lots of new weights and model additions as we go... This is considered a development release. Please stick to 0.6.x if you need stability. Some of the model names and tags will shift a bit, and some old names have already been deprecated without remapping support added yet. For code, the 0.6.x branch is considered 'stable' https://github.com/rwightman/pytorch-image-models/t | Low | 12/24/2022 |
| v0.6.12 | Minor bug fixes to HF push_to_hub, plus some more MaxVit weights ### Oct 10, 2022 * More weights in `maxxvit` series, incl first ConvNeXt block based `coatnext` and `maxxvit` experiments: * `coatnext_nano_rw_224` - 82.0 @ 224 (G) -- (uses ConvNeXt conv block, no BatchNorm) * `maxxvit_rmlp_nano_rw_256` - 83.0 @ 256, 83.7 @ 320 (G) (uses ConvNeXt conv block, no BN) * `maxvit_rmlp_small_rw_224` - 84.5 @ 224, 85.1 @ 320 (G) * `maxxvit_rmlp_small_rw_256` - 84.6 @ 256, 84.9 @ 288 (G) | Low | 11/23/2022 |
| v0.6.11 | ## Changes Since 0.6.7 ### Sept 23, 2022 * CLIP LAION-2B pretrained B/32, L/14, H/14, and g/14 image tower weights as vit models (for fine-tune) ### Sept 7, 2022 * Hugging Face [`timm` docs](https://huggingface.co/docs/hub/timm) home now exists, look for more here in the future * Add BEiT-v2 weights for base and large 224x224 models from https://github.com/microsoft/unilm/tree/master/beit2 * Add more weights in `maxxvit` series incl a `pico` (7.5M params, 1.9 GMACs), two `tiny` vari | Low | 10/3/2022 |
| v0.1-weights-maxx | # CoAtNet (https://arxiv.org/abs/2106.04803) and MaxVit (https://arxiv.org/abs/2204.01697) `timm` trained weights Weights were created reproducing the paper architectures and exploring timm specific additions such as ConvNeXt blocks, parallel partitioning, and other experiments. Weights were trained on a mix of TPU and GPU systems. Bulk of weights were trained on TPU via the TRC program (https://sites.research.google/trc/about/). CoAtNet variants run particularly well on TPU, it's a gr | Low | 8/24/2022 |
| v0.1-weights-morevit | # More weights for 3rd party ViT / ViT-CNN hybrids that needed remapping / re-hosting ## EfficientFormer Rehosted and remapped checkpoints from https://github.com/snap-research/EfficientFormer (originals in Google Drive) ## GCViT Heavily remapped from originals at https://github.com/NVlabs/GCVit due to from-scratch re-write of model code NOTE: these checkpoints have a non-commercial [CC-BY-NC-SA-4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license. | Low | 8/17/2022 |
| v0.6.7 | Minor bug fixes and a few more weights since 0.6.5 * A few more weights & model defs added: * `darknetaa53` - 79.8 @ 256, 80.5 @ 288 * `convnext_nano` - 80.8 @ 224, 81.5 @ 288 * `cs3sedarknet_l` - 81.2 @ 256, 81.8 @ 288 * `cs3darknet_x` - 81.8 @ 256, 82.2 @ 288 * `cs3sedarknet_x` - 82.2 @ 256, 82.7 @ 288 * `cs3edgenet_x` - 82.2 @ 256, 82.7 @ 288 * `cs3se_edgenet_x` - 82.8 @ 256, 83.5 @ 320 * `cs3*` weights above all trained on TPU w/ `bits_and_tpu` branch. Thanks to TRC | Low | 7/27/2022 |
| v0.6.5 | First official release in a long while (since 0.5.4). All changes since 0.5.4 are logged below: ### July 8, 2022 More models, more fixes * Official research models (w/ weights) added: * EdgeNeXt from (https://github.com/mmaaz60/EdgeNeXt) * MobileViT-V2 from (https://github.com/apple/ml-cvnets) * DeiT III (Revenge of the ViT) from (https://github.com/facebookresearch/deit) * My own models: * Small `ResNet` defs added by request with 1 block repeats for both basic and bottleneck (resnet1 | Low | 7/10/2022 |
| v0.1-weights-swinv2 | This release holds weights for timm's variant of Swin V2 (from @ChristophReich1996 impl, https://github.com/ChristophReich1996/Swin-Transformer-V2) NOTE: `ns` variants of the models have extra norms on the main branch at the end of each stage, this seems to help training. The current `small` model is not using this, but one is currently training. Will have a non-ns tiny soon as well as a comparison. in21k and 1k base models are also in the works... `small` checkpoints trained on TPU-VM instan | Low | 4/3/2022 |
| v0.1-tpu-weights | A wide range of mid-large sized models trained in PyTorch XLA on TPU VM instances. Demonstrating viability of the TPU + PyTorch combo for excellent image model results. All models trained w/ the `bits_and_tpu` branch of this codebase. A big thanks to the TPU Research Cloud (https://sites.research.google/trc/about/) for the compute used in these experiments. This set includes several novel weights, including EvoNorm-S RegNetZ (C/D timm variants) and ResNet-V2 model experiments, as well as | Low | 3/18/2022 |
| v0.1-mvit-weights | Pretrained weights for MobileViT and MobileViT-V2 adapted from Apple impl at https://github.com/apple/ml-cvnets Checkpoints remapped to `timm` impl of the model with BGR corrected to RGB (for V1). | Low | 1/31/2022 |
| v0.5.4 | Release v0.5.4 | Low | 1/17/2022 |
| v0.1-rsb-weights | # Weights for ResNet Strikes Back Paper: https://arxiv.org/abs/2110.00476 More details on weights and hparams to come... | Low | 10/4/2021 |
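The `torch.utils.checkpoint.checkpoint()` wrapper mentioned in the v1.0.13 notes above defaults `use_reentrant=False` (unless `TIMM_REENTRANT_CKPT=1` is set). A sketch of the underlying PyTorch call it wraps, not timm's wrapper itself:

```python
import torch
from torch.utils.checkpoint import checkpoint


def block(x):
    # Activations inside this function are recomputed during backward
    # instead of being kept alive, trading compute for memory.
    return torch.relu(x * 2.0)


x = torch.randn(8, 16, requires_grad=True)
# use_reentrant=False selects the non-reentrant implementation, which
# composes better with autograd features (keyword args, nested grad, hooks).
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```

Gradients flow through the checkpointed region exactly as they would without it; only the memory/compute trade-off changes.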
