Models
Family-agnostic interface
create_pretrained is the symbol-dispatched entry point for loading released weights. It returns the model and a closure that loads the HuggingFace checkpoint into a (ps, st) pair you produce with Lux.setup:
model, load = create_pretrained(variant)
ps, st = Lux.setup(rng, model)
ps, st = load(ps, st)

The closure captures variant, in_chans, num_classes, and the HF / prefix kwargs at construction time, so the loader body does not need to introspect ps to recover what you already told it. create_model is the random-init counterpart: it returns the bare @compact model with no weights loaded. See Getting Started for the nested pattern with prefix.
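The capture-at-construction idea can be illustrated without the package. The following is a minimal sketch on plain NamedTuples; set_at and make_loader are hypothetical names, not Luximm API, and the toy loader threads only ps, whereas the real closure takes and returns (ps, st).

```julia
# Toy stand-in for the (model, load) pattern. None of these names are Luximm API.

# Base case: an empty path means "apply f to this subtree".
set_at(nt, path::Tuple{}, f) = f(nt)

# Recursive case: descend one key, rebuild the NamedTuple with the updated child.
function set_at(nt::NamedTuple, path::Tuple{Symbol,Vararg{Symbol}}, f)
    key = first(path)
    updated = set_at(getfield(nt, key), Base.tail(path), f)
    return merge(nt, NamedTuple{(key,)}((updated,)))
end

# Returns a closure that captures `weights` and `prefix` once, so the caller
# only has to pass `ps` through it later.
function make_loader(weights::NamedTuple; prefix::Tuple = ())
    return ps -> set_at(ps, prefix, sub -> merge(sub, weights))
end

# Parameters of a hypothetical outer model with a backbone and a head.
ps = (backbone = (conv = zeros(2),), head = (weight = zeros(3),))

load_backbone = make_loader((conv = [1.0, 2.0],); prefix = (:backbone,))
ps = load_backbone(ps)

ps.backbone.conv  # now holds the "pretrained" values
ps.head.weight    # untouched, still the initial values
```

Only the subtree named by prefix is rewritten; everything else in ps is passed through unchanged, which is what lets the loader be reused inside a larger model.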
Luximm.Models.create_pretrained — Function
create_pretrained(variant; in_chans=3, num_classes=nothing,
                  revision="main", cache_dir=hf_hub_cache_dir(),
                  prefix=()) -> (model, load)

Family-agnostic pretrained-weight entry point, mirroring timm.create_model(..., pretrained=True). Returns the model and a closure that loads the released model.safetensors into a (ps, st) pair the caller produced with Lux.setup. The closure captures variant, in_chans, num_classes, and the HF / prefix kwargs at construction time, so the call to the closure is the only place where (ps, st) needs to be threaded.
model, load = create_pretrained(:resnet50_a1_in1k)
ps, st = Lux.setup(Xoshiro(0), model)
ps, st = load(ps, st)

num_classes = nothing (the default) builds the head the released checkpoint ships with, i.e. a head of width default_num_classes(variant). Pass an explicit 0 for a features-only model, or any other Int to swap in a custom-width head; in that case the released classifier weights are skipped and a warning is emitted.
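The three num_classes cases can be summarised as a small decision function. This is a sketch of the documented rule only: head_mode is a hypothetical helper, not part of Luximm, and how the package treats an explicit value equal to the released width is not stated here, so the sketch simply classifies every explicit non-zero Int as a custom head.

```julia
# Hypothetical helper mirroring the documented num_classes rule.
# `released` plays the role of default_num_classes(variant).
function head_mode(requested::Union{Nothing,Int}, released::Int)
    # Default: keep whatever head the released checkpoint ships with.
    requested === nothing && return (released, :released_head)
    # Explicit 0: features-only model, no classifier.
    requested == 0 && return (0, :features_only)
    # Any other Int: custom-width head (assumption: also when it equals `released`).
    return (requested, :custom_head)
end

head_mode(nothing, 1000)
head_mode(0, 1000)
head_mode(10, 1000)
```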
For composition, build model separately and pass it into an outer @compact, capturing prefix = (:backbone,) so the closure writes into the right subtree:
backbone, load_backbone = create_pretrained(:resnet50_a1_in1k;
                                            num_classes = 0, prefix = (:backbone,))
outer = @compact(backbone = backbone,
                 head = Dense(2048 => num_outputs)) do x
    head(backbone(x))
end
ps, st = Lux.setup(rng, outer)
ps, st = load_backbone(ps, st)

Luximm.Models.create_model — Function
create_model(variant; kwargs...) -> model

Family-agnostic random-init model constructor, mirroring timm.create_model(..., pretrained=False). Dispatches on variant to the matching family constructor and returns the bare @compact model: no parameters, no state, no pretrained weights.
Use this when you want to train from scratch, or as a building block inside an outer @compact when composing a larger model. To load the released weights for a variant, use create_pretrained instead.
model = create_model(:resnet50_a1_in1k; num_classes = 1000)
ps, st = Lux.setup(rng, model) # random init, ready for training

kwargs are forwarded to the family constructor (in_chans, num_classes).
Luximm.Models.default_num_classes — Function
default_num_classes(variant) -> Int

Head dimension the released checkpoint for variant was trained at. Returns 0 for encoder-only variants (DINOv3 ConvNeXt, ConvNeXtV2 fcmae pretrains).
Per-family namespaces
Each family exports its variant config struct and the <FAMILY>_VARIANTS registry dict. The remaining family internals (per-family constructors, weight mappings, state mappings) live in Luximm.Models.* for callers who need to escape the create_pretrained / create_model front door.
ResNet
Luximm.Models.ResNetVariant — Type
ResNetVariant

Architectural config for a classic timm ResNet variant.

Fields:
- name: lookup key (e.g. :resnet50_a1_in1k).
- block: residual block type, either :basic (used by r18/r34) or :bottleneck (used by r50/r101/r152).
- layers: per-stage block count (d1, d2, d3, d4).
- planes: base channel widths per stage (64, 128, 256, 512). Multiplied by 4 inside :bottleneck stages to give the actual output channel count.
- num_features: backbone output channels (planes[end] for :basic, planes[end] * 4 for :bottleneck).
- hf_repo: HuggingFace repo containing model.safetensors.
- default_num_classes: head dimension the released weights ship with.
- default_input_size: native training resolution (224 for every registered variant). Informational only: the model is fully convolutional and accepts any size.
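The relationship between block, planes, and num_features can be checked with a few lines of arithmetic. This is a sketch under the rule stated in the field list; expansion and num_features are hypothetical helpers written for illustration, not functions exported by Luximm.

```julia
# Hypothetical helpers reproducing the num_features rule from the field list.
function expansion(block::Symbol)
    if block === :basic
        return 1          # basic blocks keep the base width
    elseif block === :bottleneck
        return 4          # bottleneck blocks widen the output by 4
    else
        throw(ArgumentError("unknown block type: $block"))
    end
end

num_features(block::Symbol, planes::NTuple{4,Int}) = planes[end] * expansion(block)

planes = (64, 128, 256, 512)
num_features(:basic, planes)       # 512, matching the r18/r34 rows
num_features(:bottleneck, planes)  # 2048, matching the r50/r101/r152 rows
```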
Luximm.Models.RESNET_VARIANTS — Constant
RESNET_VARIANTS :: Dict{Symbol, ResNetVariant}

Lookup table for classic ResNet variants currently ported from timm. Keys are the timm model names with dots rewritten as underscores.
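The key convention is easy to reproduce when translating a timm name by hand. A minimal sketch, assuming the timm name for the ResNet-50 entry is resnet50.a1_in1k; variant_key is a hypothetical helper, not part of Luximm.

```julia
# Hypothetical helper: derive the registry key from a timm model name
# by rewriting every dot as an underscore.
variant_key(timm_name::AbstractString) = Symbol(replace(timm_name, "." => "_"))

variant_key("resnet50.a1_in1k")  # :resnet50_a1_in1k
```

The rewrite matters because a literal such as :resnet50.a1_in1k would be parsed as a field access on the symbol :resnet50 rather than as a single symbol.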
Registered variants
| Variant | num_classes | num_features | input size |
|---|---|---|---|
| :resnet101_a1_in1k | 1000 | 2048 | 224 |
| :resnet152_a1_in1k | 1000 | 2048 | 224 |
| :resnet18_a1_in1k | 1000 | 512 | 224 |
| :resnet34_a1_in1k | 1000 | 512 | 224 |
| :resnet50_a1_in1k | 1000 | 2048 | 224 |
BiT ResNetV2
Luximm.Models.BiTVariant — Type
BiTVariant

Architectural config for a single BiT ResNetV2 variant.

Fields:
- name: lookup key (e.g. :resnetv2_50x1_bit_goog_in21k).
- layers: per-stage depth tuple: (3,4,6,3) for r50, (3,4,23,3) for r101, (3,8,36,3) for r152.
- width_factor: integer width multiplier from the timm name suffix (x1, x2, x3, x4).
- stem_chs: stem output channels (64 * width_factor).
- stage_chs: per-stage output channel tuple (base widths (256,512,1024,2048) scaled by width_factor).
- num_features: backbone output channels (stage_chs[end]).
- hf_repo: HuggingFace repo containing model.safetensors.
- default_num_classes: head dimension the released weights were trained with (21843 for goog_in21k, 1000 for the in1k tags).
- default_input_size: native training resolution (224 for most tags, 384 for the _384 teacher variant). The model itself is fully convolutional and accepts any input size; this is just what the released weights were tuned at.
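The width-related fields all follow from width_factor. The sketch below reproduces that arithmetic; bit_widths is a hypothetical helper written for illustration and is not a Luximm function.

```julia
# Hypothetical helper reproducing the width arithmetic from the field list.
function bit_widths(width_factor::Int)
    base = (256, 512, 1024, 2048)        # base per-stage widths
    stage_chs = base .* width_factor     # broadcasting over a tuple yields a tuple
    return (stem_chs = 64 * width_factor,
            stage_chs = stage_chs,
            num_features = stage_chs[end])
end

bit_widths(1).num_features  # 2048, as in the x1 rows
bit_widths(3).num_features  # 6144, as in the x3 rows
```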
Luximm.Models.BIT_VARIANTS — Constant
BIT_VARIANTS :: Dict{Symbol, BiTVariant}

Lookup table for the BiT variants this package currently ports. Keys mirror the timm model name with the dot rewritten as an underscore (a dot is not valid inside a Julia identifier, so it cannot appear in a plain symbol literal); the full timm name with the dot lives at BIT_VARIANTS[key].hf_repo.
Registered variants
| Variant | num_classes | num_features | input size |
|---|---|---|---|
| :resnetv2_101x1_bit_goog_in21k | 21843 | 2048 | 224 |
| :resnetv2_101x1_bit_goog_in21k_ft_in1k | 1000 | 2048 | 224 |
| :resnetv2_101x3_bit_goog_in21k | 21843 | 6144 | 224 |
| :resnetv2_101x3_bit_goog_in21k_ft_in1k | 1000 | 6144 | 224 |
| :resnetv2_152x2_bit_goog_in21k | 21843 | 4096 | 224 |
| :resnetv2_152x2_bit_goog_in21k_ft_in1k | 1000 | 4096 | 224 |
| :resnetv2_152x2_bit_goog_teacher_in21k_ft_in1k | 1000 | 4096 | 224 |
| :resnetv2_152x2_bit_goog_teacher_in21k_ft_in1k_384 | 1000 | 4096 | 384 |
| :resnetv2_152x4_bit_goog_in21k | 21843 | 8192 | 224 |
| :resnetv2_152x4_bit_goog_in21k_ft_in1k | 1000 | 8192 | 224 |
| :resnetv2_50x1_bit_goog_distilled_in1k | 1000 | 2048 | 224 |
| :resnetv2_50x1_bit_goog_in21k | 21843 | 2048 | 224 |
| :resnetv2_50x1_bit_goog_in21k_ft_in1k | 1000 | 2048 | 224 |
| :resnetv2_50x3_bit_goog_in21k | 21843 | 6144 | 224 |
| :resnetv2_50x3_bit_goog_in21k_ft_in1k | 1000 | 6144 | 224 |
ConvNeXt
Luximm.Models.ConvNeXtVariant — Type
ConvNeXtVariant

Architectural config for a single ConvNeXt v1 variant.

Fields:
- name: lookup key (e.g. :convnext_tiny_dinov3_lvd1689m).
- depths: per-stage block count, (d1, d2, d3, d4).
- dims: per-stage channel widths, (c1, c2, c3, c4). c1 is also the stem output channels. c4 is num_features.
- hf_repo: HuggingFace repo containing model.safetensors.
- default_num_classes: head dimension the released weights ship with. 0 for the DINO encoders (no usable head).
- default_input_size: native training resolution (224, 384, …) for the released checkpoint. Informational only: the model is fully convolutional and accepts any size, so this is not enforced.
- ls_init: LayerScale init value (gamma parameter in timm). All v1 variants released so far use 1e-6; kept as a field in case future ports need a different value.
Luximm.Models.CONVNEXT_VARIANTS — Constant
CONVNEXT_VARIANTS :: Dict{Symbol, ConvNeXtVariant}

Lookup table for the ConvNeXt v1 variants this package ports: the DINOv3 encoders and the Facebook AI checkpoints from the original ConvNeXt paper. Additional convnext_* lineages (.in12k_*, .clip_*) can be registered without touching the constructor or mapping code.
The four :convnext_*_dinov3_lvd1689m encoders are released by Meta under the DINOv3 License, which imposes obligations on outputs derived from the weights that differ from a standard permissive open-source license. Read the license before using the weights for any downstream task. This applies only to the weights; the Julia code in this package is Apache 2.0. The Facebook AI .fb_* checkpoints carry the upstream Apache 2.0 license and are unaffected.
Registered variants
| Variant | num_classes | num_features | input size |
|---|---|---|---|
| :convnext_base_dinov3_lvd1689m | 0 | 1024 | 224 |
| :convnext_base_fb_in1k | 1000 | 1024 | 224 |
| :convnext_base_fb_in22k | 21841 | 1024 | 224 |
| :convnext_base_fb_in22k_ft_in1k | 1000 | 1024 | 224 |
| :convnext_base_fb_in22k_ft_in1k_384 | 1000 | 1024 | 384 |
| :convnext_large_dinov3_lvd1689m | 0 | 1536 | 224 |
| :convnext_large_fb_in1k | 1000 | 1536 | 224 |
| :convnext_large_fb_in22k | 21841 | 1536 | 224 |
| :convnext_large_fb_in22k_ft_in1k | 1000 | 1536 | 224 |
| :convnext_large_fb_in22k_ft_in1k_384 | 1000 | 1536 | 384 |
| :convnext_small_dinov3_lvd1689m | 0 | 768 | 224 |
| :convnext_small_fb_in1k | 1000 | 768 | 224 |
| :convnext_small_fb_in22k | 21841 | 768 | 224 |
| :convnext_small_fb_in22k_ft_in1k | 1000 | 768 | 224 |
| :convnext_small_fb_in22k_ft_in1k_384 | 1000 | 768 | 384 |
| :convnext_tiny_dinov3_lvd1689m | 0 | 768 | 224 |
| :convnext_tiny_fb_in1k | 1000 | 768 | 224 |
| :convnext_tiny_fb_in22k | 21841 | 768 | 224 |
| :convnext_tiny_fb_in22k_ft_in1k | 1000 | 768 | 224 |
| :convnext_tiny_fb_in22k_ft_in1k_384 | 1000 | 768 | 384 |
| :convnext_xlarge_fb_in22k | 21841 | 2048 | 224 |
| :convnext_xlarge_fb_in22k_ft_in1k | 1000 | 2048 | 224 |
| :convnext_xlarge_fb_in22k_ft_in1k_384 | 1000 | 2048 | 384 |
ConvNeXt V2
Luximm.Models.ConvNeXtV2Variant — Type
ConvNeXtV2Variant

Architectural config for a single ConvNeXtV2 variant.

Fields:
- name: lookup key (e.g. :convnextv2_atto_fcmae).
- depths: per-stage block count, (d1, d2, d3, d4).
- dims: per-stage channel widths, (c1, c2, c3, c4). c1 is also the stem output channels. c4 is num_features.
- hf_repo: HuggingFace repo containing model.safetensors.
- default_num_classes: head dimension the released weights ship with. 0 for the bare .fcmae encoders, 1000 for the ImageNet-1K and ImageNet-22k-then-1K fine-tunes.
- default_input_size: native training resolution (224, 384, or 512) for the released checkpoint. Informational only: the model is fully convolutional and accepts any size, so this is not enforced.
Luximm.Models.CONVNEXTV2_VARIANTS — Constant
CONVNEXTV2_VARIANTS :: Dict{Symbol, ConvNeXtV2Variant}

Lookup table for the ConvNeXtV2 variants this package ports. The .fcmae rows are the bare encoders; all other rows ship a 1000-class ImageNet head. convnextv2_small is not included because timm only registers it as .untrained (no pretrained weights).
Every ConvNeXtV2 checkpoint is released by Meta under Creative Commons Attribution-NonCommercial 4.0. Commercial use of these weights is not permitted. This applies to every row in the variant table below and is independent of Luximm.jl's own Apache 2.0 code license. If commercial use matters, BiT (Apache 2.0) or the ConvNeXt v1 .fb_* checkpoints (Apache 2.0) are the alternatives.