Models

Family-agnostic interface

create_pretrained is the symbol-dispatched entry point for loading released weights. It returns the model and a closure that loads the HuggingFace checkpoint into a (ps, st) pair you produce with Lux.setup:

model, load = create_pretrained(variant)
ps, st = Lux.setup(rng, model)
ps, st = load(ps, st)

The closure captures variant, in_chans, num_classes, and the Hugging Face (revision, cache_dir) and prefix keyword arguments at construction time, so the loader never has to inspect ps to recover settings you already supplied. create_model is the random-init counterpart: it returns the bare @compact model with no weights loaded. See Getting Started for the nested pattern with prefix.
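The capture-at-construction pattern can be illustrated without Luximm at all. The following is a minimal sketch in plain Julia, assuming hypothetical names throughout (toy_create_pretrained, TOY_CHECKPOINT, and a NamedTuple standing in for the model). It is not the package's implementation, only the shape of the (model, load) interface described above.

```julia
# Toy stand-in for the (model, load) pattern; every name here is hypothetical.
# A fake in-memory "checkpoint": released parameter values keyed by variant.
const TOY_CHECKPOINT = Dict(
    :dense => (weight = [1.0 2.0; 3.0 4.0], bias = [0.5, 0.5]),
)

function toy_create_pretrained(variant::Symbol; num_classes = nothing)
    model = (name = variant, num_classes = num_classes)  # placeholder "model"
    # The closure captures `variant`, so the caller only threads (ps, st).
    function load(ps, st)
        released = TOY_CHECKPOINT[variant]
        # Overwrite the fields the checkpoint provides; keep the rest of ps.
        return merge(ps, released), st
    end
    return model, load
end

model, load = toy_create_pretrained(:dense)
ps = (weight = zeros(2, 2), bias = zeros(2), extra = [0.0])  # stands in for Lux.setup output
st = NamedTuple()
ps, st = load(ps, st)
```

Because the variant is fixed when the closure is built, load needs no arguments beyond the (ps, st) pair, which is the property the real interface relies on.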

Luximm.Models.create_pretrained — Function

create_pretrained(variant; in_chans=3, num_classes=nothing,
                  revision="main", cache_dir=hf_hub_cache_dir(),
                  prefix=()) -> (model, load)

Family-agnostic pretrained-weight entry point, mirroring timm.create_model(..., pretrained=True). Returns the model and a closure that loads the released model.safetensors into a (ps, st) pair the caller produced with Lux.setup. The closure captures variant, in_chans, num_classes, and the Hugging Face (revision, cache_dir) and prefix keyword arguments at construction time, so the call load(ps, st) is the only place where (ps, st) has to be passed through.

using Lux, Luximm, Random  # assumes Luximm exports create_pretrained

model, load = create_pretrained(:resnet50_a1_in1k)
ps, st = Lux.setup(Xoshiro(0), model)
ps, st = load(ps, st)

num_classes = nothing (the default) builds the head the released checkpoint ships with, i.e. default_num_classes(variant) classes. Pass an explicit 0 for a features-only model, or any other Int to swap in a custom-width head; in that case the released classifier weights are skipped and a warning is emitted.
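The three cases can be summarized in a small decision function. This is a hedged sketch, not Luximm's code: resolve_head, toy_default_num_classes, and TOY_DEFAULTS are hypothetical names, the defaults are copied from two rows of the variant tables on this page, and treating an explicit Int equal to the default as "load the classifier" is an assumption the text above does not confirm.

```julia
# Hypothetical sketch of the num_classes rule described above.
const TOY_DEFAULTS = Dict(
    :resnet50_a1_in1k => 1000,
    :convnext_base_fb_in22k => 21841,
)

# Stand-in for default_num_classes(variant).
toy_default_num_classes(variant::Symbol) = TOY_DEFAULTS[variant]

# Returns the head width and whether the released classifier would be loaded.
function resolve_head(variant::Symbol, num_classes::Union{Nothing,Int})
    default = toy_default_num_classes(variant)
    width = num_classes === nothing ? default : num_classes
    # Assumption: classifier weights are usable only when the widths match.
    load_classifier = width == default && width > 0
    return (width = width, load_classifier = load_classifier)
end

released = resolve_head(:resnet50_a1_in1k, nothing)  # head shipped with the checkpoint
features = resolve_head(:resnet50_a1_in1k, 0)        # features only, no head
custom   = resolve_head(:resnet50_a1_in1k, 10)       # custom head, classifier skipped
```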

For composition, create the backbone with prefix = (:backbone,) and embed it in an outer @compact; the prefix tells the closure which subtree of the outer (ps, st) to write into:

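# rng and num_outputs are supplied by the caller.
backbone, load_backbone = create_pretrained(:resnet50_a1_in1k;
    num_classes = 0, prefix = (:backbone,))  # features-only backbone
outer = @compact(backbone = backbone,
    head = Dense(2048 => num_outputs)) do x  # 2048 = num_features of this variant
    head(backbone(x))
end
ps, st = Lux.setup(rng, outer)
ps, st = load_backbone(ps, st)  # writes into the backbone subtree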
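What a prefix tuple does to a nested parameter tree can be sketched in plain Julia. The helper below, set_subtree, is hypothetical and is not how Luximm is implemented; it only shows one way a tuple of keys such as (:backbone,) can address a subtree of a nested NamedTuple while leaving siblings such as head untouched.

```julia
# Hypothetical helper: replace the subtree addressed by `prefix` (a tuple of
# keys) inside a nested NamedTuple, leaving sibling subtrees unchanged.
function set_subtree(tree::NamedTuple, prefix::Tuple, new_subtree)
    isempty(prefix) && return new_subtree  # prefix = () targets the root
    key = first(prefix)
    child = set_subtree(getproperty(tree, key), Base.tail(prefix), new_subtree)
    return merge(tree, NamedTuple{(key,)}((child,)))
end

# Stand-in for the parameters of an outer model with `backbone` and `head`.
ps = (backbone = (conv = [0.0],), head = (weight = [0.0],))
loaded = (conv = [1.0],)  # pretend checkpoint contents

ps_new = set_subtree(ps, (:backbone,), loaded)
```

With the default prefix = () the same helper replaces the whole tree, which matches the non-nested usage shown earlier.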

Variant	num_classes	num_features	input size
`:resnet101_a1_in1k`	1000	2048	224
`:resnet152_a1_in1k`	1000	2048	224
`:resnet18_a1_in1k`	1000	512	224
`:resnet34_a1_in1k`	1000	512	224
`:resnet50_a1_in1k`	1000	2048	224

Variant	num_classes	num_features	input size
`:resnetv2_101x1_bit_goog_in21k`	21843	2048	224
`:resnetv2_101x1_bit_goog_in21k_ft_in1k`	1000	2048	224
`:resnetv2_101x3_bit_goog_in21k`	21843	6144	224
`:resnetv2_101x3_bit_goog_in21k_ft_in1k`	1000	6144	224
`:resnetv2_152x2_bit_goog_in21k`	21843	4096	224
`:resnetv2_152x2_bit_goog_in21k_ft_in1k`	1000	4096	224
`:resnetv2_152x2_bit_goog_teacher_in21k_ft_in1k`	1000	4096	224
`:resnetv2_152x2_bit_goog_teacher_in21k_ft_in1k_384`	1000	4096	384
`:resnetv2_152x4_bit_goog_in21k`	21843	8192	224
`:resnetv2_152x4_bit_goog_in21k_ft_in1k`	1000	8192	224
`:resnetv2_50x1_bit_goog_distilled_in1k`	1000	2048	224
`:resnetv2_50x1_bit_goog_in21k`	21843	2048	224
`:resnetv2_50x1_bit_goog_in21k_ft_in1k`	1000	2048	224
`:resnetv2_50x3_bit_goog_in21k`	21843	6144	224
`:resnetv2_50x3_bit_goog_in21k_ft_in1k`	1000	6144	224

Variant	num_classes	num_features	input size
`:convnext_base_dinov3_lvd1689m`	0	1024	224
`:convnext_base_fb_in1k`	1000	1024	224
`:convnext_base_fb_in22k`	21841	1024	224
`:convnext_base_fb_in22k_ft_in1k`	1000	1024	224
`:convnext_base_fb_in22k_ft_in1k_384`	1000	1024	384
`:convnext_large_dinov3_lvd1689m`	0	1536	224
`:convnext_large_fb_in1k`	1000	1536	224
`:convnext_large_fb_in22k`	21841	1536	224
`:convnext_large_fb_in22k_ft_in1k`	1000	1536	224
`:convnext_large_fb_in22k_ft_in1k_384`	1000	1536	384
`:convnext_small_dinov3_lvd1689m`	0	768	224
`:convnext_small_fb_in1k`	1000	768	224
`:convnext_small_fb_in22k`	21841	768	224
`:convnext_small_fb_in22k_ft_in1k`	1000	768	224
`:convnext_small_fb_in22k_ft_in1k_384`	1000	768	384
`:convnext_tiny_dinov3_lvd1689m`	0	768	224
`:convnext_tiny_fb_in1k`	1000	768	224
`:convnext_tiny_fb_in22k`	21841	768	224
`:convnext_tiny_fb_in22k_ft_in1k`	1000	768	224
`:convnext_tiny_fb_in22k_ft_in1k_384`	1000	768	384
`:convnext_xlarge_fb_in22k`	21841	2048	224
`:convnext_xlarge_fb_in22k_ft_in1k`	1000	2048	224
`:convnext_xlarge_fb_in22k_ft_in1k_384`	1000	2048	384

Variant	num_classes	num_features	input size
`:convnextv2_atto_fcmae`	0	320	224
`:convnextv2_atto_fcmae_ft_in1k`	1000	320	224
`:convnextv2_base_fcmae`	0	1024	224
`:convnextv2_base_fcmae_ft_in1k`	1000	1024	224
`:convnextv2_base_fcmae_ft_in22k_in1k`	1000	1024	224
`:convnextv2_base_fcmae_ft_in22k_in1k_384`	1000	1024	384
`:convnextv2_femto_fcmae`	0	384	224
`:convnextv2_femto_fcmae_ft_in1k`	1000	384	224
`:convnextv2_huge_fcmae`	0	2816	224
`:convnextv2_huge_fcmae_ft_in1k`	1000	2816	224
`:convnextv2_huge_fcmae_ft_in22k_in1k_384`	1000	2816	384
`:convnextv2_huge_fcmae_ft_in22k_in1k_512`	1000	2816	512
`:convnextv2_large_fcmae`	0	1536	224
`:convnextv2_large_fcmae_ft_in1k`	1000	1536	224
`:convnextv2_large_fcmae_ft_in22k_in1k`	1000	1536	224
`:convnextv2_large_fcmae_ft_in22k_in1k_384`	1000	1536	384
`:convnextv2_nano_fcmae`	0	640	224
`:convnextv2_nano_fcmae_ft_in1k`	1000	640	224
`:convnextv2_nano_fcmae_ft_in22k_in1k`	1000	640	224
`:convnextv2_nano_fcmae_ft_in22k_in1k_384`	1000	640	384
`:convnextv2_pico_fcmae`	0	512	224
`:convnextv2_pico_fcmae_ft_in1k`	1000	512	224
`:convnextv2_tiny_fcmae`	0	768	224
`:convnextv2_tiny_fcmae_ft_in1k`	1000	768	224
`:convnextv2_tiny_fcmae_ft_in22k_in1k`	1000	768	224
`:convnextv2_tiny_fcmae_ft_in22k_in1k_384`	1000	768	384

Variant	num_classes	num_features	input size
`:vgg11_bn_tv_in1k`	1000	512	224
`:vgg11_tv_in1k`	1000	512	224
`:vgg13_bn_tv_in1k`	1000	512	224
`:vgg13_tv_in1k`	1000	512	224
`:vgg16_bn_tv_in1k`	1000	512	224
`:vgg16_tv_in1k`	1000	512	224
`:vgg19_bn_tv_in1k`	1000	512	224
`:vgg19_tv_in1k`	1000	512	224

Variant	num_classes	num_features	input size
`:coatnet_0_rw_224_sw_in1k`	1000	768	224
`:coatnet_1_rw_224_sw_in1k`	1000	768	224
`:coatnet_2_rw_224_sw_in12k`	11821	1024	224
`:coatnet_2_rw_224_sw_in12k_ft_in1k`	1000	1024	224
`:coatnet_3_rw_224_sw_in12k`	11821	1536	224