Signature
Parameters
Model identifier, optionally with a schema prefix:
- 'ViT-B-32': Built-in model name. pretrained specifies the CLIP weights source (required).
- 'hf-hub:org/repo': Loads config/weights from the HuggingFace Hub. pretrained is ignored.
- 'local-dir:/path/to/folder': Loads config/weights from a local directory. pretrained is ignored.
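For illustration, the schema handling described above can be sketched as a small parser. The helper name parse_model_name is hypothetical, not part of the library's API:

```python
def parse_model_name(model_name: str):
    """Split a model identifier into (schema, value).

    Illustrative sketch of the schema prefixes described above.
    """
    for schema in ("hf-hub:", "local-dir:"):
        if model_name.startswith(schema):
            # Schema present: config/weights come from the schema source,
            # and any `pretrained` argument is ignored.
            return schema.rstrip(":"), model_name[len(schema):]
    # No schema: treat as a built-in model name; `pretrained` is required.
    return None, model_name
```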
Source for CLIP weights (tag or file path), used only if model_name has no schema prefix. If None when weights are required, an error is raised. Examples: 'openai', 'laion400m_e32', or a file path.
Model precision. Options: 'fp32', 'fp16', 'bf16', 'pure_fp16', 'pure_bf16'.
Device to load the model on. Can be 'cpu', 'cuda', or a torch.device object.
If True, JIT-compile the model using torch.jit.script.
Force use of QuickGELU activation in model config.
Force use of custom text encoder architecture.
Override image size in model config. Useful for running models at resolutions other than those they were trained at.
Override context length in text config.
Override default image normalization mean values (per channel). Example: (0.48145466, 0.4578275, 0.40821073).
Override default image normalization std values (per channel). Example: (0.26862954, 0.26130258, 0.27577711).
Override default interpolation method for image resizing. Options: 'bicubic', 'bilinear', 'nearest'.
Override resize mode for inference preprocessing. Options:
- 'squash': Resize to exact target dimensions (may distort the aspect ratio).
- 'shortest': Resize the shortest edge to the target size, then center-crop.
- 'longest': Resize the longest edge to the target size, then pad the shorter side to the target.
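The three resize modes can be illustrated with a small size computation for a square target. This is a sketch, not library code; resize_target is a hypothetical helper:

```python
def resize_target(size, target, mode):
    """Compute the intermediate resize size (h, w) for a square target.

    Illustrative sketch of the resize modes described above.
    """
    h, w = size
    if mode == "squash":
        # Ignore aspect ratio; may distort the image.
        return target, target
    # 'shortest' scales so the short edge matches the target;
    # 'longest' scales so the long edge matches the target.
    # Afterwards the image is cropped or padded to the target square.
    scale = target / (min(h, w) if mode == "shortest" else max(h, w))
    return round(h * scale), round(w * scale)
```

For a 480x640 input and target 224, 'shortest' yields an oversized intermediate image that is then cropped, while 'longest' yields an undersized one that is then padded.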
If True, returns a (model, preprocess) tuple. If False, returns only the model.
Cache directory for downloads. Defaults to ~/.cache/clip.
Use weights_only=True for torch.load (safer; prevents arbitrary code execution).
Additional keyword arguments passed to the model constructor (highest override priority).
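The override priority described above can be sketched as a config merge where later sources win and None means "do not override". This is an illustrative sketch under those assumptions, not the library's implementation:

```python
def merge_model_cfg(base_cfg, overrides, model_kwargs):
    """Merge model config sources; model_kwargs has highest priority."""
    cfg = dict(base_cfg)
    # Explicit override arguments apply only when set (None = keep default).
    cfg.update({k: v for k, v in overrides.items() if v is not None})
    # Extra keyword arguments override everything else.
    cfg.update(model_kwargs)
    return cfg
```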
Returns
The created model instance with pretrained weights loaded.
Inference preprocessing transform, only returned if return_transform=True. This is a deterministic transform without augmentation, suitable for validation and inference.
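A CLIP-style inference transform typically resizes, center-crops, rescales pixels to [0, 1], and then normalizes with the mean/std listed above. The normalization step alone, as a pure-Python sketch (not the library's implementation):

```python
# Default CLIP normalization statistics from the parameter docs above.
CLIP_MEAN = (0.48145466, 0.4578275, 0.40821073)
CLIP_STD = (0.26862954, 0.26130258, 0.27577711)

def normalize_pixel(rgb):
    """Normalize one RGB pixel whose values are already scaled to [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, CLIP_MEAN, CLIP_STD))
```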