Signature
Parameters
The model instance to load weights into. Must be a CLIP, CustomTextCLIP, or CoCa model.
Path to the checkpoint file. Supported formats:
- PyTorch checkpoint (`.pt`, `.pth`, `.bin`)
- SafeTensors (`.safetensors`)
- NumPy/big_vision format (`.npz`, `.npy`) for SigLIP weights
If True, enforces that the keys in the checkpoint exactly match the model’s state dict. Set to False to allow partial weight loading or when loading weights with different key names.
Use `weights_only=True` for `torch.load` (safer, prevents arbitrary code execution). Only applies to PyTorch checkpoint formats.
Device to load checkpoint tensors onto initially. Usually `'cpu'` to avoid OOM issues during loading.

Returns
Dictionary containing information about incompatible keys:
- `missing_keys`: list of keys present in the model but not in the checkpoint
- `unexpected_keys`: list of keys present in the checkpoint but not in the model

Returns `{}` if loading from NumPy/big_vision format.

Example
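A minimal sketch of the loading behavior using plain PyTorch, illustrating the missing/unexpected keys that are reported when a checkpoint only partially matches a model; the toy modules here are stand-ins, not open_clip models:

```python
import io

import torch
import torch.nn as nn

# Toy stand-in: a checkpoint saved from a model whose key names differ
# from the model we load it into.
src = nn.Linear(4, 4)                            # keys: 'weight', 'bias'
dst = nn.Sequential(nn.Linear(4, 4), nn.ReLU())  # keys: '0.weight', '0.bias'

buf = io.BytesIO()
torch.save(src.state_dict(), buf)
buf.seek(0)

# weights_only=True is the safer torch.load mode (no arbitrary pickle code).
state = torch.load(buf, map_location="cpu", weights_only=True)

# strict=False allows a partial load; the incompatible keys are returned
# rather than raising an error.
result = dst.load_state_dict(state, strict=False)
print(result.missing_keys)     # keys in the model but not in the checkpoint
print(result.unexpected_keys)  # keys in the checkpoint but not in the model
```

With `strict=True`, the same mismatch would raise a `RuntimeError` instead of returning the key lists.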
Checkpoint Format Handling
The function automatically handles various checkpoint formats:

PyTorch Checkpoints
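A hedged sketch of the typical unwrapping step for PyTorch checkpoints: many training scripts save a wrapper dict such as `{'state_dict': ..., 'epoch': ...}`, so a loader commonly peels off the inner state dict when present (the `'state_dict'` key name and this helper are illustrative assumptions, not the exact open_clip code):

```python
def unwrap_checkpoint(ckpt):
    """Return the raw state dict from a possibly wrapped checkpoint.

    Trainer-style checkpoints nest the weights under 'state_dict';
    bare state dicts are returned unchanged.
    """
    if isinstance(ckpt, dict) and "state_dict" in ckpt:
        return ckpt["state_dict"]
    return ckpt

print(unwrap_checkpoint({"state_dict": {"w": 1}, "epoch": 7}))  # {'w': 1}
print(unwrap_checkpoint({"w": 1}))                              # {'w': 1}
```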
Module Prefix Removal
Automatically removes the `'module.'` prefix from keys (common when checkpoints are saved from distributed training):
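The prefix stripping can be sketched as a simple key rewrite (a minimal illustrative version, not the exact open_clip helper):

```python
def strip_module_prefix(state_dict):
    """Drop the 'module.' prefix that torch.nn.DataParallel /
    DistributedDataParallel prepend to every parameter name."""
    if next(iter(state_dict)).startswith("module."):
        return {k[len("module."):]: v for k, v in state_dict.items()}
    return state_dict

sd = {"module.visual.conv1.weight": 0, "module.logit_scale": 1}
print(strip_module_prefix(sd))
# {'visual.conv1.weight': 0, 'logit_scale': 1}
```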
Position Embedding Resizing
Automatically resizes position embeddings if model and checkpoint have different sizes.

State Dict Conversion
Automatically converts state dicts from various sources:
- OpenAI CLIP format
- Hugging Face transformers format
- timm format
- OpenCLIP legacy format
Notes
- For SafeTensors format, the `safetensors` package must be installed: `pip install safetensors`
- For NumPy/big_vision format (SigLIP), weights are loaded directly without returning incompatible keys
- The function handles mismatches in `logit_scale` and `logit_bias` tensor shapes automatically
- Position embeddings for both image and text are automatically resized if needed
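The position-embedding resizing noted above can be sketched as interpolation along the sequence dimension. This is an illustrative version for a text position table, not open_clip's exact implementation (image position embeddings additionally require 2D handling of the patch grid):

```python
import torch
import torch.nn.functional as F

def resize_text_pos_embed(pos: torch.Tensor, new_len: int) -> torch.Tensor:
    """Linearly interpolate a (seq_len, dim) position table to new_len rows."""
    # (seq, dim) -> (1, dim, seq) for 1D interpolation, then back.
    resized = F.interpolate(
        pos.t().unsqueeze(0), size=new_len, mode="linear", align_corners=False
    )
    return resized.squeeze(0).t()

old = torch.randn(77, 512)        # e.g. CLIP's text context length of 77
new = resize_text_pos_embed(old, 64)
print(new.shape)                  # torch.Size([64, 512])
```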
See Also
- `create_model` - Create a model with automatic checkpoint loading
- `create_model_from_pretrained` - High-level function that includes checkpoint loading
