OpenCLIP provides tools to easily upload your trained models to the Hugging Face Hub. This makes your models discoverable, shareable, and easy to load for others using the OpenCLIP library.

Overview

The push_to_hf_hub module provides:
  • Command-line tool for uploading models
  • Python API for programmatic uploads
  • Automatic configuration file generation
  • Model card creation
  • Support for safetensors format

Installation

Ensure you have the required dependencies:
pip install huggingface_hub safetensors
Login to Hugging Face:
huggingface-cli login
Or provide a token directly in the command.
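
Before uploading, you can confirm the dependencies are importable. A small stdlib-only sketch (this helper is my own, not part of OpenCLIP):

```python
import importlib.util

def check_deps(pkgs=("huggingface_hub", "safetensors")):
    """Return which of the required packages are importable."""
    return {p: importlib.util.find_spec(p) is not None for p in pkgs}

print(check_deps())
```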

Command-Line Usage

Use the push_to_hf_hub module as a command-line tool:
python -m open_clip.push_to_hf_hub \
    --model MODEL_NAME \
    --pretrained PRETRAINED_PATH_OR_TAG \
    --repo-id YOUR_HF_USERNAME/MODEL_REPO_NAME

Required Parameters

  • --model: Name of the model architecture (e.g., ViT-B-32, ViT-L-14)
  • --pretrained: Path to checkpoint file or pretrained tag
  • --repo-id: Hugging Face Hub repository ID (format: username/repo-name)
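
The repo ID must follow the username/repo-name shape. A quick sanity check you might run before invoking the tool (the regex is my own loose approximation, not the Hub's exact validation rules):

```python
import re

def looks_like_repo_id(repo_id: str) -> bool:
    """Loose sanity check for the Hub's 'username/repo-name' shape."""
    return re.fullmatch(r"[A-Za-z0-9][\w.-]*/[\w.-]+", repo_id) is not None

print(looks_like_repo_id("myusername/my-clip-model"))  # True
print(looks_like_repo_id("missing-namespace"))         # False
```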

Optional Parameters

  • --precision: Model precision (fp32, fp16, bf16) - default: fp32
  • --image-mean: Override image mean values for preprocessing
  • --image-std: Override image std values for preprocessing
  • --image-interpolation: Image resize interpolation method (bicubic, bilinear)
  • --image-resize-mode: Image resize mode (shortest, longest, squash)
  • --hf-tokenizer-self: Make tokenizer config point to the uploaded model itself

Examples

Example 1: Upload Trained Model

Upload a model you trained locally:
python -m open_clip.push_to_hf_hub \
    --model ViT-B-32 \
    --pretrained /path/to/checkpoints/epoch_32.pt \
    --repo-id myusername/my-clip-model

Example 2: Upload with Custom Preprocessing

Upload a model with custom preprocessing parameters:
python -m open_clip.push_to_hf_hub \
    --model ViT-L-14 \
    --pretrained /path/to/checkpoint.pt \
    --repo-id myusername/vitl14-custom \
    --image-mean 0.5 0.5 0.5 \
    --image-std 0.5 0.5 0.5 \
    --image-interpolation bicubic

Example 3: Re-upload Existing Model

Re-upload an existing OpenCLIP model to your Hub:
python -m open_clip.push_to_hf_hub \
    --model convnext_large_d_320 \
    --pretrained laion2b_s29b_b131k_ft \
    --repo-id myusername/CLIP-convnext_large_d_320
This example, taken from the OpenCLIP README, re-uploads a ConvNeXt model trained on LAION-2B.

Example 4: Upload with Self-Referencing Tokenizer

Upload a model with custom tokenizer that references itself:
python -m open_clip.push_to_hf_hub \
    --model roberta-ViT-B-32 \
    --pretrained /path/to/checkpoint.pt \
    --repo-id myusername/roberta-clip \
    --hf-tokenizer-self
The --hf-tokenizer-self flag makes the tokenizer configuration point to the uploaded model repository instead of the original tokenizer source.

Python API

You can also upload models programmatically:

Basic Upload

from open_clip.push_to_hf_hub import push_pretrained_to_hf_hub

push_pretrained_to_hf_hub(
    model_name='ViT-B-32',
    pretrained='/path/to/checkpoint.pt',
    repo_id='myusername/my-clip-model',
    commit_message='Upload trained CLIP model',
)

Upload with Custom Configuration

from open_clip.push_to_hf_hub import push_pretrained_to_hf_hub

push_pretrained_to_hf_hub(
    model_name='ViT-L-14',
    pretrained='/path/to/epoch_32.pt',
    repo_id='myusername/vitl14-laion400m',
    precision='fp16',
    image_mean=(0.48145466, 0.4578275, 0.40821073),
    image_std=(0.26862954, 0.26130258, 0.27577711),
    image_interpolation='bicubic',
    commit_message='Add ViT-L/14 trained on LAION-400M',
    private=False,
)

Upload with Model Card

from open_clip.push_to_hf_hub import push_pretrained_to_hf_hub

model_card = {
    'description': 'CLIP ViT-B/32 trained on CC12M dataset',
    'details': {
        'Dataset': 'CC12M',
        'Architecture': 'ViT-B/32',
        'Training samples': '12M',
        'Epochs': '32',
    },
    'usage': """
## Usage

```python
import open_clip
import torch
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:myusername/my-clip-model'
)
model.eval()

image = preprocess(Image.open('image.jpg')).unsqueeze(0)
tokenizer = open_clip.get_tokenizer('hf-hub:myusername/my-clip-model')
text = tokenizer(['a photo of a cat', 'a photo of a dog'])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # ... rest of inference code
""", ‘license’: ‘mit’, } push_pretrained_to_hf_hub( model_name=‘ViT-B-32’, pretrained=‘/path/to/checkpoint.pt’, repo_id=‘myusername/my-clip-model’, model_card=model_card, )

Advanced Usage

Save Model Locally First

You can save the model files locally before uploading:

import open_clip
from open_clip.push_to_hf_hub import save_for_hf
from pathlib import Path

# Load model
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-B-32',
    pretrained='/path/to/checkpoint.pt'
)

# Get model config and tokenizer
model_config = open_clip.get_model_config('ViT-B-32')
tokenizer = open_clip.get_tokenizer('ViT-B-32')

# Save to local directory
save_directory = Path('./model_for_hub')
save_for_hf(
    model=model,
    tokenizer=tokenizer,
    model_config=model_config,
    save_directory=save_directory,
    safe_serialization='both',  # Save both .safetensors and .bin
)

print(f'Model saved to {save_directory}')
print('Files:', list(save_directory.glob('*')))

Manual Upload with Custom Files

from huggingface_hub import HfApi
from pathlib import Path

api = HfApi()

# Upload entire directory
api.upload_folder(
    folder_path='./model_for_hub',
    repo_id='myusername/my-clip-model',
    repo_type='model',
    commit_message='Upload CLIP model'
)

What Gets Uploaded

When you push a model to the Hub, the following files are created:

Model Weights

  • open_clip_pytorch_model.bin: PyTorch weights (pickle format)
  • open_clip_model.safetensors: SafeTensors weights (recommended)

Configuration

  • open_clip_config.json: Model architecture and preprocessing configuration
{
  "model_cfg": {
    "embed_dim": 512,
    "vision_cfg": {...},
    "text_cfg": {...}
  },
  "preprocess_cfg": {
    "mean": [0.48145466, 0.4578275, 0.40821073],
    "std": [0.26862954, 0.26130258, 0.27577711],
    "interpolation": "bicubic",
    "resize_mode": "shortest"
  }
}
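
If you want to inspect this file after download, it is plain JSON and can be read with the standard library. A quick sketch (the embedded config here is an abbreviated example, not a complete model config):

```python
import json

config_text = '''{
  "model_cfg": {"embed_dim": 512},
  "preprocess_cfg": {
    "mean": [0.48145466, 0.4578275, 0.40821073],
    "std": [0.26862954, 0.26130258, 0.27577711],
    "interpolation": "bicubic",
    "resize_mode": "shortest"
  }
}'''

cfg = json.loads(config_text)
print(cfg["model_cfg"]["embed_dim"])           # 512
print(cfg["preprocess_cfg"]["interpolation"])  # bicubic
```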

Tokenizer Files

  • tokenizer_config.json: Tokenizer configuration
  • vocab.json, merges.txt: Tokenizer vocabulary (for BPE tokenizers)
  • Other tokenizer-specific files

Model Card

  • README.md: Automatically generated model card with metadata

Loading Uploaded Models

Once uploaded, anyone can load your model:
import open_clip

# Load from Hub
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:myusername/my-clip-model'
)

# Get tokenizer
tokenizer = open_clip.get_tokenizer('hf-hub:myusername/my-clip-model')
See the Loading Models guide for more details.
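
After loading, zero-shot inference boils down to cosine similarity between the encoded image and each text, scaled by the model's logit scale and passed through a softmax. That final step can be sketched framework-free with made-up feature vectors (the values and dimensionality below are illustrative only):

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 4-dim features standing in for encode_image / encode_text outputs
image_feat = normalize([0.2, 0.7, 0.1, 0.4])
text_feats = [
    normalize([0.1, 0.8, 0.0, 0.3]),  # "a photo of a cat"
    normalize([0.9, 0.1, 0.2, 0.0]),  # "a photo of a dog"
]

# CLIP multiplies cosine similarities by a learned logit scale (often ~100)
logits = [100.0 * sum(a * b for a, b in zip(image_feat, t)) for t in text_feats]
probs = softmax(logits)
print(probs)  # the "cat" caption dominates for these toy features
```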

Model Card Customization

Create comprehensive model cards for better documentation:
model_card = {
    'description': 'Detailed description of your model',
    'details': {
        'Model Type': 'Contrastive Vision-Language Model',
        'Architecture': 'ViT-B/32',
        'Dataset': 'Custom dataset description',
        'Training Samples': '10M image-text pairs',
        'Training Duration': '7 days on 8x A100',
        'Preprocessing': 'Standard CLIP preprocessing',
    },
    'usage': 'Code examples...',
    'comparison': 'Performance comparison with other models...',
    'license': 'mit',
    'citation': r"""
@software{my_clip_model,
  title={My CLIP Model},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/myusername/my-clip-model}
}
""",
}
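
OpenCLIP generates the README from these fields; conceptually, a mapping like 'details' becomes a bulleted list in the card. A rough illustration using a helper of my own (the library's actual layout may differ):

```python
def render_details(details):
    """Render a details mapping as markdown bullet points."""
    return "\n".join(f"* **{key}:** {value}" for key, value in details.items())

details = {
    "Model Type": "Contrastive Vision-Language Model",
    "Architecture": "ViT-B/32",
}
print(render_details(details))
```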

Best Practices

  1. Use Descriptive Repo Names
    myusername/CLIP-ViT-B-32-CC12M-32epochs
    myusername/CLIP-convnext-large-LAION400M
    
  2. Include Training Information
    • Dataset name and size
    • Training duration
    • Key hyperparameters
    • Performance metrics
  3. Provide Usage Examples
    • Include code snippets in model card
    • Show both inference and fine-tuning
    • Document any special requirements
  4. Use SafeTensors Format
    safe_serialization='both'  # Upload both formats
    
  5. Version Your Models
    • Use tags or branches for different versions
    • Document changes between versions
  6. Test Before Uploading
    # Test loading locally saved model
    model, _, preprocess = open_clip.create_model_and_transforms(
        'local-dir:./model_for_hub'
    )
    
  7. Add Relevant Tags
    model_card = {
        'tags': ['clip', 'vision', 'text', 'multimodal', 'zero-shot'],
        ...
    }
    

Troubleshooting

Authentication Error

HTTPError: 401 Client Error: Unauthorized
Solution: Login to Hugging Face Hub
huggingface-cli login

Repository Already Exists

HTTPError: 409 Client Error: Conflict
Solution: The repository name is already taken. Choose a different name or use your existing repo.

Large File Upload Issues

OSError: File is too large
Solution: Ensure git-lfs is installed
sudo apt-get install git-lfs
git lfs install

Missing Configuration

RuntimeError: Model config not found
Solution: Ensure the model name is correct and the config exists:
import open_clip
print(open_clip.list_models())  # Check available models

Tokenizer Issues

ValueError: Tokenizer type not recognized
Solution: For custom tokenizers, ensure the tokenizer files are included or use --hf-tokenizer-self.

Example Workflow

Complete workflow from training to Hub upload:
#!/bin/bash

# 1. Train model
python -m open_clip_train.main \
    --model ViT-B-32 \
    --train-data "/data/train.tar" \
    --batch-size 256 \
    --epochs 32 \
    --logs ./logs \
    --name my-clip-training

# 2. Find best checkpoint
ls -lh ./logs/my-clip-training/checkpoints/

# 3. Upload to Hub
python -m open_clip.push_to_hf_hub \
    --model ViT-B-32 \
    --pretrained ./logs/my-clip-training/checkpoints/epoch_32.pt \
    --repo-id myusername/clip-vitb32-custom

# 4. Test loading from Hub
python -c "
import open_clip
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:myusername/clip-vitb32-custom'
)
print('Successfully loaded model from Hub!')
"
