Vertex AI Model Garden is a curated catalog of open-source and proprietary models that you can discover, evaluate, and deploy on Google Cloud. It provides:
- Pre-configured Models: Optimized deployment configurations for popular open models
- One-Click Deployment: Simplified deployment process via the UI, CLI, or SDK
- Hugging Face Integration: Access to over one million models from the Hugging Face Hub
- Version Management: Track and manage different model versions
The Vertex AI Model Garden SDK provides a model-centric interface for deploying open-source models, removing the need to manage container details and infrastructure complexity.
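The examples below assume the SDK has already been initialized. A minimal setup sketch; the project ID and region are placeholders you should replace with your own values:

```python
# Minimal setup sketch; project ID and region are placeholders.
import vertexai
from vertexai import model_garden

vertexai.init(project="your-project-id", location="us-central1")
```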
```python
from vertexai import model_garden

# List Model Garden models
mg_models = model_garden.list_deployable_models(
    model_filter="gemma", list_hf_models=False
)

# Include Hugging Face models
all_models = model_garden.list_deployable_models(
    model_filter="gemma", list_hf_models=True
)

for model_id in all_models:
    print(f"Available: {model_id}")
```
Deploy a Model
```python
# Create a model instance
model = model_garden.OpenModel("google/gemma3@gemma-3-1b-it")

# Deploy to an endpoint
endpoint = model.deploy(accept_eula=True)
print(f"Endpoint: {endpoint.resource_name}")
```
```python
# Search by name
gemma_models = model_garden.list_deployable_models(
    model_filter="gemma", list_hf_models=True
)

# Search for image-generation models
vision_models = model_garden.list_deployable_models(
    model_filter="stable-diffusion", list_hf_models=True
)

# List all available models
all_models = model_garden.list_deployable_models(list_hf_models=True)
print(f"Total models available: {len(all_models)}")
```
Before deploying, review available configurations:
```python
model = model_garden.OpenModel("google/gemma3@gemma-3-1b-it")

# List deployment configurations
deploy_options = model.list_deploy_options(concise=True)
print(deploy_options)
```
Deployment options show the verified machine types, accelerators, and configurations that work best for each model.
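You may want to narrow the listed options programmatically before choosing one. A sketch assuming each option is represented as a dict with `machine_type` and `accelerator_type` keys; this shape is illustrative, not the SDK's exact return type:

```python
# Illustrative helper: narrow deploy options to a given accelerator type.
# The dict shape below is an assumption for demonstration purposes.
def options_for_accelerator(options, accelerator_type):
    return [o for o in options if o.get("accelerator_type") == accelerator_type]

sample_options = [
    {"machine_type": "g2-standard-12", "accelerator_type": "NVIDIA_L4"},
    {"machine_type": "a2-highgpu-1g", "accelerator_type": "NVIDIA_TESLA_A100"},
]
print(options_for_accelerator(sample_options, "NVIDIA_L4"))
```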
```python
from huggingface_hub import interpreter_login

# Log in to Hugging Face interactively
interpreter_login()

# Or provide a token directly at deploy time
model = model_garden.OpenModel("black-forest-labs/FLUX.1-dev")
endpoint = model.deploy(
    hugging_face_access_token="hf_your_token_here"
)
```
```python
try:
    model = model_garden.OpenModel("google/some-model@some-version")
    endpoint = model.deploy()
except Exception as e:
    print(f"Error: {e}")  # Check model name spelling and version
```
```python
try:
    endpoint = model.deploy(
        machine_type="g2-standard-4",
        accelerator_type="NVIDIA_L4",
    )
except Exception as e:
    if "quota" in str(e).lower():
        print("Request quota increase:")
        print("https://console.cloud.google.com/iam-admin/quotas")
```
```python
try:
    model = model_garden.OpenModel("meta/[email protected]")
    endpoint = model.deploy()  # Missing accept_eula=True
except Exception as e:
    if "eula" in str(e).lower():
        # Retry with EULA acceptance
        endpoint = model.deploy(accept_eula=True)
```
If deployment is blocked by organization policies:
```python
# Error: Organization Policy constraint violated
# Contact your administrator to update policies:
# https://cloud.google.com/vertex-ai/generative-ai/docs/control-model-access
```
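The troubleshooting snippets above all follow the same pattern of matching substrings in the error message; that pattern can be factored into a small helper. The substrings checked here are illustrative, not an official error taxonomy:

```python
# Hypothetical helper to triage deploy() failures by error message.
# Substring checks are illustrative; inspect real exception types in practice.
def classify_deploy_error(message: str) -> str:
    msg = message.lower()
    if "quota" in msg:
        return "quota"
    if "eula" in msg:
        return "eula"
    if "organization policy" in msg or "constraint" in msg:
        return "org-policy"
    return "unknown"

print(classify_deploy_error("Organization Policy constraint violated"))
```

Routing on a category string like this keeps the remediation logic (retry with `accept_eula=True`, request a quota increase, contact an administrator) in one place instead of scattered across try/except blocks.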