## Endpoint

`POST /v1/embeddings`
## Prerequisites

h2oGPT must be started with LangChain enabled (any `--langchain_mode` other than `Disabled`) and with an embedding model pre-loaded.
The `model` field in the request is accepted for API compatibility but is currently ignored. h2oGPT always uses the single embedding model it was started with.

## Request parameters
| Parameter | Description |
|---|---|
| `input` | Text string or array of text strings to embed. Each string is embedded independently; arrays return one embedding object per input element. |
| `model` | Accepted for compatibility but unused. The server uses whichever embedding model was loaded at startup. |
| `encoding_format` | Output encoding: `"float"` returns a list of floating-point numbers; `"base64"` returns a base64-encoded string. |
| `user` | Optional user identifier. |
## Response

| Field | Description |
|---|---|
| `object` | Always `"list"`. |
| `data` | Array of embedding objects, one per input string. |
| `model` | The model identifier used. |
| `usage` | Token usage with `prompt_tokens` and `total_tokens`. |

## Examples
### Single string
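A minimal request built with Python's standard library. The host and port are assumptions (adjust to your h2oGPT deployment); the actual network call is left commented out so the snippet stands alone:

```python
import json
import urllib.request

# Assumed local endpoint; change host/port to match your server.
URL = "http://localhost:5000/v1/embeddings"

payload = {
    "input": "What is h2oGPT?",
    "model": "any-value",  # accepted but ignored by the server
    "encoding_format": "float",
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send against a running server:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
#     vector = result["data"][0]["embedding"]
```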
### Batch of strings
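An array input returns one embedding object per element, matched to the inputs via each object's `index` field. The sketch below builds a batch payload and reads results back in input order; the response shown is illustrative (vectors truncated), not real server output:

```python
payload = {
    "input": ["first document", "second document"],
    "encoding_format": "float",
}

# Illustrative response shape only -- not actual model output.
response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "hkunlp/instructor-large",
    "usage": {"prompt_tokens": 4, "total_tokens": 4},
}

# Sort by index so vectors line up with the original input order.
vectors = [d["embedding"] for d in sorted(response["data"], key=lambda d: d["index"])]
```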
## Supported embedding models

The embedding model is configured at server startup with `--hf_embedding_model=<model-name>`. Common choices include:

| Model | Notes |
|---|---|
| `hkunlp/instructor-large` | Default HuggingFace embedding model |
| `sentence-transformers/all-MiniLM-L6-v2` | Fast, lightweight |
| `sentence-transformers/all-mpnet-base-v2` | Higher quality |
| OpenAI embeddings | Set `--use_openai_embedding=True` and provide `OPENAI_API_KEY` |