Vespa provides a comprehensive set of HTTP APIs for interacting with your application. These APIs enable document management, search queries, and application deployment operations.

Available APIs

Vespa exposes the following main HTTP APIs:

  • Document v1 API: CRUD operations for documents
  • Search API: Query and retrieve documents
  • Deploy API: Deploy and manage applications

Base URL Structure

All Vespa HTTP APIs follow a consistent URL structure:
http://<host>:<port>/<api-path>
  • Document API: http://localhost:8080/document/v1/
  • Search API: http://localhost:8080/search/
  • Deploy API: http://localhost:19071/application/v2/
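The URL structure above can be sketched as a pair of small helpers. This is an illustrative sketch, not an official client: the function names are hypothetical, and the hosts and ports simply mirror the localhost examples above.

```python
from urllib.parse import quote, urlencode

# Base addresses copied from the examples above; adjust for your deployment.
CONTAINER = "http://localhost:8080"
DEPLOY_BASE = "http://localhost:19071/application/v2/"


def document_url(namespace: str, doctype: str, docid: str) -> str:
    """Build a Document v1 URL for one document (hypothetical helper)."""
    ns, dt, did = (quote(part, safe="") for part in (namespace, doctype, docid))
    return f"{CONTAINER}/document/v1/{ns}/{dt}/docid/{did}"


def search_url(yql: str, **params: str) -> str:
    """Build a Search API URL with the YQL statement percent-encoded."""
    return f"{CONTAINER}/search/?{urlencode({'yql': yql, **params})}"


doc = document_url("mynamespace", "music", "1")
query = search_url("select * from music where true")
```

Percent-encoding each path segment and the query string keeps IDs and YQL statements with spaces or reserved characters from corrupting the URL.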

Authentication

Vespa supports multiple authentication mechanisms depending on your deployment:

Vespa Cloud

For Vespa Cloud deployments, all API requests require authentication using:
  • mTLS (Mutual TLS): Certificate-based authentication for production environments
  • API Keys: Token-based authentication for development and CI/CD
curl -H "Authorization: Bearer <api-key>" \
  https://<endpoint>.vespa-app.cloud/search/?yql=...
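The same two mechanisms can be sketched in Python with only the standard library. The endpoint, token, and file paths below are placeholders, and the helper names are hypothetical; the request object is built but not sent.

```python
import ssl
import urllib.request


def authorized_request(url: str, token: str) -> urllib.request.Request:
    """Return a GET request carrying a bearer token; nothing is sent yet."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def mtls_context(cert_file: str, key_file: str) -> ssl.SSLContext:
    """Return a TLS context that presents a client certificate (mTLS)."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# Placeholder endpoint and token, for illustration only.
req = authorized_request("https://example.invalid/search/", "my-token")
# To send: urllib.request.urlopen(req)
# With mTLS: urllib.request.urlopen(req, context=mtls_context("cert.pem", "key.pem"))
```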

Self-Hosted Vespa

For self-hosted deployments:
  • Authentication is optional and is configured per container
  • Custom authentication filters can be integrated
  • Token-based authentication is possible through custom handlers

Request Format

Content Types

Vespa APIs support the following content types:
  • JSON (default): application/json
  • CBOR: application/cbor (a compact binary encoding of the JSON data model, typically smaller and faster to parse)

Common Headers

  • Content-Type (string, default: "application/json"): The MIME type of the request body
  • Accept (string, default: "application/json"): The desired response format. Supports application/json or application/cbor
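As a minimal sketch of setting these headers explicitly, the helper below builds a JSON POST request with the standard library. The helper name is hypothetical, and the `fields` wrapper in the sample body is an assumption about the document format rather than something defined on this page.

```python
import json
import urllib.request


def json_post(url: str, body: dict, accept: str = "application/json") -> urllib.request.Request:
    """Build a POST request with explicit Content-Type and Accept headers."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": accept},
        method="POST",
    )


# Sample body: the "fields" wrapper is an assumed document shape.
req = json_post(
    "http://localhost:8080/document/v1/mynamespace/music/docid/1",
    {"fields": {"title": "Example"}},
)
# To send: urllib.request.urlopen(req)
```

Passing `accept="application/cbor"` would request a CBOR response instead; decoding it needs a CBOR library, which is outside this sketch.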

Response Format

Document v1 API responses follow a consistent JSON structure:
{
  "pathId": "/document/v1/namespace/doctype/docid/1",
  "id": "id:namespace:doctype::1",
  "message": "Success"
}

Error Responses

Error responses carry the request path and a message describing the failure:
{
  "pathId": "/document/v1/namespace/doctype/docid/1",
  "message": "Document not found"
}
  • pathId (string): The request path that generated the response
  • message (string): Human-readable message describing the result or error
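A client might map these fields onto a small typed structure. The sketch below assumes only the fields shown in the examples above (`pathId`, `id`, `message`); the class and function names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentResponse:
    path_id: str
    doc_id: Optional[str] = None   # absent in the error example above
    message: Optional[str] = None


def parse_document_response(raw: str) -> DocumentResponse:
    """Parse a response body, tolerating missing optional fields."""
    payload = json.loads(raw)
    return DocumentResponse(
        path_id=payload["pathId"],
        doc_id=payload.get("id"),
        message=payload.get("message"),
    )


error = parse_document_response(
    '{"pathId": "/document/v1/namespace/doctype/docid/1",'
    ' "message": "Document not found"}'
)
```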

HTTP Status Codes

Vespa uses standard HTTP status codes:
Status Code                  Description
200 OK                       Request succeeded
201 Created                  Document created successfully
400 Bad Request              Invalid request format or parameters
404 Not Found                Document or resource not found
412 Precondition Failed      Test-and-set condition failed
429 Too Many Requests        Rate limit exceeded or system overload
500 Internal Server Error    Server-side error
504 Gateway Timeout          Request timeout exceeded
507 Insufficient Storage     No storage space available
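One way a client could act on these codes is to sort them into a few outcomes. The grouping below is an assumption made for illustration (in particular, treating 429 and 504 as worth retrying); it is not a policy prescribed by Vespa.

```python
from http import HTTPStatus


def classify(status: int) -> str:
    """Map an HTTP status code to a coarse client-side outcome."""
    if status in (HTTPStatus.OK, HTTPStatus.CREATED):
        return "success"
    # Assumption: overload and timeout are transient, so retrying may help.
    if status in (HTTPStatus.TOO_MANY_REQUESTS, HTTPStatus.GATEWAY_TIMEOUT):
        return "retry"
    if 400 <= status < 500:
        return "client_error"   # fix the request before resending
    return "server_error"
```

For example, `classify(412)` yields `"client_error"`, signalling that the test-and-set condition must change before the request can succeed.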

Timeouts

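All APIs accept a timeout request parameter:
  • timeout (string, default: "180s"): Request timeout, given as a number of seconds or with a unit suffix (e.g., 5s, 1000ms). The 180s default applies to the Document v1 API; the Search API applies its own, much shorter default.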
curl "http://localhost:8080/document/v1/mynamespace/music/docid/1?timeout=30s"
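When a client also enforces its own socket timeout, it helps to derive it from the same value sent to the server. The sketch below converts the two forms mentioned above (plain seconds, or an `s`/`ms` suffix) to seconds; the function name is hypothetical, and other unit suffixes are deliberately not handled.

```python
import re

# Accepts "30", "5s", "1000ms", or "0.5s"; anything else is rejected.
_TIMEOUT = re.compile(r"^(\d+(?:\.\d+)?)\s*(ms|s)?$")


def timeout_seconds(value: str) -> float:
    """Convert a timeout string to seconds."""
    match = _TIMEOUT.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported timeout value: {value!r}")
    number = float(match.group(1))
    return number / 1000 if match.group(2) == "ms" else number
```

A client could then use something like `timeout_seconds("30s") + 1` as its socket timeout, so the server-side timeout fires first and returns a proper error response.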

Tracing

Enable request tracing for debugging:
  • tracelevel (integer, default: 0): Trace level from 0 (off) to 9 (maximum detail)
curl "http://localhost:8080/search/?yql=...&tracelevel=5"
The trace information is included in the response:
{
  "trace": {
    "children": [
      {
        "message": "Invoking chain 'default'"
      }
    ]
  }
}
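Because the trace is a tree of nested `children`, a short recursive walk is enough to list its messages. This sketch assumes only the shape shown in the sample above; real traces may contain additional fields, which it ignores.

```python
from typing import Iterator


def trace_messages(node: dict) -> Iterator[str]:
    """Yield every "message" in a trace tree, depth first."""
    if "message" in node:
        yield node["message"]
    for child in node.get("children", []):
        yield from trace_messages(child)


response = {
    "trace": {
        "children": [
            {"message": "Invoking chain 'default'"}
        ]
    }
}
messages = list(trace_messages(response["trace"]))
```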

Rate Limiting

Vespa implements automatic rate limiting to protect system stability:
  • Document API: Queue-based throttling with configurable limits
  • Search API: Thread pool saturation monitoring
  • Deploy API: Per-tenant rate limiting
When rate limits are exceeded, the API returns HTTP 429 (Too Many Requests).
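A client can respond to HTTP 429 by backing off before retrying. The following is a minimal sketch of exponential backoff: `send` is a hypothetical callable that performs one request and returns its status code, and the attempt count and delays are arbitrary starting points rather than recommended values.

```python
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns something other than 429 or attempts run out."""
    status = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Delay doubles on each retry: base, 2*base, 4*base, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

The `sleep` parameter is injectable so the loop can be tested without real waiting; production callers can leave it at the default.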

Performance Considerations

Best Practices

  1. Batch Operations: Use visitor operations for bulk reads instead of individual GET requests
  2. Connection Reuse: Use HTTP keep-alive and connection pooling
  3. CBOR Format: Use CBOR for better performance with large responses
  4. Timeout Configuration: Set appropriate timeouts based on operation complexity
  5. Compression: Enable HTTP compression for large payloads
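As an illustration of the compression item, the sketch below gzips a JSON body and labels it with `Content-Encoding`. Whether a given endpoint accepts gzip-encoded request bodies is an assumption to verify against your deployment; the helper name is hypothetical.

```python
import gzip
import json
import urllib.request


def gzip_json_request(url: str, body: dict) -> urllib.request.Request:
    """Build a POST request whose JSON body is gzip-compressed."""
    raw = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=gzip.compress(raw),
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",   # describes the request body
            "Accept-Encoding": "gzip",    # asks for a compressed response
        },
        method="POST",
    )


payload = {"fields": {"text": "lorem ipsum " * 200}}  # sample data only
req = gzip_json_request("http://localhost:8080/document/v1/mynamespace/music/docid/1", payload)
```

Compression pays off mainly for large, repetitive payloads; for small bodies the gzip overhead can outweigh the savings.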

Example: Connection Pooling

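import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry  # import Retry from urllib3 directly

# A single Session reuses TCP connections across requests (HTTP keep-alive).
session = requests.Session()
# Retry failed requests up to 3 times with exponential backoff.
retries = Retry(total=3, backoff_factor=0.1)
adapter = HTTPAdapter(max_retries=retries, pool_connections=10, pool_maxsize=20)
# Mount on both schemes so HTTPS endpoints share the same pool settings.
session.mount('http://', adapter)
session.mount('https://', adapter)

response = session.get('http://localhost:8080/search/?yql=...')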

Monitoring and Metrics

Vespa exposes API metrics through:
  • Prometheus metrics: Available at /prometheus/v1/values
  • JSON metrics: Available at /state/v1/metrics
Key metrics to monitor:
  • http.status.2xx: Successful requests
  • http.status.4xx: Client errors
  • http.status.5xx: Server errors
  • http.request.latency: Request latency percentiles
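Once the status counters have been collected, error ratios are a simple derived signal. The sketch below assumes the counts were already extracted into a plain mapping keyed by the metric names above; it does not parse the metrics endpoints themselves, and the function name is hypothetical.

```python
from typing import Dict, Mapping


def error_rates(counts: Mapping[str, float]) -> Dict[str, float]:
    """Return client and server error ratios from http.status.* counters."""
    ok = counts.get("http.status.2xx", 0)
    client = counts.get("http.status.4xx", 0)
    server = counts.get("http.status.5xx", 0)
    total = ok + client + server
    if total == 0:
        # No traffic in the sample window: report zero rather than divide by zero.
        return {"client": 0.0, "server": 0.0}
    return {"client": client / total, "server": server / total}


rates = error_rates(
    {"http.status.2xx": 90, "http.status.4xx": 8, "http.status.5xx": 2}
)
```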

Next Steps

  • Document v1 API: Learn about document CRUD operations
  • Search API: Explore query capabilities
  • Deploy API: Deploy your applications
  • Query Language: Learn YQL query syntax
