The Vespa Feed Client is a high-performance Java library for feeding documents to Vespa over HTTP/2. It’s optimized for throughput with automatic throttling, retries, and circuit breaking.

Overview

The Feed Client provides:
  • High throughput: Optimized HTTP/2 multiplexing with automatic concurrency tuning
  • Automatic retries: Configurable retry logic for transient failures
  • Circuit breaking: Fail-fast behavior when the cluster is struggling
  • Async operations: Non-blocking API with CompletableFuture
  • Built-in metrics: Detailed statistics about feed operations
This is the recommended client for production Java applications that need to feed large volumes of data to Vespa.

Installation

Add the Maven dependency:
pom.xml
<dependency>
  <groupId>com.yahoo.vespa</groupId>
  <artifactId>vespa-feed-client</artifactId>
  <version>${vespa.version}</version>
</dependency>
For Gradle:
build.gradle
implementation "com.yahoo.vespa:vespa-feed-client:${vespaVersion}"

Quick Start

Basic Setup

Create a feed client:
import ai.vespa.feed.client.*;
import java.net.URI;
import java.util.concurrent.CompletableFuture;

public class FeedExample {
    public static void main(String[] args) {
        // Create client
        FeedClient client = FeedClientBuilder.create(
            URI.create("https://my-vespa-endpoint.com:8080")
        ).build();
        
        try {
            // Feed a document
            DocumentId docId = DocumentId.of("music", "music", "song-1");
            String json = "{\"fields\":{\"title\":\"Hello World\"}}";
            
            CompletableFuture<Result> promise = client.put(
                docId, 
                json, 
                OperationParameters.empty()
            );
            
            // Wait for result
            Result result = promise.join();
            System.out.println("Document fed: " + result.documentId());
            
        } finally {
            client.close();
        }
    }
}

Multiple Endpoints

Distribute load across multiple endpoints:
List<URI> endpoints = List.of(
    URI.create("https://node1.vespa.example.com:8080"),
    URI.create("https://node2.vespa.example.com:8080"),
    URI.create("https://node3.vespa.example.com:8080")
);

FeedClient client = FeedClientBuilder.create(endpoints)
    .setConnectionsPerEndpoint(4)
    .setMaxStreamPerConnection(128)
    .build();
When providing multiple endpoints, each operation is sent to ONE endpoint (round-robin), not all of them. This is for feeding directly to container nodes without a load balancer.

Document Operations

Put Documents

Insert or replace documents:
DocumentId docId = DocumentId.of("music", "music", "123");

String documentJson = """
{
  "fields": {
    "title": "Bohemian Rhapsody",
    "artist": "Queen",
    "year": 1975
  }
}""";

CompletableFuture<Result> promise = client.put(
    docId,
    documentJson,
    OperationParameters.empty()
);

Update Documents

Partial updates to existing documents:
DocumentId docId = DocumentId.of("music", "music", "123");

String updateJson = """
{
  "fields": {
    "year": { "assign": 1975 },
    "plays": { "increment": 1 }
  }
}""";

CompletableFuture<Result> promise = client.update(
    docId,
    updateJson,
    OperationParameters.empty()
);

Remove Documents

Delete documents:
DocumentId docId = DocumentId.of("music", "music", "123");

CompletableFuture<Result> promise = client.remove(
    docId,
    OperationParameters.empty()
);

Operation Parameters

Customize individual operations:
import ai.vespa.feed.client.OperationParameters;
import java.time.Duration;

// Create parameters
OperationParameters params = OperationParameters.empty()
    .timeout(Duration.ofSeconds(30))
    .route("music/default")        // Custom route
    .tracelevel(5);                 // Enable tracing

// Use in operation
CompletableFuture<Result> promise = client.put(docId, json, params);

Available Parameters

  • timeout(): Operation timeout (default: the client default)
  • route(): Target route (default: "default")
  • tracelevel(): Trace level, 0-9 (default: 0)
  • createIfNonExistent(): Create the document on update if it is missing (default: false)
  • testAndSetCondition(): Conditional operation expression (default: none)
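As an illustration of how these parameters combine, the sketch below builds the JSON payload for a conditional counter increment; it would be paired with testAndSetCondition() and createIfNonExistent() on the operation. The field names and condition string are illustrative, not part of any real schema.

```java
// Sketch: payload for a conditional partial update. The accompanying
// parameters (illustrative values, hypothetical condition string) would be:
//   OperationParameters.empty()
//       .testAndSetCondition("music.year == 1975")
//       .createIfNonExistent(true)
public class ConditionalUpdate {

    // Builds a partial-update payload that increments a counter field.
    public static String incrementPlays(int by) {
        return """
            {
              "fields": {
                "plays": { "increment": %d }
              }
            }""".formatted(by);
    }
}
```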

Handling Results

Synchronous Waiting

Wait for single operations:
try {
    Result result = promise.join();
    System.out.println("Success: " + result.documentId());
} catch (CompletionException e) {
    if (e.getCause() instanceof FeedException) {
        FeedException fe = (FeedException) e.getCause();
        System.err.println("Feed failed: " + fe.getMessage());
    }
}

Batch Waiting

Wait for multiple operations:
import java.util.*;

List<CompletableFuture<Result>> promises = new ArrayList<>();

// Submit many operations
for (int i = 0; i < 10000; i++) {
    DocumentId docId = DocumentId.of("music", "music", String.valueOf(i));
    String json = String.format("{\"fields\":{\"id\":%d}}", i);
    promises.add(client.put(docId, json, OperationParameters.empty()));
}

// Wait for all to complete
try {
    List<Result> results = FeedClient.await(promises);
    System.out.println("All operations completed: " + results.size());
} catch (MultiFeedException e) {
    System.err.println("Some operations failed: " + e.getMessage());
    e.getExceptions().forEach(ex -> {
        System.err.println("  - " + ex.getMessage());
    });
}

Async Processing

Process results asynchronously:
client.put(docId, json, OperationParameters.empty())
    .thenAccept(result -> {
        System.out.println("Success: " + result.documentId());
    })
    .exceptionally(ex -> {
        System.err.println("Failed: " + ex.getMessage());
        return null;
    });

Configuration

Connection Settings

Optimize connection usage:
FeedClient client = FeedClientBuilder.create(endpoint)
    // Number of connections per endpoint
    .setConnectionsPerEndpoint(8)
    
    // HTTP/2 streams per connection
    .setMaxStreamPerConnection(128)
    
    // Connection recycling (0 = disabled)
    .setConnectionTimeToLive(Duration.ofMinutes(30))
    
    .build();
Choose connection settings based on your cluster size:

Small cluster (1-5 nodes)
  • Connections: 4-8
  • Streams: 64-128
Medium cluster (5-20 nodes)
  • Connections: 8-16
  • Streams: 128-256
Large cluster (20+ nodes)
  • Connections: 16-32
  • Streams: 256-512
Total inflight = connections × streams per connection
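The sizing rule above can be sanity-checked with a small helper. This is a standalone sketch; these names are not part of the client API:

```java
public class FeedCapacity {

    // Upper bound on concurrently in-flight operations:
    // total connections (across all endpoints) times HTTP/2 streams per connection.
    public static int maxInflight(int totalConnections, int streamsPerConnection) {
        return totalConnections * streamsPerConnection;
    }
}
```

For example, 3 endpoints with 4 connections each and 128 streams per connection allows up to 1536 operations in flight at once.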

Retry Strategy

Customize retry behavior:
import ai.vespa.feed.client.FeedClient.RetryStrategy;
import ai.vespa.feed.client.FeedClient.OperationType;

class CustomRetryStrategy implements RetryStrategy {
    @Override
    public boolean retry(OperationType type) {
        // Retry all operations except removes
        return type != OperationType.REMOVE;
    }
    
    @Override
    public int retries() {
        // Max 10 retries
        return 10;
    }
}

FeedClient client = FeedClientBuilder.create(endpoint)
    .setRetryStrategy(new CustomRetryStrategy())
    .build();

Circuit Breaker

Implement fail-fast behavior:
import ai.vespa.feed.client.GracePeriodCircuitBreaker;
import java.time.Duration;

// Create circuit breaker
var breaker = new GracePeriodCircuitBreaker(
    Duration.ofSeconds(10),  // Start probing after 10s of failures
    Duration.ofSeconds(20)   // Go OPEN after 20s of failures
);

FeedClient client = FeedClientBuilder.create(endpoint)
    .setCircuitBreaker(breaker)
    .build();

// Check circuit breaker state
CircuitBreaker.State state = client.circuitBreakerState();
if (state == CircuitBreaker.State.OPEN) {
    System.err.println("Circuit breaker is OPEN - cluster is struggling");
}
  • CLOSED: Normal operation, all requests sent
  • HALF_OPEN: Probing with limited traffic to test recovery
  • OPEN: Failing fast, no requests sent to protect cluster
Transitions:
  • CLOSED → HALF_OPEN: After grace period of continuous failures
  • HALF_OPEN → CLOSED: After successful probes
  • HALF_OPEN → OPEN: After doom period if failures persist
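The grace/doom transitions can be modeled as a tiny state machine. The following is a standalone sketch of the rules listed above, not the actual GracePeriodCircuitBreaker implementation:

```java
import java.time.Duration;
import java.time.Instant;

// Minimal model of the grace/doom transitions described above.
// Illustrative only; not the real GracePeriodCircuitBreaker.
public class BreakerModel {
    public enum State { CLOSED, HALF_OPEN, OPEN }

    private final Duration grace;   // continuous failure before HALF_OPEN
    private final Duration doom;    // continuous failure before OPEN
    private Instant failingSince;   // null when healthy

    public BreakerModel(Duration grace, Duration doom) {
        this.grace = grace;
        this.doom = doom;
    }

    public void success() { failingSince = null; }          // recovery closes the breaker

    public void failure(Instant now) {
        if (failingSince == null) failingSince = now;        // start of a failure run
    }

    public State state(Instant now) {
        if (failingSince == null) return State.CLOSED;
        Duration failing = Duration.between(failingSince, now);
        if (failing.compareTo(doom) >= 0) return State.OPEN;       // doom period exceeded
        if (failing.compareTo(grace) >= 0) return State.HALF_OPEN; // grace period exceeded
        return State.CLOSED;
    }
}
```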

SSL/TLS Configuration

Configure mutual TLS:
import java.nio.file.Path;

FeedClient client = FeedClientBuilder.create(endpoint)
    .setCertificate(
        Path.of("/path/to/cert.pem"),
        Path.of("/path/to/key.pem")
    )
    .setCaCertificatesFile(Path.of("/path/to/ca.pem"))
    .build();

Compression

Configure request compression:
import ai.vespa.feed.client.FeedClientBuilder.Compression;

FeedClient client = FeedClientBuilder.create(endpoint)
    .setCompression(Compression.auto)  // auto, gzip, or none
    .build();

Monitoring and Metrics

Feed Statistics

Get real-time statistics:
OperationStats stats = client.stats();

System.out.println("Requests sent: " + stats.requests());
System.out.println("Responses: " + stats.responses());
System.out.println("Errors: " + stats.exceptions());
System.out.println("Avg latency: " + stats.averageLatencyMillis() + "ms");
System.out.println("Min latency: " + stats.minLatencyMillis() + "ms");
System.out.println("Max latency: " + stats.maxLatencyMillis() + "ms");

// Response code distribution
stats.responsesByCode().forEach((code, count) -> {
    System.out.println("  HTTP " + code + ": " + count);
});

Reset Statistics

Filter out warmup operations:
// Warmup phase
for (int i = 0; i < 100; i++) {
    client.put(docId, json, OperationParameters.empty()).join();
}

// Reset before real load
client.resetStats();

// Now run actual benchmark
for (int i = 0; i < 100000; i++) {
    client.put(docId, json, OperationParameters.empty());
}

OperationStats finalStats = client.stats();

Complete Example

Production-ready feed client:
import ai.vespa.feed.client.*;
import java.net.URI;
import java.nio.file.*;
import java.time.Duration;
import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;

public class ProductionFeeder {
    
    private final FeedClient client;
    
    public ProductionFeeder(URI endpoint, Path certFile, Path keyFile) {
        // Configure circuit breaker
        var breaker = new GracePeriodCircuitBreaker(
            Duration.ofSeconds(10),
            Duration.ofSeconds(30)
        );
        
        // Create client with production settings
        this.client = FeedClientBuilder.create(endpoint)
            .setConnectionsPerEndpoint(16)
            .setMaxStreamPerConnection(256)
            .setConnectionTimeToLive(Duration.ofMinutes(30))
            .setCertificate(certFile, keyFile)
            .setCircuitBreaker(breaker)
            .setCompression(FeedClientBuilder.Compression.auto)
            .build();
    }
    
    public void feedDocuments(Stream<Document> documents) {
        List<CompletableFuture<Result>> promises = new ArrayList<>();
        long startTime = System.currentTimeMillis();
        
        documents.forEach(doc -> {
            DocumentId docId = DocumentId.of(
                doc.namespace(),
                doc.documentType(),
                doc.id()
            );
            
            CompletableFuture<Result> promise = client.put(
                docId,
                doc.toJson(),
                OperationParameters.empty()
            );
            
            promises.add(promise);
            
            // Print progress every 10000 operations
            if (promises.size() % 10000 == 0) {
                printProgress(promises.size(), startTime);
            }
        });
        
        // Wait for completion
        try {
            List<Result> results = FeedClient.await(promises);
            printSummary(results.size(), startTime);
        } catch (MultiFeedException e) {
            System.err.println("Feed completed with errors: " + e.getMessage());
            System.err.println("Failed operations: " + e.getExceptions().size());
            throw e;
        }
    }
    
    private void printProgress(int count, long startTime) {
        long elapsed = System.currentTimeMillis() - startTime;
        double rate = count / (elapsed / 1000.0);
        System.out.printf("Progress: %d documents (%.1f docs/sec)%n", 
                         count, rate);
    }
    
    private void printSummary(int count, long startTime) {
        long elapsed = System.currentTimeMillis() - startTime;
        double rate = count / (elapsed / 1000.0);
        
        OperationStats stats = client.stats();
        
        System.out.println("\n=== Feed Summary ===");
        System.out.printf("Total documents: %d%n", count);
        System.out.printf("Time elapsed: %.2f seconds%n", elapsed / 1000.0);
        System.out.printf("Throughput: %.1f docs/sec%n", rate);
        System.out.printf("Avg latency: %d ms%n", stats.averageLatencyMillis());
        System.out.printf("Min latency: %d ms%n", stats.minLatencyMillis());
        System.out.printf("Max latency: %d ms%n", stats.maxLatencyMillis());
        System.out.println("\nResponse codes:");
        stats.responsesByCode().forEach((code, n) ->   // 'n' avoids shadowing the 'count' parameter
            System.out.printf("  %d: %d%n", code, n));
    }
    
    public void close() {
        client.close();
    }
    
    // Minimal document record; the toJson component holds the serialized JSON
    // payload, read via the generated accessor doc.toJson()
    record Document(String namespace, String documentType, String id, String toJson) {}
}

Error Handling

Exception Types

FeedException

Base exception for all feed errors. Contains HTTP response details.
try {
    Result result = promise.join();
} catch (CompletionException e) {
    if (e.getCause() instanceof FeedException fe) {
        System.err.println("HTTP " + fe.code() + ": " + fe.getMessage());
    }
}
MultiFeedException

Thrown by FeedClient.await() when some operations fail.
try {
    FeedClient.await(promises);
} catch (MultiFeedException e) {
    e.getExceptions().forEach(ex -> {
        System.err.println("Failed: " + ex.getMessage());
    });
}

Retry Logic

The client automatically retries:
  • 503 Service Unavailable: Cluster overload
  • Network errors: Connection failures, timeouts
  • Idempotent operations: All operations by default
Not retried:
  • 400 Bad Request: Invalid document format
  • 404 Not Found: Document type doesn’t exist
  • 412 Precondition Failed: Test-and-set condition failed
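The retry rules above boil down to a status-code predicate roughly like the following. This is an illustrative sketch, not the client's internal logic; treating other 5xx codes as transient is an assumption:

```java
public class RetryRules {

    // Whether a response status is worth retrying, per the lists above.
    public static boolean isRetryable(int statusCode) {
        switch (statusCode) {
            case 503: return true;    // overload: back off and retry
            case 400:                 // malformed document
            case 404:                 // unknown document type
            case 412: return false;   // test-and-set condition failed
            default:  return statusCode >= 500;  // assumption: other 5xx treated as transient
        }
    }
}
```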

Performance Tuning

1. Start with defaults

The client auto-tunes concurrency. Start with default settings and measure throughput.

2. Increase connections

If CPU usage is low, increase connections:
.setConnectionsPerEndpoint(16)
.setMaxStreamPerConnection(256)

3. Monitor the circuit breaker

If the circuit breaker opens frequently, your cluster is overloaded. Scale up or reduce the feed rate.

4. Optimize document size

Smaller documents mean higher throughput. Consider:
  • Removing unnecessary fields
  • Compressing large text fields
  • Using references for shared data

Best Practices

Always close the client. Use try-with-resources or ensure close() is called:
try (FeedClient client = FeedClientBuilder.create(endpoint).build()) {
    // Feed operations
}
Not closing the client may leak resources and prevent graceful shutdown.
Reuse the client. Create one FeedClient per application, not per request. The client is thread-safe and maintains connection pools.
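One way to enforce the one-client-per-application rule is a lazily initialized, thread-safe holder. The sketch below is generic (in practice the held resource would be the FeedClient built once at startup); SharedResource is an illustrative name, not part of the library:

```java
import java.util.function.Supplier;

// Lazy, thread-safe single-instance holder using double-checked locking.
// Illustrative sketch; the factory would build the FeedClient exactly once.
public class SharedResource<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    public SharedResource(Supplier<T> factory) { this.factory = factory; }

    public T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) instance = local = factory.get();
            }
        }
        return local;
    }
}
```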

Source Reference

Key implementation files:
  • FeedClient interface: vespa-feed-client-api/src/main/java/ai/vespa/feed/client/FeedClient.java:22
  • FeedClientBuilder: vespa-feed-client-api/src/main/java/ai/vespa/feed/client/FeedClientBuilder.java:21
  • HTTP implementation: vespa-feed-client/src/main/java/ai/vespa/feed/client/impl/HttpFeedClient.java

Next Steps

Document API

Document JSON format reference

Vespa CLI

Command-line feeding tool

Performance Tuning

Feed performance guide

Monitoring

Monitoring and metrics
