Vespa provides several Java client libraries for different use cases. This page covers the general-purpose Java clients included in the vespaclient-java module.

Overview

The vespaclient-java module provides these command-line tools:
  • vespa-feeder: Feed documents from files
  • vespa-get: Retrieve documents by ID
  • vespa-visit: Visit and export documents
  • vespa-stat: Show content cluster statistics
For high-performance bulk feeding in Java applications, use the Feed Client instead of these utilities.

Installation

Add the dependency to your Maven project:
pom.xml
<dependency>
  <groupId>com.yahoo.vespa</groupId>
  <artifactId>vespaclient-java</artifactId>
  <version>${vespa.version}</version>
</dependency>
For Gradle:
build.gradle
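// Groovy only interpolates ${...} inside double quotes. vespaVersion is assumed to be
// a project property that you define yourself, for example in gradle.properties.
implementation "com.yahoo.vespa:vespaclient-java:${vespaVersion}"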

vespa-feeder

The vespa-feeder tool feeds documents from XML or JSON files to Vespa.

Basic Usage

# Feed from XML file
vespa-feeder documents.xml

# Feed from JSON file
vespa-feeder documents.json

# Feed with custom route
vespa-feeder --route default documents.xml

# Feed with timeout
vespa-feeder --timeout 180 documents.xml
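
When vespa-feeder has to be started from a Java program, for example from a test harness, the invocations above can be assembled with ProcessBuilder. The following is a minimal sketch, not part of vespaclient-java: the class and method names (FeederCommand, build, run) are hypothetical, and it assumes vespa-feeder is on the PATH of the host running it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Vespa API): assembles a vespa-feeder command line
// from the --route and --timeout options shown above.
final class FeederCommand {

    // Returns the argument list for feeding one file with a route and a timeout.
    static List<String> build(String file, String route, int timeoutSeconds) {
        List<String> args = new ArrayList<>();
        args.add("vespa-feeder");
        args.add("--route");
        args.add(route);
        args.add("--timeout");
        args.add(Integer.toString(timeoutSeconds));
        args.add(file);
        return args;
    }

    // Runs the command, forwards its output to this process, and returns the exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        List<String> command = build("documents.json", "default", 180);
        System.out.println(String.join(" ", command));
        // On a host where vespa-feeder is installed, call run(command) to execute it.
    }
}
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.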

XML Format

Documents in XML format:
<?xml version="1.0" encoding="utf-8"?>
<vespafeed>
  <document documenttype="music" documentid="id:music:music::1">
    <title>Bohemian Rhapsody</title>
    <artist>Queen</artist>
    <year>1975</year>
  </document>
  
  <update documenttype="music" documentid="id:music:music::1">
    <assign field="year">1975</assign>
  </update>
  
  <remove documentid="id:music:music::1"/>
</vespafeed>

JSON Format

Documents in JSON format (same as Document API):
[
  {
    "put": "id:music:music::1",
    "fields": {
      "title": "Bohemian Rhapsody",
      "artist": "Queen",
      "year": 1975
    }
  },
  {
    "update": "id:music:music::1",
    "fields": {
      "year": { "assign": 1975 }
    }
  },
  {
    "remove": "id:music:music::1"
  }
]
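
Feed files in this layout can also be generated from code. The sketch below renders put operations using only the JDK; JsonFeedBuilder and its methods are hypothetical names, it handles only string and numeric field values, and a real application would normally use a JSON library instead.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not a Vespa API): renders put operations in the JSON feed layout.
final class JsonFeedBuilder {

    // Wraps a string in double quotes and escapes characters that JSON does not allow raw.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Renders one put operation. Number values are written unquoted, all others as strings.
    static String put(String documentId, Map<String, Object> fields) {
        StringJoiner body = new StringJoiner(", ", "{ ", " }");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            Object v = e.getValue();
            String rendered = (v instanceof Number) ? v.toString() : quote(String.valueOf(v));
            body.add(quote(e.getKey()) + ": " + rendered);
        }
        return "{ \"put\": " + quote(documentId) + ", \"fields\": " + body + " }";
    }

    // Wraps already rendered operations in a JSON array.
    static String feed(List<String> operations) {
        return "[\n  " + String.join(",\n  ", operations) + "\n]";
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("title", "Bohemian Rhapsody");
        fields.put("artist", "Queen");
        fields.put("year", 1975);
        System.out.println(feed(List.of(put("id:music:music::1", fields))));
    }
}
```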

Options

| Option | Description | Default |
| --- | --- | --- |
| --abortondataerror <bool> | Abort on data errors | true |
| --abortonsenderror <bool> | Abort on send errors | true |
| --file <file> | Input file to read | stdin |
| --maxpending <num> | Max pending operations | 1000 |
| --maxpendingsize <bytes> | Max pending operation size | 1MB |
| --route <route> | Route to use | default |
| --timeout <seconds> | Timeout for operations | 180 |
| --trace <level> | Trace level (0-9) | 0 |

vespa-get

Retrieve documents by their document ID.

Basic Usage

# Get a single document
vespa-get id:music:music::1

# Get multiple documents
vespa-get id:music:music::1 id:music:music::2 id:music:music::3

# Get with specific field set
vespa-get --fieldset "music:title,artist" id:music:music::1

# Print only the IDs of the retrieved documents
vespa-get --printids id:music:music::1

Output Format

By default, vespa-get prints each document as JSON:
{
  "id": "id:music:music::1",
  "fields": {
    "title": "Bohemian Rhapsody",
    "artist": "Queen",
    "year": 1975
  }
}

Options

| Option | Description |
| --- | --- |
| --fieldset <fields> | Fields to retrieve |
| --printids | Print document IDs |
| --jsonoutput | Output in JSON format |
| --xmloutput | Output in XML format |
| --cluster <name> | Content cluster name |
| --route <route> | Route to use |
| --timeout <seconds> | Operation timeout |

vespa-visit

Visit (iterate over) documents in a content cluster.

Basic Usage

# Visit all documents
vespa-visit

# Visit specific document type
vespa-visit --datahandler music

# Visit with selection
vespa-visit --selection "music.year > 1980"

# Visit and export to file
vespa-visit --datahandler music > export.json

# Visit specific bucket
vespa-visit --bucketstovisit "0x0000000000000001"

Selection Expressions

Filter documents during visit:
# Year range
vespa-visit --selection "music.year >= 1970 AND music.year < 1980"

# String matching
vespa-visit --selection "music.artist == 'Queen'"

# Numeric comparison
vespa-visit --selection "music.rating > 0"
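
When selection expressions are built from program input, a small helper keeps the syntax consistent. This is a sketch under stated assumptions: Selections and its methods are hypothetical names, and rather than guessing at escaping rules it rejects string values that contain quotes or backslashes.

```java
// Hypothetical helper (not a Vespa API): builds the selection strings shown above.
final class Selections {

    // Half-open numeric range: field >= fromInclusive AND field < toExclusive.
    static String range(String field, long fromInclusive, long toExclusive) {
        return field + " >= " + fromInclusive + " AND " + field + " < " + toExclusive;
    }

    // Exact string match. Values with quotes or backslashes are rejected instead of escaped.
    static String equalsString(String field, String value) {
        if (value.contains("'") || value.contains("\"") || value.contains("\\")) {
            throw new IllegalArgumentException("Unsupported character in value: " + value);
        }
        return field + " == '" + value + "'";
    }

    public static void main(String[] args) {
        System.out.println(range("music.year", 1970, 1980));
        System.out.println(equalsString("music.artist", "Queen"));
    }
}
```

The resulting string is passed as a single argument after --selection, for example through ProcessBuilder, so no shell quoting is needed.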

Export and Backup

Use vespa-visit for backups:
# Export all documents
vespa-visit --datahandler music > backup-$(date +%Y%m%d).json

# Export documents newer than a given timestamp
vespa-visit --selection "music.timestamp > 1640000000" > recent.json

# Count documents
vespa-visit --statistics | grep "Documents visited"
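
The dated export can be scripted from Java as well. The sketch below mirrors the shell pattern backup-$(date +%Y%m%d).json and redirects the output of vespa-visit to that file. VisitBackup and its methods are hypothetical names, and the code assumes vespa-visit is on the PATH.

```java
import java.io.File;
import java.io.IOException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;

// Hypothetical helper (not a Vespa API): writes vespa-visit output to a dated file.
final class VisitBackup {

    // Produces names such as backup-20240105.json.
    static String backupFileName(LocalDate date) {
        return "backup-" + date.format(DateTimeFormatter.BASIC_ISO_DATE) + ".json";
    }

    // Starts the visit command with stdout redirected to target and returns its exit code.
    static int export(List<String> visitCommand, File target) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(visitCommand)
            .redirectOutput(target)
            .redirectError(ProcessBuilder.Redirect.INHERIT)
            .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        File target = new File(backupFileName(LocalDate.now()));
        List<String> command = List.of("vespa-visit", "--selection", "music.year > 1980");
        System.out.println(String.join(" ", command) + " > " + target.getName());
        // On a host where vespa-visit is installed, call export(command, target) to run it.
    }
}
```

Checking the exit code before keeping the file helps avoid treating a partial export as a complete backup.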

Options

| Option | Description | Default |
| --- | --- | --- |
| --datahandler <type> | Document type to visit | all |
| --selection <expr> | Document selection | none |
| --from <timestamp> | Visit from timestamp | 0 |
| --to <timestamp> | Visit to timestamp | now |
| --fieldset <fields> | Fields to retrieve | all |
| --cluster <name> | Content cluster | default |
| --maxpending <num> | Max pending operations | 1 |
| --maxbucketstovisit <num> | Max buckets | unlimited |
| --bucketstovisit <list> | Specific buckets | all |
| --statistics | Show statistics | false |

vespa-stat

Show statistics about content clusters.

Basic Usage

# Show cluster statistics
vespa-stat

# Show specific cluster
vespa-stat --cluster music

# Show with document counts
vespa-stat --user music

Output Example

Cluster: music
  Nodes: 4
  Documents: 1000000
  Disk usage: 45.2 GB
  Memory usage: 8.4 GB
  Active documents: 1000000
  Ready documents: 1000000
  Removed documents: 0

Using in Java Applications

Document Operations

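While dedicated feed clients are recommended for production use, you can call the /document/v1 HTTP API directly with Java's built-in java.net.http.HttpClient:
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class VespaClient {
    private final HttpClient client;
    private final String endpoint;

    public VespaClient(String endpoint) {
        this.endpoint = endpoint;
        this.client = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_2)
            .build();
    }

    // docPath is the document's path under /document/v1/, for example
    // "music/music/docid/1" for id:music:music::1 (not the raw ID string).
    public void putDocument(String docPath, String json) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint + "/document/v1/" + docPath))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();

        HttpResponse<String> response = client.send(request,
            HttpResponse.BodyHandlers.ofString());

        // Anything other than 200 is treated as a failure.
        if (response.statusCode() != 200) {
            throw new IOException("Put failed (" + response.statusCode() + "): " + response.body());
        }
    }

    // Returns the JSON response body for the document at docPath.
    public String getDocument(String docPath) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint + "/document/v1/" + docPath))
            .GET()
            .build();

        HttpResponse<String> response = client.send(request,
            HttpResponse.BodyHandlers.ofString());

        // Surface errors such as 404 instead of returning the error body as a document.
        if (response.statusCode() != 200) {
            throw new IOException("Get failed (" + response.statusCode() + "): " + response.body());
        }
        return response.body();
    }
}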
For production applications with high throughput requirements, use the Vespa Feed Client, which provides optimized performance, automatic retries, and better error handling.
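
The /document/v1 API addresses a document by path rather than by its raw ID string: id:music:music::1 corresponds to /document/v1/music/music/docid/1, and IDs carrying an n= or g= modifier use number/ or group/ in place of docid/. The following sketch performs that conversion. DocumentPaths and toV1Path are hypothetical names, and the helper returns only the part that follows /document/v1/.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Vespa API): maps a document ID to its /document/v1 path.
final class DocumentPaths {

    // Converts "id:<namespace>:<type>:<key-values>:<user-specified>" into the path
    // segment that follows /document/v1/.
    static String toV1Path(String documentId) {
        String[] parts = documentId.split(":", 5);
        if (parts.length != 5 || !parts[0].equals("id")) {
            throw new IllegalArgumentException("Not a document ID: " + documentId);
        }
        String prefix = encode(parts[1]) + "/" + encode(parts[2]) + "/";
        String keyValues = parts[3];
        String local = encode(parts[4]);

        if (keyValues.isEmpty()) {
            return prefix + "docid/" + local;
        }
        if (keyValues.startsWith("n=")) {
            return prefix + "number/" + encode(keyValues.substring(2)) + "/" + local;
        }
        if (keyValues.startsWith("g=")) {
            return prefix + "group/" + encode(keyValues.substring(2)) + "/" + local;
        }
        throw new IllegalArgumentException("Unsupported key-values part: " + keyValues);
    }

    // Percent-encodes one path segment. URLEncoder targets form data and writes a
    // space as "+", so that is mapped to "%20" afterwards.
    static String encode(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(toV1Path("id:music:music::1"));
    }
}
```

The returned value can be passed as the document argument of putDocument and getDocument in the VespaClient class above.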

Error Handling

Common Errors

The Vespa endpoint is not reachable. Check:
  • Is Vespa running?
  • Is the correct host/port specified?
  • Are firewall rules blocking access?
To check the endpoint with the Vespa CLI:
vespa status

The document type doesn’t exist in the schema:
  • Verify schema deployment
  • Check document type name spelling
  • Ensure application is deployed
To deploy the application with the Vespa CLI:
vespa deploy

Operations are taking too long:
  • Increase timeout: --timeout 300
  • Check cluster health: vespa-stat
  • Reduce batch size
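
When calling Vespa over HTTP from Java, transient failures such as timeouts can also be retried by the caller. The sketch below shows one common approach, retrying with exponential backoff. Retry and withBackoff are hypothetical names, and the attempt count and delays are placeholders to tune for your setup.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not a Vespa API): retries an operation with exponential backoff.
final class Retry {

    // Calls the operation up to maxAttempts times and doubles the pause after each failure.
    // The last failure is rethrown if no attempt succeeds.
    static <T> T withBackoff(Callable<T> operation, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (InterruptedException e) {
                throw e; // do not retry when the thread is being interrupted
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }
}
```

With the VespaClient class above, a read could be wrapped as Retry.withBackoff(() -> client.getDocument("music/music/docid/1"), 3, 500). Retrying writes is only safe when repeating the operation has the same effect, as is the case for a put of the same document.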

Performance Considerations

The vespaclient-java tools are designed for operational tasks and moderate data volumes. For high-performance feeding, use the Feed Client or the Vespa CLI feed command instead.

When to Use Each Tool

| Tool | Best For | Throughput |
| --- | --- | --- |
| vespa-feeder | Small datasets, testing | Low |
| vespa-get | Individual document retrieval | N/A |
| vespa-visit | Backups, exports, iteration | Medium |
| Feed Client | Production bulk feeding | High |
| CLI feed | Command-line bulk feeding | High |

Next Steps

  • Feed Client: High-performance Java feed client
  • Vespa CLI: Modern command-line interface
  • Document API: Document operations reference
  • Data Operations: Document API operations guide
