Java Client Libraries

Vespa provides several Java client libraries for different use cases. This page covers the general-purpose Java clients included in the vespaclient-java module.

Overview

The vespaclient-java module provides the following command-line tools:

vespa-feeder: Feed documents from files
vespa-get: Retrieve documents by ID
vespa-visit: Visit and export documents
vespa-stat: Show content cluster statistics

For high-performance bulk feeding in Java applications, use the Feed Client instead of these utilities.

Installation

Add the dependency to your Maven project:

pom.xml

<dependency>
  <groupId>com.yahoo.vespa</groupId>
  <artifactId>vespaclient-java</artifactId>
  <version>${vespa.version}</version>
</dependency>

For Gradle:

build.gradle

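implementation "com.yahoo.vespa:vespaclient-java:${vespaVersion}"

Groovy only interpolates ${...} inside double-quoted strings. Here vespaVersion is assumed to be a project property, for example one defined in gradle.properties.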

vespa-feeder

The vespa-feeder tool feeds documents from XML or JSON files to Vespa.

Basic Usage

# Feed from XML file
vespa-feeder documents.xml

# Feed from JSON file
vespa-feeder documents.json

# Feed with custom route
vespa-feeder --route default documents.xml

# Feed with timeout
vespa-feeder --timeout 180 documents.xml

XML Format

Documents in XML format:

<?xml version="1.0" encoding="utf-8"?>
<vespafeed>
  <document documenttype="music" documentid="id:music:music::1">
    <title>Bohemian Rhapsody</title>
    <artist>Queen</artist>
    <year>1975</year>
  </document>
  
  <update documenttype="music" documentid="id:music:music::1">
    <assign field="year">1975</assign>
  </update>
  
  <remove documentid="id:music:music::1"/>
</vespafeed>

JSON Format

Documents in JSON format (same as Document API):

[
  {
    "put": "id:music:music::1",
    "fields": {
      "title": "Bohemian Rhapsody",
      "artist": "Queen",
      "year": 1975
    }
  },
  {
    "update": "id:music:music::1",
    "fields": {
      "year": { "assign": 1975 }
    }
  },
  {
    "remove": "id:music:music::1"
  }
]
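
When the feed data originates in a Java program, the JSON file can be generated instead of written by hand. The following is a minimal sketch that covers only the put operation shown above. The class and method names (FeedFileWriter, escape, putOperation, feedArray) are hypothetical and not part of vespaclient-java, and the fields are hard-coded to the music example.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of vespaclient-java) that renders put operations as a JSON feed. */
final class FeedFileWriter {

    /** Escapes backslashes, double quotes and control characters for use inside a JSON string. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Renders one put operation for the music document type used on this page. */
    static String putOperation(String documentId, String title, String artist, int year) {
        return "{\"put\": \"" + escape(documentId) + "\", \"fields\": {"
                + "\"title\": \"" + escape(title) + "\", "
                + "\"artist\": \"" + escape(artist) + "\", "
                + "\"year\": " + year + "}}";
    }

    /** Wraps the operations in a JSON array, one operation per line. */
    static String feedArray(List<String> operations) {
        return operations.stream().collect(Collectors.joining(",\n  ", "[\n  ", "\n]\n"));
    }
}
```

The resulting string can be written with Files.writeString(Path.of("documents.json"), ...) and then passed to vespa-feeder as in the basic usage examples.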

Options

vespa-feeder Options

Option	Description	Default
`--abortondataerror <bool>`	Abort on data errors	true
`--abortonsenderror <bool>`	Abort on send errors	true
`--file <file>`	Input file to read	stdin
`--maxpending <num>`	Max pending operations	1000
`--maxpendingsize <bytes>`	Max pending operation size	1MB
`--route <route>`	Route to use	default
`--timeout <seconds>`	Timeout for operations	180
`--trace <level>`	Trace level (0-9)	0
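
If vespa-feeder has to be started from a Java program, the options above can be assembled into an argument list for ProcessBuilder. This is only a sketch: FeederCommand and its methods are hypothetical names, it uses only flags listed in the table, and it assumes vespa-feeder is on the PATH of the running process.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical wrapper that assembles a vespa-feeder command line from the options above. */
final class FeederCommand {

    /** Builds the argument list; only flags listed in the options table are used. */
    static List<String> build(String file, String route, int timeoutSeconds, int maxPending) {
        List<String> command = new ArrayList<>();
        command.add("vespa-feeder");
        command.add("--route");
        command.add(route);
        command.add("--timeout");
        command.add(Integer.toString(timeoutSeconds));
        command.add("--maxpending");
        command.add(Integer.toString(maxPending));
        command.add(file);
        return command;
    }

    /** Runs the command, forwarding its output to this process, and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

For example, FeederCommand.run(FeederCommand.build("documents.json", "default", 180, 1000)) corresponds to running vespa-feeder with the defaults from the table spelled out.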

vespa-get

Retrieve documents by their document ID.

Basic Usage

# Get a single document
vespa-get id:music:music::1

# Get multiple documents
vespa-get id:music:music::1 id:music:music::2 id:music:music::3

# Get with specific field set
vespa-get --fieldset "music:title,artist" id:music:music::1

# Print only the document ID
vespa-get --printids id:music:music::1

Output Format

By default, vespa-get outputs JSON:

{
  "id": "id:music:music::1",
  "fields": {
    "title": "Bohemian Rhapsody",
    "artist": "Queen",
    "year": 1975
  }
}

Options

vespa-get Options

Option	Description
`--fieldset <fields>`	Fields to retrieve
`--printids`	Print document IDs
`--jsonoutput`	Output in JSON format
`--xmloutput`	Output in XML format
`--cluster <name>`	Content cluster name
`--route <route>`	Route to use
`--timeout <seconds>`	Operation timeout

vespa-visit

Visit (iterate over) documents in a content cluster.

Basic Usage

# Visit all documents
vespa-visit

# Visit a specific document type (a selection naming only the type matches all its documents)
vespa-visit --selection "music"

# Visit with selection
vespa-visit --selection "music.year > 1980"

# Visit and export to file
vespa-visit --selection "music" > export.json

# Visit specific bucket
vespa-visit --bucketstovisit "0x0000000000000001"

Selection Expressions

Filter documents during visit:

# Year range
vespa-visit --selection "music.year >= 1970 AND music.year < 1980"

# String matching
vespa-visit --selection "music.artist == 'Queen'"

# Numeric comparison
vespa-visit --selection "music.rating > 0"
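
When selection strings are built from program input, composing them in one place helps keep the quoting consistent. A minimal sketch that reproduces two of the expressions above; Selections, range and equalsString are hypothetical helper names, and values containing quotes or backslashes are rejected because escaping rules are not covered on this page.

```java
/** Hypothetical helper for composing the selection strings shown above. */
final class Selections {

    /** Documents of the given type whose numeric field lies in [fromInclusive, toExclusive). */
    static String range(String docType, String field, long fromInclusive, long toExclusive) {
        String name = docType + "." + field;
        return name + " >= " + fromInclusive + " AND " + name + " < " + toExclusive;
    }

    /** Documents whose string field equals the given value. */
    static String equalsString(String docType, String field, String value) {
        // Escaping is out of scope for this sketch, so reject values that would need it.
        if (value.contains("'") || value.contains("\"") || value.contains("\\")) {
            throw new IllegalArgumentException("Value needs escaping: " + value);
        }
        return docType + "." + field + " == '" + value + "'";
    }
}
```

Selections.range("music", "year", 1970, 1980) yields the year-range expression from the first example, ready to pass as the --selection argument.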

Export and Backup

Use vespa-visit for backups:

# Export all documents of type music
vespa-visit --selection "music" > backup-$(date +%Y%m%d).json

# Export documents whose timestamp field is later than a given epoch time
vespa-visit --selection "music.timestamp > 1640000000" > recent.json

# Count documents
vespa-visit --statistics | grep "Documents visited"
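
The dated export can also be scheduled from a Java process. The sketch below mirrors the shell redirect with ProcessBuilder; VisitBackup and its methods are hypothetical names, vespa-visit is assumed to be on the PATH, and the tool is run without flags so that everything it prints to standard output ends up in the file.

```java
import java.io.File;
import java.io.IOException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that writes vespa-visit output to a dated backup file. */
final class VisitBackup {

    /** Mirrors backup-$(date +%Y%m%d).json from the shell example. */
    static String backupFileName(LocalDate date) {
        return "backup-" + date.format(DateTimeFormatter.BASIC_ISO_DATE) + ".json";
    }

    /** Runs vespa-visit with stdout redirected to the backup file and returns the exit code. */
    static int export(File directory, LocalDate date) throws IOException, InterruptedException {
        File target = new File(directory, backupFileName(date));
        Process process = new ProcessBuilder("vespa-visit")
                .redirectOutput(target)                          // same effect as "> file" in the shell
                .redirectError(ProcessBuilder.Redirect.INHERIT)  // keep error messages visible
                .start();
        return process.waitFor();
    }
}
```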

Options

vespa-visit Options

Option	Description	Default
`--datahandler <target>`	Send results to the given target	stdout
`--selection <expr>`	Document selection	none
`--from <timestamp>`	Visit from timestamp	0
`--to <timestamp>`	Visit to timestamp	now
`--fieldset <fields>`	Fields to retrieve	all
`--cluster <name>`	Content cluster	default
`--maxpending <num>`	Max pending operations	1
`--maxbucketstovisit <num>`	Max buckets	unlimited
`--bucketstovisit <list>`	Specific buckets	all
`--statistics`	Show statistics	false

vespa-stat

Show statistics about content clusters.

Basic Usage

# Show cluster statistics
vespa-stat

# Show specific cluster
vespa-stat --cluster music

# Show with document counts
vespa-stat --user music

Output Example

Cluster: music
  Nodes: 4
  Documents: 1000000
  Disk usage: 45.2 GB
  Memory usage: 8.4 GB
  Active documents: 1000000
  Ready documents: 1000000
  Removed documents: 0

Using in Java Applications

Document Operations

While dedicated feed clients are recommended for production use, you can call the /document/v1 HTTP API directly with Java's built-in HttpClient:

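import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class VespaClient {
    private final HttpClient client;
    private final String endpoint;

    // endpoint is the base URL of the Vespa container, without a trailing slash,
    // e.g. "http://localhost:8080".
    public VespaClient(String endpoint) {
        this.endpoint = endpoint;
        this.client = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_2)
            .build();
    }

    // docPath is the /document/v1 path form of the document ID,
    // e.g. "music/music/docid/1" for id:music:music::1.
    // json is the request body, e.g. {"fields": {"title": "..."}}.
    public void putDocument(String docPath, String json) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint + "/document/v1/" + docPath))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();

        HttpResponse<String> response = client.send(request,
            HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() != 200) {
            throw new RuntimeException("Put failed with status "
                + response.statusCode() + ": " + response.body());
        }
    }

    // Returns the document as a JSON string; any non-200 status
    // (including 404 for a missing document) is reported as an error.
    public String getDocument(String docPath) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint + "/document/v1/" + docPath))
            .GET()
            .build();

        HttpResponse<String> response = client.send(request,
            HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() != 200) {
            throw new RuntimeException("Get failed with status "
                + response.statusCode() + ": " + response.body());
        }
        return response.body();
    }
}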

For production applications with high throughput requirements, use the Vespa Feed Client which provides optimized performance, automatic retries, and better error handling.
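
The /document/v1 API addresses a document by a path of the form namespace/document-type/docid/user-specified-id rather than by the full id: string used with the command-line tools. A minimal sketch of that conversion follows; DocumentPaths and toPath are hypothetical names, and IDs with key/value modifiers (n= or g=) are deliberately rejected rather than handled.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that converts a document ID into its /document/v1 path form. */
final class DocumentPaths {

    /** Converts e.g. id:music:music::1 into music/music/docid/1. */
    static String toPath(String documentId) {
        // A limit of 5 keeps colons inside the user-specified part intact.
        String[] parts = documentId.split(":", 5);
        if (parts.length != 5 || !parts[0].equals("id")) {
            throw new IllegalArgumentException("Not a document ID: " + documentId);
        }
        if (!parts[3].isEmpty()) {
            throw new IllegalArgumentException("Key/value modifiers are not handled in this sketch");
        }
        // URLEncoder targets form encoding, so turn its '+' back into %20 for use in a path.
        String userPart = URLEncoder.encode(parts[4], StandardCharsets.UTF_8).replace("+", "%20");
        return parts[1] + "/" + parts[2] + "/docid/" + userPart;
    }
}
```

The returned value is what putDocument and getDocument above expect to be appended after /document/v1/.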

Error Handling

Common Errors

Connection Refused

The Vespa endpoint is not reachable. Check:

Is Vespa running?
Is the correct host/port specified?
Are firewall rules blocking access?

vespa status

Document Type Not Found

The document type doesn’t exist in the schema:

Verify schema deployment
Check document type name spelling
Ensure application is deployed

vespa deploy

Timeout Errors

Operations are taking too long:

Increase timeout: --timeout 300
Check cluster health: vespa-stat
Reduce batch size

Performance Considerations

The vespaclient-java tools are designed for operational tasks and moderate data volumes. For high-performance feeding:

Use the Vespa Feed Client for Java applications
Use the Vespa CLI feed command for command-line feeding
Both provide significantly better throughput with automatic throttling and parallelism

When to Use Each Tool

Tool	Best For	Throughput
vespa-feeder	Small datasets, testing	Low
vespa-get	Individual document retrieval	N/A
vespa-visit	Backups, exports, iteration	Medium
Feed Client	Production bulk feeding	High
CLI feed	Command-line bulk feeding	High

Next Steps

Feed Client: High-performance Java feed client
Vespa CLI: Modern command-line interface
Document API: Document operations reference
Data Operations: Document API operations guide
