Vespa Query Language (YQL) is a SQL-like language used to express search queries in Vespa. YQL provides a powerful and flexible way to query your data with support for boolean operators, filters, and advanced search features.
Basic Syntax
YQL queries follow this basic structure:
select <fields> from <sources> where <condition> [order by <fields>] [limit <n>] [offset <n>] [timeout <ms>]
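The clause order above can be sketched as a small string builder. This is an illustrative helper rather than anything shipped with Vespa; the name `build_yql` and its parameters are assumptions made for this example.

```python
from typing import Optional, Sequence


def build_yql(
    where: str,
    fields: Sequence[str] = ("*",),
    sources: Sequence[str] = ("*",),
    order_by: Optional[str] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
    timeout_ms: Optional[int] = None,
) -> str:
    """Assemble a YQL string, emitting optional clauses in the documented order."""
    source_list = ", ".join(sources)
    # A single named schema is written "from product"; otherwise use "from sources ...".
    if len(sources) == 1 and sources[0] != "*":
        from_part = f"from {source_list}"
    else:
        from_part = f"from sources {source_list}"
    parts = [f"select {', '.join(fields)}", from_part, f"where {where}"]
    if order_by is not None:
        parts.append(f"order by {order_by}")
    if limit is not None:
        parts.append(f"limit {limit}")
    if offset is not None:
        parts.append(f"offset {offset}")
    if timeout_ms is not None:
        parts.append(f"timeout {timeout_ms}")
    return " ".join(parts)


print(build_yql('title contains "vespa"', order_by="price desc", limit=10))
```

Keeping the clauses in one list makes it hard to emit them out of order, which is the main thing the grammar line above is describing.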
Simple Query Example
select * from sources * where title contains "vespa"
This query searches for documents where the title field contains the term “vespa”.
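To run a query like this, the YQL string is passed to Vespa's query API in the `yql` request parameter. The sketch below only builds the HTTP request and does not send it. The endpoint `http://localhost:8080/search/` is an assumption about a local deployment, and `build_search_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_search_request(
    yql: str, endpoint: str = "http://localhost:8080/search/"
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the YQL string as JSON."""
    body = json.dumps({"yql": yql}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request('select * from sources * where title contains "vespa"')
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running instance: urllib.request.urlopen(request)
```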
Query Operators
Text Matching
YQL supports multiple text matching operators:
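Contains: Match a term in a field
Phrase: Match terms in sequence, written with phrase()
Prefix: Match terms by their beginning, enabled with the prefix annotation
Fuzzy: Match terms within an edit distance, written with fuzzy()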
select * from sources * where title contains "search engine"
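Only the plain contains form is shown above. As a sketch of how the other operators are usually written in YQL, the snippet below collects one where-clause per operator. The field name `title` and the search terms are placeholders, not taken from a real schema.

```python
# One illustrative where-clause per text matching operator.
# Field names and terms are placeholders.
TEXT_MATCH_EXAMPLES = {
    "contains": 'title contains "vespa"',
    "phrase": 'title contains phrase("search", "engine")',
    "prefix": 'title contains ({prefix: true}"ves")',
    "fuzzy": 'title contains ({maxEditDistance: 1}fuzzy("vespa"))',
}

for name, clause in TEXT_MATCH_EXAMPLES.items():
    print(f"{name:8} select * from sources * where {clause}")
```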
Boolean Operators
Combine multiple conditions using boolean operators:
select * from sources * where
title contains "vespa"
and description contains "search"
or category contains "database"
and !(status contains "deprecated")
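Mixing and with or without parentheses makes the result depend on operator precedence, so it is often clearer to group conditions explicitly. The helpers below show one possible grouping of the conditions above with every group parenthesized; `all_of`, `any_of`, and `negate` are hypothetical names.

```python
def all_of(*conditions: str) -> str:
    """Join conditions with 'and', wrapped in parentheses."""
    return "(" + " and ".join(conditions) + ")"


def any_of(*conditions: str) -> str:
    """Join conditions with 'or', wrapped in parentheses."""
    return "(" + " or ".join(conditions) + ")"


def negate(condition: str) -> str:
    """Negate a condition with YQL's '!' operator."""
    return f"!({condition})"


where = all_of(
    any_of(
        all_of('title contains "vespa"', 'description contains "search"'),
        'category contains "database"',
    ),
    negate('status contains "deprecated"'),
)
print(f"select * from sources * where {where}")
```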
Numeric and Range Queries
// Exact match
select * from sources * where price = 100
// Range query
select * from sources * where range(price, 50, 200)
// Less than / Greater than
select * from sources * where price < 100 and rating > 4.5
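With its default bounds, a range term includes both endpoints, so it describes the same interval as a pair of comparisons. The sketch below renders both forms side by side; `range_clause` and `comparison_clause` are hypothetical helper names.

```python
def range_clause(field: str, low: float, high: float) -> str:
    """Render an inclusive range term, e.g. range(price, 50, 200)."""
    return f"range({field}, {low}, {high})"


def comparison_clause(field: str, low: float, high: float) -> str:
    """Render the same inclusive interval as two comparisons."""
    return f"{field} >= {low} and {field} <= {high}"


print(range_clause("price", 50, 200))
print(comparison_clause("price", 50, 200))
```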
Field Selection
Control which fields are returned in the results:
// Return all fields
select * from sources *
// Return specific fields
select title, price, rating from sources *
// Return document ID only
select documentid from sources *
Ordering and Pagination
Order By
Sort results by one or more fields:
select * from sources *
where title contains "vespa"
order by price desc, rating asc
Limit and Offset
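Paginate results. Note that limit is the end position in the result list (offset plus the number of hits), not the page size, so this query skips the first 40 hits and returns the next 20:
select * from sources *
where title contains "vespa"
limit 60 offset 40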
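In Vespa's YQL, limit marks the end position in the result list rather than the page size, so the number of hits returned is limit minus offset. A minimal sketch of turning a zero-based page number into the two values, with `page_clause` as a hypothetical helper:

```python
def page_clause(page: int, page_size: int) -> str:
    """Return the limit/offset clause for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    offset = page * page_size
    limit = offset + page_size  # limit is the end position, not the hit count
    return f"limit {limit} offset {offset}"


for page in range(3):
    print(page, page_clause(page, 20))
```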
Advanced Queries
WeakAnd for Large Result Sets
Use weakAnd for efficient queries that match many documents:
select * from sources *
where weakAnd(title contains "vespa", description contains "search")
limit 10
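A common use of weakAnd is to turn a free-text query into one term per word. The sketch below builds such an expression and optionally prefixes it with the targetHits annotation. `weak_and` is a hypothetical helper, and it assumes the terms contain no double quotes.

```python
from typing import Optional, Sequence


def weak_and(field: str, terms: Sequence[str], target_hits: Optional[int] = None) -> str:
    """Build a weakAnd over one 'contains' term per word."""
    if not terms:
        raise ValueError("weakAnd needs at least one term")
    operands = ", ".join(f'{field} contains "{term}"' for term in terms)
    expression = f"weakAnd({operands})"
    if target_hits is not None:
        # targetHits is written as an annotation in front of the operator.
        expression = f"({{targetHits: {target_hits}}}{expression})"
    return expression


print(weak_and("title", "vespa search engine".split(), target_hits=100))
```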
From the source code (container-search/src/main/java/com/yahoo/search/Query.java:44-46):
YQL queries are parsed by YqlParser
Serialized using VespaSerializer
Support structured boolean trees and natural language text
Annotations
Add query annotations for fine-grained control:
select * from sources *
where title contains ({stem: false, ranked: false}"Vespa")
Available annotations include:
stem: Control stemming (true/false)
ranked: Include term in ranking (true/false)
prefix: Enable prefix matching
weight: Set term weight
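The annotation map is written in curly braces directly in front of the term, with the pair wrapped in parentheses. As a sketch of that layout, the hypothetical helper `annotate` below renders keyword arguments into an annotated term; the annotation values used are only examples.

```python
def annotate(term: str, **annotations) -> str:
    """Render a quoted term prefixed with a YQL annotation map."""

    def render(value) -> str:
        # Check bool first so True/False become YQL's lowercase literals.
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    if not annotations:
        return f'"{term}"'
    pairs = ", ".join(f"{key}: {render(value)}" for key, value in annotations.items())
    return f'({{{pairs}}}"{term}")'


print("title contains " + annotate("Vespa", stem=False, ranked=False))
print("title contains " + annotate("ves", prefix=True, weight=200))
```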
Embedded Queries
User Query
Delegate query parsing to Vespa’s query parser:
select * from sources * where userQuery()
This allows end-users to write simple queries without YQL syntax.
Combining with YQL
select * from sources *
where userQuery() and range(price, 0, 100)
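With userQuery(), the end-user's text travels separately from the YQL, in the `query` request parameter. A sketch of a JSON request body that combines the two; the user text and price cap are illustrative, and `user_query_body` is a hypothetical helper name.

```python
import json


def user_query_body(user_text: str, max_price: int) -> str:
    """Build a JSON body where YQL supplies the filter and 'query' the user text."""
    body = {
        "yql": f"select * from sources * where userQuery() and range(price, 0, {max_price})",
        "query": user_text,
    }
    return json.dumps(body, indent=2)


print(user_query_body("vespa search", 100))
```

Keeping the user text out of the YQL string also avoids having to escape quotes typed by the end-user.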
Timeout
Set query timeout in milliseconds:
select * from sources *
where title contains "vespa"
timeout 5000
Sources
Specify which schemas/document types to query:
// Query all sources
select * from sources *
// Query specific schema
select * from product
// Query multiple schemas
select * from sources product, review
Grouping Integration
YQL queries can include grouping operations (see Grouping & Aggregation for details):
select * from sources *
where title contains "vespa"
| all(group(category) each(output(count())))
Best Practices
Use specific sources
Query specific schemas instead of sources * for better performance
Add filters early
Apply restrictive filters in the WHERE clause to reduce the result set
Set appropriate limits
Use reasonable limit values to avoid retrieving too many results
Use weakAnd for large result sets
When an or query would match many documents, weakAnd is usually faster because it can skip documents that cannot reach the top hits, at the cost of not exposing every match
YQL Parser Implementation
Vespa’s YQL implementation (container-search/src/main/java/com/yahoo/search/yql/YqlParser.java) supports:
Multiple query item types: WordItem, PhraseItem, AndItem, OrItem, NotItem
Numeric operators: IntItem, RangeItem, NumericInItem
Advanced items: NearestNeighborItem, FuzzyItem, RegExpItem, WeakAndItem, WandItem
Geo-location queries: GeoLocationItem
Predicate queries: PredicateQueryItem
YQL queries are case-sensitive for field names and operators. Always use the exact field names defined in your schema.