Ranking determines the order of search results. Vespa provides a powerful, flexible ranking framework that can scale from simple text ranking to complex machine learning models.

What is Ranking?

Ranking is the process of scoring documents based on their relevance to a query. In Vespa, ranking:
  • Happens on content nodes after matching
  • Is highly configurable through rank profiles
  • Supports multiple ranking phases
  • Can use machine learning models
  • Leverages tensor computations for advanced features
Vespa’s ranking framework is implemented in the searchlib module, with the Feature Execution Framework (FEF) at its core.

Rank Profiles

Rank profiles define how documents are scored. They’re declared in your schema:
schema article {
    document article {
        field title type string {
            indexing: index | summary
            index: enable-bm25
        }
        
        field body type string {
            indexing: index | summary
            index: enable-bm25
        }
    }
    
    rank-profile default {
        first-phase {
            expression: nativeRank(title, body)
        }
    }
    
    rank-profile bm25 inherits default {
        first-phase {
            expression: bm25(title) + bm25(body)
        }
    }
}
Based on msmarco.sd:61-71

Multi-Phase Ranking

Vespa supports multi-phase ranking for efficiency:
  1. First Phase: fast scoring of all matched documents
  2. Second Phase: expensive scoring of the top candidates from the first phase
  3. Global Phase (optional): re-ranking of the globally merged top hits, evaluated on the stateless container after results from all content nodes are combined

Why Multiple Phases?

  • Efficiency: avoid expensive computations on irrelevant documents
  • Scalability: keep the first phase fast to handle large result sets
  • Accuracy: use complex models only on promising candidates
  • Flexibility: apply different strategies per phase

Ranking Phases Example

rank-profile hybrid {
    first-phase {
        expression: bm25(title) + bm25(body)
    }
    
    second-phase {
        rerank-count: 100
        expression: sum(query(query_embedding) * attribute(doc_embedding))
    }
}
This profile:
  1. First phase: Fast BM25 scoring on all matches
  2. Second phase: Expensive embedding similarity on top 100
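The economics of this split can be sketched in plain Python (hypothetical scores and a made-up document set, not Vespa's actual execution path): score every match with a cheap function, then re-score only the top rerank-count candidates with an expensive one.

```python
# Two-phase ranking sketch: cheap first phase over all matches,
# expensive second phase over only the top candidates.
# All feature values here are invented for illustration.

def first_phase(doc):
    return doc["bm25_title"] + doc["bm25_body"]  # fast lexical score

def second_phase(doc, query_embedding):
    # Dot product of embeddings: the "expensive" model
    return sum(q * d for q, d in zip(query_embedding, doc["embedding"]))

def rank(docs, query_embedding, rerank_count=100):
    # Phase 1: score everything, keep only the best rerank_count
    candidates = sorted(docs, key=first_phase, reverse=True)[:rerank_count]
    # Phase 2: re-order just those candidates with the expensive score
    return sorted(candidates,
                  key=lambda d: second_phase(d, query_embedding),
                  reverse=True)

docs = [
    {"id": 1, "bm25_title": 3.0, "bm25_body": 1.0, "embedding": [0.1, 0.9]},
    {"id": 2, "bm25_title": 2.5, "bm25_body": 1.2, "embedding": [0.9, 0.1]},
    {"id": 3, "bm25_title": 0.2, "bm25_body": 0.1, "embedding": [1.0, 1.0]},
]
top = rank(docs, query_embedding=[1.0, 0.0], rerank_count=2)
print([d["id"] for d in top])  # doc 3 never reaches the second phase
```

Note that document 3, despite having the best embedding, is cut in the first phase: the cheap score decides which documents the expensive model ever sees.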

Rank Features

Rank features are the building blocks of ranking expressions. Vespa provides hundreds of built-in features:

Text Ranking Features

Vespa’s default text ranking:
first-phase {
    expression: nativeRank(title, body)
}
Combines multiple text signals including term frequency and proximity.
BM25 text ranking:
first-phase {
    expression: bm25(title) + bm25(body)
}
Industry-standard text ranking algorithm. Requires index: enable-bm25 in schema.
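For intuition, here is a minimal Okapi BM25 scorer in Python, using the standard textbook formula with the usual k1=1.2 and b=0.75 defaults. Vespa's bm25 feature maintains its statistics per indexed field; this sketch only shows the math on a toy corpus.

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Okapi BM25 for one document against a query.
    corpus is a list of documents, each a list of terms."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)                    # term frequency
        df = sum(1 for d in corpus if term in d)      # document frequency
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * tf * (k1 + 1) / (tf + norm)
    return score

corpus = [["vespa", "ranking", "guide"],
          ["vespa", "schema", "reference"],
          ["cooking", "with", "honey"]]
print(bm25_score(["vespa", "ranking"], corpus[0], corpus))
```

Rare terms get a higher idf weight, and repeated terms saturate via the tf/(tf + norm) term, which is why BM25 is robust to keyword stuffing.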
Advanced text matching score:
first-phase {
    expression: fieldMatch(title).completeness * fieldMatch(title).proximity
}
Provides detailed text matching metrics like completeness, proximity, and coverage.

Attribute Features

Access document field values:
first-phase {
    expression: attribute(popularity) * bm25(title)
}
Read any attribute field for use in ranking.
Access query-time values:
first-phase {
    expression: query(user_weight) * attribute(doc_quality)
}
Pass dynamic values from query to ranking.

Tensor Features

Perform tensor computations:
function similarity() {
    expression: sum(query(query_embedding) * attribute(doc_embedding))
}

first-phase {
    expression: similarity()
}
Full support for tensor operations in ranking.
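The similarity() function above is simply a dot product: an elementwise multiply of two tensors followed by a sum over the result. In Python terms (toy vectors, not Vespa's tensor engine):

```python
# sum(query(query_embedding) * attribute(doc_embedding)) multiplies
# the two vectors elementwise, then sums: a dot product.
def similarity(query_embedding, doc_embedding):
    return sum(q * d for q, d in zip(query_embedding, doc_embedding))

print(similarity([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))  # 3.0
```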

Ranking Expressions

Ranking expressions combine features using mathematical operations:

Basic Operations

first-phase {
    expression: bm25(title) + bm25(body)
}

Real-World Ranking Examples

Here are some complete ranking profiles from Vespa’s codebase:

BM25 Text Ranking

rank-profile bm25 inherits default {
    first-phase {
        expression: bm25(title) + bm25(body)
    }
}
From msmarco.sd:67

Semantic Search with Embeddings

rank-profile semantic inherits default {
    function dot_product_title() {
        expression: sum(query(tensor)*attribute(title_embedding))
    }
    
    function dot_product_body() {
        expression: sum(query(tensor)*attribute(body_embedding))
    }
    
    first-phase {
        expression: dot_product_title() + dot_product_body()
    }
    
    ignore-default-rank-features
    
    rank-features {
        rankingExpression(dot_product_title)
        rankingExpression(dot_product_body)
    }
}
From msmarco.sd:73-88

Hybrid BM25 + Semantic

rank-profile hybrid inherits default {
    function dot_product_title() {
        expression: sum(query(tensor)*attribute(title_embedding))
    }
    
    function dot_product_body() {
        expression: sum(query(tensor)*attribute(body_embedding))
    }
    
    first-phase {
        expression: bm25(title) + bm25(body) + 
                   dot_product_title() + dot_product_body()
    }
    
    ignore-default-rank-features
    
    rank-features {
        bm25(title)
        bm25(body)
        rankingExpression(dot_product_title)
        rankingExpression(dot_product_body)
    }
}
From msmarco.sd:124-141

Learned Weights

rank-profile listwise_linear inherits default {
    function dot_product_title() {
        expression: sum(query(tensor)*attribute(title_embedding))
    }
    
    function dot_product_body() {
        expression: sum(query(tensor)*attribute(body_embedding))
    }
    
    first-phase {
        expression: 0.9005951 * bm25(title) + 
                   2.2043643 * bm25(body) + 
                   0.13506432 * dot_product_title() + 
                   0.5840874 * dot_product_body()
    }
    
    ignore-default-rank-features
    
    rank-features {
        bm25(title)
        bm25(body)
        rankingExpression(dot_product_title)
        rankingExpression(dot_product_body)
    }
}
From msmarco.sd:181-198
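The learned profile above is a linear model over four rank features. Given per-document feature values, the first-phase score reduces to a weighted sum; the coefficients below are taken from the profile, while the example feature values are made up:

```python
# Linear first phase from the listwise_linear profile:
# weights learned offline, applied to four rank features.
WEIGHTS = {
    "bm25(title)": 0.9005951,
    "bm25(body)": 2.2043643,
    "dot_product_title": 0.13506432,
    "dot_product_body": 0.5840874,
}

def first_phase(features):
    return sum(WEIGHTS[name] * value for name, value in features.items())

features = {  # hypothetical feature values for one document
    "bm25(title)": 3.2,
    "bm25(body)": 1.5,
    "dot_product_title": 0.8,
    "dot_product_body": 0.4,
}
print(first_phase(features))
```

Note how heavily bm25(body) dominates: the training data rewarded body matches far more than embedding similarity for this dataset.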

Feature Execution Framework

The ranking framework is implemented through the Feature Execution Framework (FEF):
// From searchlib/src/vespa/searchlib/fef/rank_program.h
class RankProgram
{
private:
    BlueprintResolver::SP            _resolver;
    vespalib::Stash                  _hot_stash;
    vespalib::Stash                  _cold_stash;
    std::vector<FeatureExecutor *>   _executors;
    
public:
    /**
     * Set up this rank program by creating the needed feature
     * executors and wiring them together. This function will also
     * pre-calculate all constant features.
     **/
    void setup(const MatchData &md,
               const IQueryEnvironment &queryEnv,
               const Properties &featureOverrides = Properties(),
               vespalib::ExecutionProfiler *profiler = nullptr);
               
    /**
     * Obtain the names and storage locations of all seed features for
     * this rank program.
     **/
    FeatureResolver get_seeds(bool unbox_seeds = true) const;
};
From searchlib/src/vespa/searchlib/fef/rank_program.h:30
The RankProgram class creates and wires together feature executors, pre-calculating constant features for efficiency.

Rank Profile Inheritance

Rank profiles can inherit from other profiles:
rank-profile base {
    function text_score() {
        expression: bm25(title) + bm25(body)
    }
}

rank-profile production inherits base {
    first-phase {
        expression: text_score() * attribute(quality_score)
    }
    
    second-phase {
        rerank-count: 50
        expression: sum(query(query_embedding) * attribute(doc_embedding))
    }
}

Machine Learning Models

Vespa supports various ML models in ranking:

ONNX Models

rank-profile ml_ranking {
    function model_input() {
        expression: tensor(feature{}):{bm25_title: bm25(title), bm25_body: bm25(body), popularity: attribute(popularity)}
    }
    
    first-phase {
        expression: onnx(my_model).score
    }
    
    onnx-model my_model {
        file: models/ranker.onnx
        input "input": model_input
        output "output": score
    }
}

LightGBM / XGBoost

rank-profile lightgbm {
    first-phase {
        expression: bm25(title)
    }
    
    second-phase {
        rerank-count: 100
        expression: lightgbm("my_model.json")
    }
}

Rank Features for Debugging

Expose features for debugging or downstream processing:
rank-profile debug inherits production {
    ignore-default-rank-features
    
    rank-features {
        bm25(title)
        bm25(body)
        attribute(popularity)
        rankingExpression(text_score)
        fieldMatch(title).completeness
    }
}
These features are returned in search results when requested:
{
  "yql": "select * from article where title contains 'vespa'",
  "ranking": "debug",
  "ranking.listFeatures": true
}

Ranking Performance

Optimization Tips

  • Use the first phase wisely: keep it simple, since it runs on all matched documents
  • Limit the second phase: set an appropriate rerank-count (typically 50-1000)
  • Precompute features: store computed values as attributes when possible
  • Profile your ranking: use ranking.listFeatures to identify slow features

Multi-Phase Strategy

rank-profile efficient {
    # Fast: scores 10,000+ documents
    first-phase {
        expression: bm25(title) * attribute(quality)
    }
    
    # Medium: scores top 200
    second-phase {
        rerank-count: 200
        expression: sum(query(q) * attribute(embedding))
    }
    
    # Expensive: scores top 20 globally
    global-phase {
        rerank-count: 20
        expression: onnx(complex_model).score
    }
}

Best Practices

  • Start simple: begin with BM25 and add complexity as needed
  • Use functions: organize complex expressions into reusable functions
  • Use profile inheritance: share common logic across profiles
  • Test offline: collect rank features for ML training

Next Steps

  • Tensors: use tensors in ranking
  • Search: understand the matching phase
  • Schemas: configure fields for ranking
