Schemas are defined in .sd files.
What is a Schema?
A schema is a declarative specification that defines:
- Document structure (fields and types)
- Indexing behavior (how fields are processed)
- Search configuration (how queries match documents)
- Ranking profiles (how results are scored)
Schemas were previously called “search definitions”, which is why the file extension is .sd.
Basic Schema Structure
Here’s a simple schema example:
music.sd
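A minimal sketch of such a schema, assuming a music document with hypothetical artist, album, and year fields. The field names and the rank profile are illustrative assumptions, and a real music.sd may differ:

```
# music.sd (illustrative sketch; field names are assumptions)
schema music {
    document music {
        # Text fields: searchable and returned in results
        field artist type string {
            indexing: summary | index
        }
        field album type string {
            indexing: summary | index
        }
        # Numeric field: kept as an attribute for filtering and sorting
        field year type int {
            indexing: summary | attribute
        }
    }

    # How results are scored
    rank-profile default {
        first-phase {
            expression: nativeRank(artist, album)
        }
    }
}
```

The schema name, the document name, and the file name (music.sd) match in this sketch.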
Schema Components
Document Block
The document block defines the fields that are stored:
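As a sketch, a document block declares each field with a name and a type. The field names below are hypothetical and only show the shape of the block:

```
document music {
    # Single-value string field
    field title type string {
        indexing: summary | index
    }
    # Single-value integer field
    field year type int {
        indexing: summary | attribute
    }
    # Multi-value field: an array of strings
    field tags type array<string> {
        indexing: summary | attribute
    }
}
```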
Field Configuration
Each field has several configuration options:
Indexing Directive
Controls how the field is processed:
- index - Create reverse index for text search
- summary - Include in search results
- attribute - Create forward index for ranking, sorting, grouping
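These directives are combined with the pipe character in a field's indexing statement. A short sketch, with hypothetical field names:

```
# Searchable text that is also returned in results
field title type string {
    indexing: summary | index
}
# Numeric value available for ranking, sorting, and grouping
field popularity type int {
    indexing: summary | attribute
}
# A string field can carry both a text index and an attribute
field artist type string {
    indexing: summary | index | attribute
}
```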
Index Configuration
Additional indexing options:
- enable-bm25 - Enable BM25 text ranking
- enable-embedding - Enable semantic search
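A minimal sketch of the BM25 option on a hypothetical body field. Only enable-bm25 is shown here:

```
field body type string {
    indexing: summary | index
    # Store the statistics that the bm25(body) rank feature needs
    index: enable-bm25
}
```

With this in place, a rank profile can refer to bm25(body) in its ranking expression.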
Summary Configuration
Control how the field appears in results:
- dynamic - Generate snippets with query term highlighting
- static - Return full field content
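As an illustration, dynamic snippets are typically switched on for long text fields. The body field here is an assumption:

```
field body type string {
    indexing: summary | index
    # Return a query-dependent snippet instead of the whole text
    summary: dynamic
}
```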
Stemming
Configure linguistic processing:
- best - Use best available stemmer for language
- none - No stemming
- shortest, multiple - Stemming variants
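A sketch of per-field stemming settings, assuming a free-text title field and a hypothetical product_code field whose values should match exactly as written:

```
field title type string {
    indexing: summary | index
    # Natural-language text: let the stemmer normalize word forms
    stemming: best
}
field product_code type string {
    indexing: summary | index
    # Identifiers should not be stemmed
    stemming: none
}
```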
Indexing Language
The indexing directive uses a powerful expression language for field processing:
Indexing Expressions
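A hedged sketch of two indexing expressions. It assumes a schema named doc with title and body fields (all names hypothetical) and defines synthetic fields outside the document block, so their values are computed from document fields when a document is fed:

```
schema doc {
    document doc {
        field title type string {
            indexing: summary | index
        }
        field body type string {
            indexing: summary | index
        }
    }

    # Lowercased copy of the title, kept as an attribute
    field title_lowercase type string {
        indexing: input title | lowercase | attribute
    }

    # Title and body concatenated with a space, indexed as one text field
    field title_and_body type string {
        indexing: input title . " " . input body | index
    }
}
```

Each expression reads left to right: input selects a source field, intermediate steps transform the value, and the final steps say where the result is written.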
The indexing language is implemented in the indexinglanguage module.
Attributes
Attributes are forward indexes that enable fast access to field values during ranking and grouping.
When to Use Attributes
- Ranking - Fields used in rank expressions
- Grouping - Fields used for aggregation
- Sorting - Fields used to order results
- Filtering - Numeric or boolean filters
Attribute Configuration
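Attribute behavior is tuned inside the field body. A minimal sketch, assuming hypothetical year and in_stock fields:

```
field year type int {
    indexing: summary | attribute
    # Build an index structure over the attribute for faster filtering
    attribute: fast-search
}
field in_stock type bool {
    indexing: summary | attribute
    # Use the field for filtering only; it does not contribute to ranking
    rank: filter
}
```

Whether fast-search pays off depends on the query mix, since it costs extra memory for the additional structure.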
Field Sets
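A field set is declared at the schema level, outside the document block, and lists the fields it covers. A minimal sketch, assuming the document has title and body fields (hypothetical names):

```
# Queries that name no field search the "default" field set
fieldset default {
    fields: title, body
}
```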
Field sets group fields for convenient searching.
Tensor Fields
Tensor fields store multi-dimensional arrays, essential for machine learning.
Tensor fields can be indexed with HNSW (Hierarchical Navigable Small World) for approximate nearest neighbor search.
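A sketch of a dense embedding field with an HNSW index. The field name, the 384-dimension size, the distance metric, and the HNSW parameter values are all assumptions to adapt to the embedding model in use:

```
field embedding type tensor<float>(x[384]) {
    # attribute stores the tensor; index builds the HNSW graph
    indexing: attribute | index
    attribute {
        distance-metric: angular
    }
    index {
        hnsw {
            max-links-per-node: 16
            neighbors-to-explore-at-insert: 200
        }
    }
}
```

Without the index part, the field can still be used for exact nearest neighbor search, which scans all candidate documents.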
Document Summaries
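A document summary is a named set of fields declared at the schema level. A minimal sketch, assuming title and url fields exist in the document (hypothetical names). Older Vespa versions may also require a type on each summary line:

```
# A compact result format that returns only two fields
document-summary short {
    summary title {}
    summary url {}
}
```

A query can then ask for this format by name, for example with the summary=short request parameter.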
Document summaries control which fields are returned in search results.
Schema Inheritance
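Inheritance is declared on both the schema and its document type. A sketch with two hypothetical schemas, items and books, each in its own file:

```
# items.sd
schema items {
    document items {
        field title type string {
            indexing: summary | index
        }
    }
}

# books.sd: gets the title field from items and adds author
schema books inherits items {
    document books inherits items {
        field author type string {
            indexing: summary | index
        }
    }
}
```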
Schemas can inherit from other schemas.
Real-World Example
Here’s a complete schema for a search application:
msmarco.sd
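As an illustration only, a passage-search schema along these lines might look like the sketch below. The field names and the rank profile are assumptions and this is not the actual msmarco.sd:

```
# Illustrative sketch; not the real msmarco.sd
schema msmarco {
    document msmarco {
        field id type string {
            indexing: summary | attribute
        }
        field title type string {
            indexing: summary | index
            index: enable-bm25
        }
        field url type string {
            indexing: summary | index
        }
        field body type string {
            indexing: summary | index
            index: enable-bm25
            summary: dynamic
        }
    }

    fieldset default {
        fields: title, body
    }

    # Score by BM25 over the two main text fields
    rank-profile bm25 inherits default {
        first-phase {
            expression: bm25(title) + bm25(body)
        }
    }
}
```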
Schema Processing Implementation
Schemas are processed by the config model:
Key Modules:
- config-model - Schema parsing and validation
- indexinglanguage - Indexing expression execution
Best Practices
- Index Text Fields - Use index for fields you’ll search with text queries
- Attribute Numeric Fields - Use attribute for numeric fields used in ranking or filtering
- Enable BM25 - Add index: enable-bm25 for better text ranking
- Use Tensors for ML - Store embeddings as tensor fields for semantic search
Next Steps
- Search - Learn how search works
- Ranking - Configure ranking profiles
- Tensors - Work with tensor fields