Core API Reference
Complete API reference for the core retrieval functions: ingest() and similaritySearch().
ingest()
Ingest documents from a connector into a vector store.
Signature
Parameters
config
Type: IngestionConfig
Ingestion configuration object.
Connector
Connector that provides documents to ingest. See Connectors.
Store
Vector store for saving embeddings. See Stores API.
Embedder
Function that converts text to embeddings. See Embeddings.
Splitter (optional)
Custom text splitting function. Default: MarkdownTextSplitter.
callback
Type: (documentId: string) => void (optional)
Callback invoked for each processed document.
Returns
Type: Promise<void>
Resolves when ingestion completes.
Example
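The flow described by the parameters above can be pictured with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in inferred from the parameter list: the field names (connector, store, embedder, splitter), the sources() and upsert() methods, and the ingestSketch() function are assumptions for illustration, not the package's actual declarations.

```typescript
// Hypothetical stand-ins; the real types live in the retrieval package.
type Doc = { id: string; content: string };
type Embedder = (texts: string[]) => Promise<number[][]>;
type Splitter = (documentId: string, content: string) => string[];

interface Connector {
  sources(): AsyncIterable<Doc>; // assumed method name
}

interface Store {
  // assumed method name and shape
  upsert(documentId: string, chunks: string[], vectors: number[][]): Promise<void>;
}

interface IngestionConfig {
  connector: Connector;
  store: Store;
  embedder: Embedder;
  splitter?: Splitter;
}

// Stand-in for ingest(): split, embed, store, then notify the callback.
async function ingestSketch(
  config: IngestionConfig,
  callback?: (documentId: string) => void,
): Promise<void> {
  // Fallback splitter for the sketch only: one chunk per document.
  const split: Splitter = config.splitter ?? ((_id, content) => [content]);
  for await (const doc of config.connector.sources()) {
    const chunks = split(doc.id, doc.content);
    const vectors = await config.embedder(chunks);
    await config.store.upsert(doc.id, chunks, vectors);
    callback?.(doc.id);
  }
}

// In-memory fakes so the sketch runs without any external service.
const saved = new Map<string, number>();
const processed: string[] = [];

const demoConfig: IngestionConfig = {
  connector: {
    async *sources() {
      yield { id: "readme.md", content: "# Title\n\nBody text." };
      yield { id: "guide.md", content: "Some guide." };
    },
  },
  store: {
    async upsert(documentId, chunks) {
      saved.set(documentId, chunks.length);
    },
  },
  // Toy embedder: a two-dimensional vector per chunk.
  embedder: async (texts) => texts.map((t) => [t.length, 1]),
};

const done = ingestSketch(demoConfig, (id) => processed.push(id));
```

The returned promise resolves once every document has been processed, and the callback fires once per document, in the order the connector yields them.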
Source Code
Location: /home/daytona/workspace/source/packages/retrieval/src/lib/ingest.ts:18-54
similaritySearch()
Search for relevant documents using semantic similarity.
Signature
Parameters
query
Type: string
Natural language search query.
config
Type: Omit<IngestionConfig, 'splitter'>
Search configuration (same as ingestion, without splitter).
Returns
Type: Promise<SearchResult[]>
Array of search results sorted by similarity (highest first).
Example
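The "sorted by similarity (highest first)" behaviour can be illustrated with a small ranking sketch. It assumes cosine similarity and an in-memory list of chunks. The SearchResult fields shown here, the StoredChunk type, and the rankBySimilarity() helper are hypothetical; the actual scoring and result shape are determined by the store implementation.

```typescript
// Hypothetical shapes for illustration only.
type SearchResult = { documentId: string; content: string; similarity: number };
type StoredChunk = { documentId: string; content: string; vector: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk, sort highest first, keep the top N (50 mirrors the documented default).
function rankBySimilarity(
  queryVector: number[],
  chunks: StoredChunk[],
  topN = 50,
): SearchResult[] {
  return chunks
    .map((c) => ({
      documentId: c.documentId,
      content: c.content,
      similarity: cosineSimilarity(queryVector, c.vector),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topN);
}

const ranked = rankBySimilarity([1, 0], [
  { documentId: "a", content: "unrelated", vector: [0, 1] },
  { documentId: "b", content: "close", vector: [1, 0.1] },
  { documentId: "c", content: "partial", vector: [1, 1] },
]);
console.log(ranked.map((r) => r.documentId)); // order: b, c, a
```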
Automatic Ingestion
The function automatically handles ingestion based on connector.ingestWhen:
- contentChanged (default) - Always attempts ingestion, skips unchanged content
- never - Only ingests if source doesn’t exist
- expired - Only ingests if source expired or doesn’t exist
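The three modes above amount to a small decision table, sketched here as a pure function. The SourceState shape and the shouldAttemptIngestion() name are hypothetical; only the three mode names come from this page.

```typescript
type IngestWhen = "contentChanged" | "never" | "expired";

// Hypothetical summary of what the store knows about a source.
interface SourceState {
  exists: boolean;  // the source has been ingested before
  expired: boolean; // the stored copy is past its expiry
}

// Decide whether an ingestion pass should be attempted for this source.
function shouldAttemptIngestion(mode: IngestWhen, state: SourceState): boolean {
  switch (mode) {
    case "contentChanged":
      // Always attempt; unchanged content is skipped during the pass itself.
      return true;
    case "never":
      return !state.exists;
    case "expired":
      return !state.exists || state.expired;
  }
}

console.log(shouldAttemptIngestion("never", { exists: true, expired: true })); // false
```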
Top N Results
Returns the top 50 results by default; the limit is controlled by the store implementation.
Source Code
Location: /home/daytona/workspace/source/packages/retrieval/src/lib/similiarty-search.ts:5-56
Type Definitions
Splitter
Parameters
- documentId - Document identifier
- content - Document text
Returns
- Array of text chunks
Example Splitter
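A custom splitter can be as simple as a paragraph splitter. The sketch below assumes a synchronous (documentId, content) => string[] signature, inferred from the parameter and return descriptions above; the actual Splitter type may differ (for example, it may be asynchronous).

```typescript
// Assumed signature, inferred from the description above.
type Splitter = (documentId: string, content: string) => string[];

// Split on blank lines, trim each paragraph, and drop empty chunks.
const paragraphSplitter: Splitter = (_documentId, content) =>
  content
    .split(/\n\s*\n/)
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0);

const chunks = paragraphSplitter(
  "notes.md",
  "First paragraph.\n\nSecond paragraph.\n\n\n",
);
console.log(chunks); // two chunks, one per paragraph
```

The documentId argument is unused here, but a splitter could use it to choose a strategy per file type.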
Built-in Splitters
splitTypeScript()
TypeScript-aware text splitting.
- Chunk size: 512 characters
- Chunk overlap: 100 characters
- Language: JavaScript (works for TypeScript)
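To show how the chunk size and overlap settings interact, here is a plain character-window sketch. It is not the language-aware algorithm that splitTypeScript() uses; chunkWithOverlap() is a hypothetical helper that only demonstrates the arithmetic of 512-character chunks that share 100 characters with their neighbour.

```typescript
// Fixed-size character windows; each window starts (chunkSize - overlap) after the previous one.
function chunkWithOverlap(text: string, chunkSize = 512, overlap = 100): string[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const step = chunkSize - overlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const parts = chunkWithOverlap("x".repeat(1000));
console.log(parts.map((p) => p.length)); // [512, 512, 176]
```

With these defaults each new chunk begins 412 characters after the previous one, so the last 100 characters of one chunk reappear at the start of the next.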
splitTypeScriptWithPositions()
TypeScript splitting with position tracking.
Content ID (CID)
cid()
Generate content identifier using SHA-256 hash.
Parameters
- content - Content to hash
Returns
- Content identifier (format: bafkrei...)
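The bafkrei... prefix is what a CIDv1 with the raw codec and a SHA-256 multihash looks like in lowercase base32. The sketch below builds such an identifier by hand with Node's built-in crypto module. It rests on the assumption that cid() follows this encoding; cidSketch() and base32Lower() are hypothetical names, and the package may well rely on a dedicated library instead.

```typescript
import { createHash } from "node:crypto";

const BASE32_ALPHABET = "abcdefghijklmnopqrstuvwxyz234567";

// RFC 4648 base32, lowercase, without padding.
function base32Lower(bytes: Uint8Array): string {
  let bits = 0;
  let value = 0;
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    value = (value << 8) | bytes[i];
    bits += 8;
    while (bits >= 5) {
      out += BASE32_ALPHABET[(value >>> (bits - 5)) & 31];
      bits -= 5;
    }
    value &= (1 << bits) - 1; // keep only the bits not yet emitted
  }
  if (bits > 0) {
    out += BASE32_ALPHABET[(value << (5 - bits)) & 31];
  }
  return out;
}

// Assumed layout: CIDv1 (0x01), raw codec (0x55), sha2-256 (0x12), 32-byte digest (0x20).
function cidSketch(content: string): string {
  const digest = createHash("sha256").update(content, "utf8").digest();
  const bytes = new Uint8Array(4 + digest.length);
  bytes.set([0x01, 0x55, 0x12, 0x20], 0);
  bytes.set(digest, 4);
  return "b" + base32Lower(bytes); // "b" is the multibase prefix for lowercase base32
}

const id = cidSketch("hello world");
console.log(id.startsWith("bafkrei"), id.length); // true 59
```

Because the identifier is derived only from the content, identical content always maps to the same identifier, which is what lets ingestion detect unchanged documents.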
Error Handling
Ingestion Errors
Search Errors
Common Errors
- Source not found - Connector failed to fetch content
- Embedding failed - Embedder error
- Database error - Store operation failed
- Invalid dimensions - Embedder/store dimension mismatch
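One way to act on the errors listed above is to wrap a call in try/catch and branch on the failure category. This sketch assumes, without confirmation from the source, that failures surface as Error objects whose message contains the phrases listed; classifyRetrievalError() and runGuarded() are hypothetical helpers, not part of the package.

```typescript
type ErrorKind = "source" | "embedding" | "store" | "dimensions" | "unknown";

// Map an error to a category by looking for the documented phrases in its message.
// Assumption: the real errors carry these phrases; adjust if they expose codes or classes.
function classifyRetrievalError(err: unknown): ErrorKind {
  const message = err instanceof Error ? err.message : String(err);
  if (message.includes("Source not found")) return "source";
  if (message.includes("Embedding failed")) return "embedding";
  if (message.includes("Database error")) return "store";
  if (message.includes("Invalid dimensions")) return "dimensions";
  return "unknown";
}

// Run any ingestion or search call and report "ok" or the failure category.
async function runGuarded(operation: () => Promise<unknown>): Promise<ErrorKind | "ok"> {
  try {
    await operation();
    return "ok";
  } catch (err) {
    return classifyRetrievalError(err);
  }
}

const outcome = runGuarded(async () => {
  throw new Error("Invalid dimensions: embedder returned 384, store expects 1536");
});
outcome.then((kind) => console.log(kind)); // dimensions
```

A dimension mismatch is usually a configuration problem rather than a transient one, so it is worth separating from errors that might succeed on retry.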
Performance
Batching
Ingestion automatically batches embeddings:
Concurrency
Operations are sequential by default. For parallel ingestion:
Next Steps
Connector API
Connector interface reference
Store API
Store interface reference
Ingestion Guide
Learn about ingestion
Search Guide
Learn about search