Overview

The SearchEngine class handles all search operations for the RAG system, including vector search, keyword search, and hybrid search functionality.

Class Definition

SearchEngine

from agent.rag.search_engine import SearchEngine
from agent.rag.config import Config
from agent.rag.database_manager import DatabaseManager

config = Config()
db_manager = DatabaseManager(config)

search_engine = SearchEngine(config, db_manager)

Constructor

__init__(config, database_manager)

Initialize the search engine with configuration and database access.
config
Config
Configuration object containing search settings
database_manager
DatabaseManager
Database manager for executing queries
Initialization includes:
  • Loading the embedding model (SentenceTransformer)
  • Creating a thread pool executor for CPU-bound embedding operations
  • Configuring hybrid search parameters
from agent.rag.search_engine import SearchEngine
from agent.rag.config import Config
from agent.rag.database_manager import DatabaseManager

# Create dependencies
config = Config()
db_manager = DatabaseManager(config)

# Initialize search engine
search_engine = SearchEngine(config, db_manager)

Core Methods

search_similar_chunks(query, max_results=None, target_quarter=None)

Search for similar chunks using hybrid vector + keyword search with optional quarter filtering.
query
str
Search query text
max_results
int
default:"None"
Maximum number of results to return. If None, uses chunks_per_quarter from config (default: 15)
target_quarter
str
default:"None"
Target quarter for filtering (e.g., "2025_q1"). Use "multiple" or None for no quarter filter
Returns: List[Dict[str, Any]] - List of chunk dictionaries with similarity scores
This is a synchronous method. In async contexts, use the async search methods (for example, search_with_queries_async) instead.
# Basic search
results = search_engine.search_similar_chunks(
    query="What is Apple's revenue?",
    max_results=20
)

# Search specific quarter
results = search_engine.search_similar_chunks(
    query="AI strategy",
    target_quarter="2025_q1"
)

# Search all quarters
results = search_engine.search_similar_chunks(
    query="Financial performance",
    target_quarter="multiple"
)
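The quarter identifiers in these examples follow a `<year>_q<n>` pattern. As a minimal sketch, the snippet below shows how such an identifier could be validated and split into a year and a quarter number. `parse_quarter` is a hypothetical helper written for illustration and is not part of the SearchEngine API. Treating None and "multiple" as "no filter" mirrors the parameter description.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper for illustration; not part of the SearchEngine API.
_QUARTER_RE = re.compile(r"(\d{4})_q([1-4])")


def parse_quarter(target_quarter: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (year, quarter), or None when no quarter filter applies."""
    if target_quarter is None or target_quarter == "multiple":
        return None
    match = _QUARTER_RE.fullmatch(target_quarter)
    if match is None:
        raise ValueError(f"Unrecognized quarter identifier: {target_quarter!r}")
    return int(match.group(1)), int(match.group(2))


parsed = parse_quarter("2025_q1")
```

Validating the identifier up front turns a typo such as "2025_q5" into an immediate error rather than a silently empty result set.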

encode_query_async(query)

Generates the query embedding asynchronously using a thread pool executor.
query
str
Query text to encode
Returns: np.ndarray - Query embedding vector
This method prevents blocking the event loop while encoding queries, making it suitable for async contexts.
import asyncio

async def search_with_embedding():
    # Generate embedding asynchronously
    embedding = await search_engine.encode_query_async(
        "What is Microsoft's cloud revenue?"
    )
    
    # Use embedding for custom search
    # embedding.shape -> (1, 384) or (1, 768) depending on model
    return embedding

result = asyncio.run(search_with_embedding())
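The general pattern behind this kind of method can be sketched with the standard library alone. The example below is not the library's implementation. `fake_encode` is a stand-in for a blocking model call, and `encode_query_in_executor` is a hypothetical name. It only illustrates how `loop.run_in_executor` moves blocking work onto a worker thread so the event loop stays responsive.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_encode(text: str) -> List[float]:
    """Stand-in for a CPU-bound embedding call (assumption, not the real model)."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


executor = ThreadPoolExecutor(max_workers=2)


async def encode_query_in_executor(query: str) -> List[float]:
    loop = asyncio.get_running_loop()
    # run_in_executor hands the blocking call to a worker thread and
    # returns an awaitable, so other coroutines can run in the meantime.
    return await loop.run_in_executor(executor, fake_encode, query)


async def main() -> List[List[float]]:
    # Both encodings are submitted at once and awaited together.
    return await asyncio.gather(
        encode_query_in_executor("cloud revenue"),
        encode_query_in_executor("operating margin"),
    )


embeddings = asyncio.run(main())
executor.shutdown()
```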

Advanced Search Methods

search_with_queries_async(queries, target_quarters, target_quarter, ticker=None)

Runs multiple queries in parallel, then merges the results and deduplicates them by citation.
queries
List[str]
List of search queries to execute in parallel
target_quarters
List[str]
List of quarters to search (e.g., ['2025_q1', '2025_q2'])
target_quarter
str
Single quarter identifier or "multiple"
ticker
Optional[str]
default:"None"
Optional ticker symbol to filter results
Returns: List[Dict[str, Any]] - Deduplicated and merged results
import asyncio

async def multi_query_search():
    results = await search_engine.search_with_queries_async(
        queries=[
            "What is Apple's revenue growth?",
            "How did Apple perform financially?",
            "Apple revenue trends"
        ],
        target_quarters=['2024_q4', '2025_q1'],
        target_quarter='multiple',
        ticker='AAPL'
    )
    return results

results = asyncio.run(multi_query_search())

follow_up_search_async(question, has_tickers, is_general_question, is_multi_ticker, tickers_to_process, target_quarter, target_quarters)

Performs a single follow-up search using hybrid (vector + keyword) search.
question
str
Follow-up question to search
has_tickers
bool
Whether the query includes ticker symbols
is_general_question
bool
Whether this is a general (non-ticker-specific) question
is_multi_ticker
bool
Whether multiple tickers are involved
tickers_to_process
List[str]
List of ticker symbols to process
target_quarter
Optional[str]
Target quarter or None
target_quarters
List[str]
List of quarters to search
Returns: List[Dict[str, Any]] - Search results
import asyncio

async def follow_up():
    results = await search_engine.follow_up_search_async(
        question="What about operating margins?",
        has_tickers=True,
        is_general_question=False,
        is_multi_ticker=False,
        tickers_to_process=['AAPL'],
        target_quarter='2025_q1',
        target_quarters=['2025_q1']
    )
    return results

results = asyncio.run(follow_up())

follow_up_search_parallel_async(follow_up_questions, …)

Runs multiple follow-up questions in parallel, then merges the results and deduplicates them by citation.
follow_up_questions
List[str]
List of follow-up questions to execute in parallel
Other parameters: Same as follow_up_search_async
Returns: List[Dict[str, Any]] - Deduplicated merged results
import asyncio

async def parallel_follow_up():
    results = await search_engine.follow_up_search_parallel_async(
        follow_up_questions=[
            "What about profit margins?",
            "How did expenses change?",
            "What was the guidance?"
        ],
        has_tickers=True,
        is_general_question=False,
        is_multi_ticker=False,
        tickers_to_process=['MSFT'],
        target_quarter='2025_q1',
        target_quarters=['2025_q1']
    )
    return results

results = asyncio.run(parallel_follow_up())

Internal Methods

_search_multiple_quarters_async(query, target_quarters, chunks_per_quarter=None)

Async version of multi-quarter search using parallel execution.
query
str
Search query
target_quarters
List[str]
Quarters to search in parallel
chunks_per_quarter
int
default:"None"
Number of chunks to retrieve per quarter. If None, uses config default
Returns: List[Dict[str, Any]] - Combined and ranked results from all quarters

_search_keywords(query, max_results=None, target_quarter=None)

Search using PostgreSQL full-text search with keywords.
query
str
Search query to extract keywords from
max_results
int
default:"None"
Maximum results. If None, uses keyword_max_results from config (default: 10)
target_quarter
str
default:"None"
Optional quarter filter
Returns: List[Dict[str, Any]] - Keyword search results with ts_rank scores
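For readers unfamiliar with PostgreSQL full-text search, the sketch below builds a parameterized query of the kind such a method might issue. The table name `chunks`, the column names, and the `build_keyword_query` helper are assumptions made for illustration. Only `to_tsvector`, `plainto_tsquery`, `ts_rank`, and the `@@` match operator are standard PostgreSQL features. Placeholders use the `%(name)s` style accepted by psycopg-family drivers.

```python
from typing import Any, Dict, Optional, Tuple


def build_keyword_query(
    query: str, max_results: int = 10, target_quarter: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Build a parameterized full-text query (table and column names are assumed)."""
    params: Dict[str, Any] = {"q": query, "limit": max_results}
    where = [
        "to_tsvector('english', chunk_text) @@ plainto_tsquery('english', %(q)s)"
    ]
    # None and "multiple" both mean "no quarter filter".
    if target_quarter not in (None, "multiple"):
        year, quarter = target_quarter.split("_q")
        where.append("year = %(year)s AND quarter = %(quarter)s")
        params.update(year=int(year), quarter=int(quarter))
    sql = (
        "SELECT chunk_text, "
        "ts_rank(to_tsvector('english', chunk_text), "
        "plainto_tsquery('english', %(q)s)) AS rank "
        "FROM chunks WHERE " + " AND ".join(where) +
        " ORDER BY rank DESC LIMIT %(limit)s"
    )
    return sql, params


sql, params = build_keyword_query("revenue growth", target_quarter="2025_q1")
```

Passing user text as a bound parameter rather than interpolating it into the SQL string keeps the query safe from injection.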

_search_keywords_with_ticker(query, ticker, target_quarter=None)

Search using PostgreSQL full-text search with keywords and ticker filtering.
query
str
Search query
ticker
str
Ticker symbol to filter by
target_quarter
str
default:"None"
Optional quarter filter
Returns: List[Dict[str, Any]] - Filtered keyword search results

Search Result Structure

Each search result is a dictionary with the following structure:
{
    'chunk_text': str,          # The text content of the chunk
    'similarity': float,        # Similarity score (0-1, higher is better)
    'distance': float,          # Distance metric (lower is better)
    'metadata': dict,           # Additional metadata
    'citation': int,            # Chunk index for citation
    'year': int,                # Year of the data
    'quarter': int,             # Quarter number (1-4)
    'ticker': str,              # Stock ticker symbol
    'search_type': str,         # 'vector', 'keyword', or 'hybrid'
    'char_offset': int,         # Character offset in original document
    'chunk_length': int,        # Length of the chunk
    'source_quarter': str       # Source quarter identifier (for multi-quarter)
}
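When consuming results in typed code, it can help to mirror this structure as a `TypedDict`. The sketch below is one way to do that. The `SearchResult` type and the `format_citation` helper are hypothetical names introduced here, not part of the library, and the sample values are made up for the example.

```python
from typing import Any, Dict, TypedDict


class SearchResult(TypedDict, total=False):
    """Typed view of one search result; field names follow the structure above."""
    chunk_text: str
    similarity: float
    distance: float
    metadata: Dict[str, Any]
    citation: int
    year: int
    quarter: int
    ticker: str
    search_type: str
    char_offset: int
    chunk_length: int
    source_quarter: str


def format_citation(result: SearchResult) -> str:
    """Hypothetical helper: render a short source label for a result."""
    return f"[{result['ticker']} {result['year']} Q{result['quarter']} #{result['citation']}]"


# Illustrative values only.
example: SearchResult = {
    "chunk_text": "Revenue grew year over year.",
    "similarity": 0.82,
    "distance": 0.18,
    "citation": 12,
    "year": 2025,
    "quarter": 1,
    "ticker": "AAPL",
    "search_type": "hybrid",
}
label = format_citation(example)
```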

Hybrid Search Configuration

The search engine combines vector and keyword search results:
hybrid_search_enabled
bool
default:"True"
Enables or disables hybrid search. If False, searches return empty results
vector_weight
float
default:"0.7"
Weight for vector search results (0-1)
keyword_weight
float
default:"0.3"
Weight for keyword search results (0-1)
similarity_threshold
float
default:"0.3"
Minimum similarity threshold for results
parallel_retrieval_enabled
bool
default:"True"
Enable parallel execution of vector and keyword searches
embedding_model
str
default:"sentence-transformers/all-MiniLM-L6-v2"
SentenceTransformer model to use for embeddings
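To make the role of the two weights concrete, here is a minimal sketch of one common way to combine scores: a linear weighted sum per chunk. This page does not specify the exact formula the engine uses, so the function below is an assumption for illustration. It also assumes both score sets are already scaled to 0-1 and that the threshold is applied to the combined score.

```python
from typing import Dict


def combine_scores(
    vector_scores: Dict[int, float],
    keyword_scores: Dict[int, float],
    vector_weight: float = 0.7,
    keyword_weight: float = 0.3,
    similarity_threshold: float = 0.3,
) -> Dict[int, float]:
    """Weighted sum per citation; a chunk missing from one list scores 0 there."""
    combined: Dict[int, float] = {}
    for citation in set(vector_scores) | set(keyword_scores):
        score = (
            vector_weight * vector_scores.get(citation, 0.0)
            + keyword_weight * keyword_scores.get(citation, 0.0)
        )
        # Assumption: the threshold filters on the combined score.
        if score >= similarity_threshold:
            combined[citation] = score
    return combined


scores = combine_scores(
    vector_scores={1: 0.9, 2: 0.4},
    keyword_scores={2: 1.0, 3: 0.5},
)
```

With the default weights, chunk 3 appears only in the keyword results and scores 0.3 × 0.5 = 0.15, which falls below the threshold, so it is dropped.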

Performance Characteristics

Search Speed

  • Vector Search: Fast with proper database indexing (typically <100ms)
  • Keyword Search: Very fast using PostgreSQL full-text search (<50ms)
  • Hybrid Search: About 100ms in total when parallel retrieval is enabled, since the vector and keyword searches overlap

Parallel Execution

  • Multi-Quarter: Quarters are searched in parallel using asyncio.gather
  • Vector + Keyword: When parallel_retrieval_enabled is True, both searches run simultaneously
  • Multi-Query: Multiple queries are executed in parallel and their results are deduplicated
  • Thread Pool: CPU-bound embedding operations run in a thread pool to avoid blocking the event loop

Deduplication

Results are deduplicated by citation (chunk index) when merging:
  • Keeps the result with the best score (lowest distance)
  • Prevents duplicate chunks in final results
  • Applied in multi-query and follow-up searches
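The rule described in this list can be sketched in a few lines. `dedupe_by_citation` is a hypothetical helper, not the library's function. It keeps one result per `citation` value, preferring the lowest `distance`, and returns the survivors sorted best-first.

```python
from typing import Any, Dict, Iterable, List


def dedupe_by_citation(results: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the lowest-distance result for each citation."""
    best: Dict[int, Dict[str, Any]] = {}
    for result in results:
        key = result["citation"]
        if key not in best or result["distance"] < best[key]["distance"]:
            best[key] = result
    # Sort so the closest matches come first.
    return sorted(best.values(), key=lambda r: r["distance"])


merged = dedupe_by_citation([
    {"citation": 7, "distance": 0.40, "chunk_text": "from query A"},
    {"citation": 3, "distance": 0.25, "chunk_text": "from query A"},
    {"citation": 7, "distance": 0.10, "chunk_text": "from query B"},
])
```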

Logging

The search engine provides detailed logging:
import logging

# Enable search logging
logging.basicConfig(level=logging.INFO)

# Specific loggers:
rag_logger = logging.getLogger('rag_system')
rag_logger.setLevel(logging.INFO)
Log messages include:
  • Query embedding generation time
  • Vector search results count
  • Keyword search results count
  • Result combination time
  • Total search time
  • Per-quarter search results
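Timing messages like those listed above are typically produced by wrapping each step in a small timer. The sketch below shows one way to do this with the standard library. `log_duration` is a hypothetical helper, and the message format is an assumption rather than the engine's actual log output.

```python
import logging
import time
from contextlib import contextmanager
from typing import Iterator

logger = logging.getLogger("rag_system")


@contextmanager
def log_duration(step: str) -> Iterator[None]:
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.info("%s took %.1f ms", step, elapsed_ms)


logging.basicConfig(level=logging.INFO)

with log_duration("Keyword search"):
    sum(range(1000))  # placeholder for the real work
```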

Error Handling

The search engine handles errors gracefully:
  • Embedding Errors: Logged; the search returns empty results
  • Database Errors: Caught and logged; the search returns empty results
  • Query Parsing Errors: Logged as a warning; the engine then attempts a fallback
  • Multi-Quarter Failures: Individual quarter failures don’t stop other quarters
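The last point, isolating per-quarter failures, is a standard use of `asyncio.gather` with `return_exceptions=True`. The sketch below illustrates the pattern under stated assumptions. `search_quarter` is a stand-in that simulates one failing quarter, and `search_quarters` is a hypothetical wrapper, not the engine's internal method.

```python
import asyncio
import logging
from typing import Any, Dict, List

logger = logging.getLogger("rag_system")


async def search_quarter(query: str, quarter: str) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for a per-quarter database search."""
    if quarter == "2024_q3":
        raise ConnectionError("simulated database failure")
    return [{"chunk_text": f"{query} ({quarter})", "source_quarter": quarter}]


async def search_quarters(query: str, quarters: List[str]) -> List[Dict[str, Any]]:
    # return_exceptions=True delivers failures as values instead of
    # cancelling the whole batch on the first error.
    outcomes = await asyncio.gather(
        *(search_quarter(query, q) for q in quarters),
        return_exceptions=True,
    )
    results: List[Dict[str, Any]] = []
    for quarter, outcome in zip(quarters, outcomes):
        if isinstance(outcome, Exception):
            logger.warning("Search failed for %s: %s", quarter, outcome)
            continue
        results.extend(outcome)
    return results


results = asyncio.run(
    search_quarters("revenue", ["2024_q3", "2024_q4", "2025_q1"])
)
```

Because `gather` returns outcomes in the order the tasks were passed in, zipping them with the quarter list lets each failure be attributed to the right quarter.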

RAGAgent

Main agent that orchestrates search operations

Agent

Top-level agent interface
