
Overview

The retrieve() method implements REMem’s multi-stage retrieval pipeline, combining fact-based graph traversal with dense passage ranking and Personalized PageRank (PPR) re-ranking.

Basic Usage

1. Index Your Documents

First, ensure you’ve indexed your documents:
from remem import ReMem
from remem.utils.config_utils import BaseConfig

config = BaseConfig(
    llm_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2",
    retrieval_top_k=20
)

# docs is a list of passage strings to index
docs = [
    "Machine learning is a field of AI that learns patterns from data.",
    "Neural networks are layered models trained with backpropagation.",
]

remem = ReMem(global_config=config, working_dir="./remem_data")
remem.index(docs)

2. Prepare Queries

Format your queries as a list of strings:
queries = [
    "What is machine learning?",
    "How do neural networks work?"
]

3. Retrieve Passages

Call retrieve() to get relevant passages:
results = remem.retrieve(queries, num_to_retrieve=20)

# Access results for first query
first_result = results[0]
print(f"Query: {first_result.question}")
print(f"Top passages: {first_result.docs[:3]}")
print(f"Scores: {first_result.doc_scores[:3]}")

Return Format

The retrieve() method returns a list of QuerySolution objects:
from remem.utils.misc_utils import QuerySolution

# Each QuerySolution contains:
result = results[0]
result.question        # str: Original query
result.docs            # List[str]: Retrieved passages
result.doc_scores      # List[float]: Relevance scores
result.graph_seeds     # List: Facts used as graph seeds
result.doc_metadata    # List[Dict]: Metadata for each passage

Retrieval Pipeline

REMem’s retrieval consists of four stages:

1. Fact Retrieval

Queries are matched against extracted facts using semantic similarity:
# Internally, ReMem:
# 1. Embeds the query
# 2. Retrieves top-k similar facts from the knowledge graph
# 3. Uses facts as seeds for graph traversal
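
To make the idea concrete, here is a minimal sketch of fact matching with cosine similarity over L2-normalized embeddings. The function and array names are illustrative assumptions, not REMem's actual API:

import numpy as np

def top_k_facts(query_emb, fact_embs, k=20):
    # Assumes query_emb (d,) and fact_embs (num_facts, d) are L2-normalized,
    # so the dot product equals cosine similarity.
    scores = fact_embs @ query_emb
    top_idx = np.argsort(-scores)[:k]  # highest-scoring facts first
    return top_idx, scores[top_idx]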

2. Recognition Memory

An optional reranking step filters the candidate facts, keeping only those relevant to the query:
config = BaseConfig(
    # Enable reranking with trained DSPy filter
    rerank_dspy_file_path="src/remem/prompts/dspy_prompts/filter_llama3.3-70B-Instruct.json"
)
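
Conceptually, this step acts as a relevance judge over the candidate facts. The sketch below is illustrative only; in REMem the judge is the trained DSPy filter loaded from the file above:

def recognition_filter(query, candidate_facts, judge):
    # judge(query, fact) -> bool stands in for the trained DSPy filter.
    # An empty result triggers the DPR fallback described below.
    return [fact for fact in candidate_facts if judge(query, fact)]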

3. Dense Passage Scoring

Passages connected to facts are scored using embeddings:
config = BaseConfig(
    linking_top_k=5,  # Number of entities to link per fact
    retrieval_top_k=200  # Total passages to retrieve
)
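
A minimal sketch of this scoring step, assuming L2-normalized embeddings and a candidate set of passages linked to the retained facts (all names are illustrative):

import numpy as np

def score_passages(query_emb, passage_embs, candidate_ids):
    # Score only the passages connected to the retained facts.
    sims = passage_embs[candidate_ids] @ query_emb  # cosine similarity
    order = np.argsort(-sims)
    return [candidate_ids[i] for i in order], sims[order]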

4. Personalized PageRank

Graph-based re-ranking propagates scores through the knowledge graph:
config = BaseConfig(
    damping=0.5,  # PPR damping factor (0-1)
    passage_node_weight=0.05  # Weight for passage nodes in PPR
)
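
For intuition, here is a sketch of PPR re-ranking using networkx. REMem's own graph backend may differ; treating the retained facts as reset seeds and giving passage nodes a small passage_node_weight is an assumption about how the personalization vector is built:

import networkx as nx

def ppr_rerank(G, seed_facts, passage_nodes, damping=0.5, passage_node_weight=0.05):
    # Reset mass concentrates on the seed fact nodes; passage nodes get a
    # small weight so the walk can also restart at them.
    personalization = {node: 0.0 for node in G.nodes}
    for fact in seed_facts:
        personalization[fact] = 1.0
    for passage in passage_nodes:
        personalization[passage] = passage_node_weight
    scores = nx.pagerank(G, alpha=damping, personalization=personalization)
    # Rank passages by their stationary PPR score.
    return sorted(passage_nodes, key=lambda p: scores[p], reverse=True)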

Configuring Retrieval

Number of Results

Control how many passages to retrieve:
# Via config (default for all retrievals)
config = BaseConfig(
    retrieval_top_k=200  # Retrieve top 200 passages
)

# Or per-query
results = remem.retrieve(queries, num_to_retrieve=50)

Linking Parameters

Adjust graph traversal depth:
config = BaseConfig(
    linking_top_k=5,  # Entities to link at each retrieval step
    synonymy_edge_topk=2047,  # KNN neighbors for synonym detection
    synonymy_edge_sim_threshold=0.8  # Synonym similarity threshold
)
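
To illustrate the synonymy parameters, here is a sketch of building synonym edges from KNN neighbors above a similarity threshold (illustrative, not REMem's implementation):

import numpy as np

def synonym_edges(entity_embs, topk=2047, threshold=0.8):
    # Cosine similarity between all pairs of L2-normalized entity embeddings.
    sims = entity_embs @ entity_embs.T
    edges = []
    for i in range(sims.shape[0]):
        neighbors = np.argsort(-sims[i])[1 : topk + 1]  # rank 0 is the entity itself
        for j in neighbors:
            if sims[i, j] >= threshold:
                edges.append((i, int(j)))  # synonymy edge between entities i and j
    return edges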

Graph Weights

Balance the importance of passage nodes against entity nodes in the graph:
config = BaseConfig(
    passage_node_weight=0.05,  # Multiplicative factor for passage nodes in PPR
    damping=0.5  # Random walk damping factor; lower values keep scores closer to the seed nodes
)

Fallback Behavior

If no relevant facts are found after reranking, REMem automatically falls back to dense passage retrieval (DPR) using only embeddings.
# From remem.py:523-529
if len(top_k_triples) == 0:
    logger.info("No triple found after reranking, return DPR results")
    sorted_chunk_ids, sorted_chunk_scores = self.dense_passage_retrieval(query)
else:
    # Use query-to-triple to search on the graph
    sorted_chunk_ids, sorted_chunk_scores = self.graph_search_with_fact_entities(...)

Complete Example from main.py

main.py
import json
from remem import ReMem
from remem.utils.config_utils import BaseConfig

# Load dataset
corpus = json.load(open("reproduce/dataset/musique_corpus.json", "r"))
samples = json.load(open("reproduce/dataset/musique.json", "r"))

docs = [f"{doc['title']}\n{doc['text']}" for doc in corpus]
queries = [s["question"] for s in samples]

# Configure
config = BaseConfig(
    llm_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2",
    dataset="musique",
    retrieval_top_k=200,
    linking_top_k=5,
    rerank_dspy_file_path="src/remem/prompts/dspy_prompts/filter_llama3.3-70B-Instruct.json"
)

remem = ReMem(global_config=config, working_dir="./outputs/musique")
remem.index(docs)

# Retrieve
results = remem.retrieve(queries)

# Process results
for i, result in enumerate(results[:5]):
    print(f"\nQuery {i+1}: {result.question}")
    print(f"Top 3 passages:")
    for j, (doc, score) in enumerate(zip(result.docs[:3], result.doc_scores[:3])):
        print(f"  {j+1}. [{score:.4f}] {doc[:100]}...")

Extraction Method Impact

Retrieval behavior changes based on extraction method:

OpenIE (Default)

Uses fact-entity-passage graph traversal:
config = BaseConfig(extract_method="openie")

Episodic Methods

Uses specialized retrieval with multiple embedding stores:
config = BaseConfig(
    extract_method="episodic_gist",  # or "episodic", "temporal"
    agent_fixed_tools=False,  # Enable full tool selection
    agent_max_steps=5  # Maximum reasoning steps
)

Accessing Metadata

Retrieved passages include metadata when available:
results = remem.retrieve(queries)

for result in results:
    for doc, metadata in zip(result.docs[:5], result.doc_metadata[:5]):
        if metadata:
            print(f"Date: {metadata.get('date')}")
            print(f"Role: {metadata.get('role')}")
            print(f"Content: {metadata.get('content')}")

Performance Considerations

Embedding Cache

Query embeddings are cached automatically:
# From remem.py:492
self.get_query_embeddings(queries)  # Cached for reuse
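
Conceptually, the cache maps each query string to its embedding so repeated queries are embedded only once. A minimal sketch of the idea (the real cache lives inside ReMem):

class EmbeddingCache:
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # e.g., a call to the embedding model
        self._cache = {}

    def get(self, query):
        # Embed each unique query string at most once.
        if query not in self._cache:
            self._cache[query] = self.embed_fn(query)
        return self._cache[query]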

Batch Retrieval

Process multiple queries efficiently:
# Better: batch processing
queries = ["query1", "query2", "query3", ...]
results = remem.retrieve(queries)  # Processes all at once

# Avoid: sequential calls
for query in queries:
    result = remem.retrieve([query])  # Slower

Next Steps

Question Answering

Use retrieved passages for QA

Evaluation

Evaluate retrieval performance
