
Overview

The BaseConfig dataclass is the central configuration object for the ReMem framework. It controls all aspects of the system, including LLM settings, embedding models, graph construction, retrieval, and evaluation parameters.

Constructor

@dataclass
class BaseConfig:
    # All parameters have sensible defaults

All parameters are optional with default values. You can override any subset of parameters.

Example

from remem.utils.config_utils import BaseConfig

# Use all defaults
config = BaseConfig()

# Override specific parameters
config = BaseConfig(
    llm_name="gpt-4o",
    embedding_model_name="nvidia/NV-Embed-v2",
    retrieval_top_k=20,
    qa_top_k=5
)

LLM Parameters

Configuration for language model behavior and API settings.

llm_name

llm_name
str
default:"gpt-4o-mini"
Name of the language model to use for general inference.
config = BaseConfig(llm_name="gpt-4o")

extract_llm_label

extract_llm_label
Optional[str]
default:"None"
Label for the LLM used in information extraction. Defaults to llm_name if not specified.

qa_llm_label

qa_llm_label
Optional[str]
default:"None"
Label for the LLM used in question answering. Defaults to llm_name if not specified.
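Because both labels fall back to llm_name, you only need to set them when extraction and QA should use different models. A minimal sketch:

```python
from remem.utils.config_utils import BaseConfig

# Use a cheaper model for bulk extraction and a stronger one for QA
config = BaseConfig(
    llm_name="gpt-4o-mini",           # fallback for any unset label
    extract_llm_label="gpt-4o-mini",  # information extraction
    qa_llm_label="gpt-4o",            # question answering
)
```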

llm_base_url

llm_base_url
Optional[str]
default:"None"
Base URL for the LLM API. If None, uses the default OpenAI service.
config = BaseConfig(
    llm_name="llama-3-70b",
    llm_base_url="http://localhost:8000/v1"
)

max_new_tokens

max_new_tokens
int
default:"2048"
Maximum number of new tokens to generate in each inference call.

num_gen_choices

num_gen_choices
int
default:"1"
Number of chat completion choices to generate for each input message.

seed

seed
Optional[int]
default:"None"
Random seed for reproducibility.

temperature

temperature
float
default:"0"
Sampling temperature for LLM generation. 0 means deterministic.

extract_format

extract_format
Optional[Literal["json_object", "json_schema"]]
default:"None"
Response format specification for extraction tasks.

use_azure

use_azure
bool
default:"False"
Whether to use Azure OpenAI service instead of standard OpenAI.

max_num_seqs

max_num_seqs
int
default:"256"
Maximum number of sequences to generate for vLLM offline mode.

max_model_len

max_model_len
int
default:"4096"
Maximum context length (in tokens) for the model.

max_retries

max_retries
int
default:"10"
Maximum number of retry attempts for asynchronous API calls.

Storage & Indexing Parameters

force_openie_from_scratch

force_openie_from_scratch
bool
default:"False"
If True, ignores existing OpenIE results and rebuilds from scratch.

force_index_from_scratch

force_index_from_scratch
bool
default:"False"
If True, ignores all existing storage files and graph data and rebuilds from scratch.
Setting this to True will delete all previously indexed data and embeddings.

save_openie

save_openie
bool
default:"True"
Whether to save OpenIE extraction results to disk.

save_dir

save_dir
Optional[str]
default:"None"
Directory to save all related files. If not specified:
  • For dataset-specific runs: outputs/{dataset}/
  • For general use: outputs/
config = BaseConfig(save_dir="/path/to/my/output")

Text Preprocessing Parameters

text_preprocessor_class_name

text_preprocessor_class_name
str
default:"TextPreprocessor"
Name of the text preprocessor class to use.

preprocess_encoder_name

preprocess_encoder_name
str
default:"gpt-4o"
Name of the encoder for tokenization during document chunking.

preprocess_chunk_overlap_token_size

preprocess_chunk_overlap_token_size
int
default:"128"
Number of overlapping tokens between consecutive chunks.

preprocess_chunk_max_token_size

preprocess_chunk_max_token_size
Optional[int]
default:"None"
Maximum token size for each chunk. If None, treats the entire document as a single chunk.
config = BaseConfig(
    preprocess_chunk_max_token_size=512,
    preprocess_chunk_overlap_token_size=64
)

preprocess_chunk_func

preprocess_chunk_func
str
default:"by_token"
Chunking function to use for document preprocessing.

Information Extraction Parameters

extract_method

extract_method
Literal["openie", "episodic", "episodic_gist", "temporal"]
default:"openie"
Information extraction method:
  • "openie": Standard open information extraction (entities and triples)
  • "episodic": Episodic memory extraction for conversations
  • "episodic_gist": Episodic extraction with gist summaries
  • "temporal": Temporal event extraction
# For conversation data
config = BaseConfig(extract_method="episodic_gist")

# For standard documents
config = BaseConfig(extract_method="openie")

llm_infer_mode

llm_infer_mode
Literal["offline", "online"]
default:"online"
Inference mode for LLM calls:
  • "online": Use API-based LLM calls
  • "offline": Use vLLM for local batch inference

skip_graph

skip_graph
bool
default:"False"
Whether to skip graph construction. Set to True when running vLLM offline indexing for the first time.

vllm_tensor_parallel_size

vllm_tensor_parallel_size
int
default:"2"
Tensor parallel size for vLLM offline mode.
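Putting the vLLM-related parameters together: per the skip_graph note above, a first offline pass typically runs extraction only, deferring graph construction to a later pass. A sketch, where the model name is a hypothetical locally served model:

```python
from remem.utils.config_utils import BaseConfig

# Pass 1: offline extraction with vLLM; defer graph construction
config = BaseConfig(
    llm_name="meta-llama/Llama-3.1-70B-Instruct",  # hypothetical local model
    llm_infer_mode="offline",
    skip_graph=True,               # build the graph in a later pass
    max_num_seqs=256,              # vLLM batch width
    max_model_len=4096,            # context length in tokens
    vllm_tensor_parallel_size=2,   # split the model across 2 GPUs
)
```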

Embedding Parameters

embedding_model_name

embedding_model_name
str
default:"nvidia/NV-Embed-v2"
Name of the embedding model to use.
config = BaseConfig(embedding_model_name="text-embedding-3-large")

embedding_batch_size

embedding_batch_size
int
default:"16"
Batch size for embedding model calls.

embedding_return_as_normalized

embedding_return_as_normalized
bool
default:"True"
Whether to normalize encoded embeddings.

embedding_max_seq_len

embedding_max_seq_len
int
default:"2048"
Maximum sequence length for the embedding model.

Graph Construction Parameters

concatenate_gists_per_chunk

concatenate_gists_per_chunk
bool
default:"False"
For episodic_gist method:
  • False: Each gist becomes a separate node
  • True: All gists in a chunk are joined into one node

split_verbatim_per_chunk

split_verbatim_per_chunk
bool
default:"True"
For conversation data:
  • True: Split multi-message conversations into individual message nodes
  • False: Keep each verbatim chunk as a single node

synonymy_edge_topk

synonymy_edge_topk
int
default:"2047"
Number of nearest neighbors (k) for KNN retrieval when building synonymy edges between entities.

synonymy_edge_query_batch_size

synonymy_edge_query_batch_size
int
default:"1000"
Batch size for query embeddings during synonymy edge construction.

synonymy_edge_key_batch_size

synonymy_edge_key_batch_size
int
default:"10000"
Batch size for key embeddings during synonymy edge construction.

synonymy_edge_sim_threshold

synonymy_edge_sim_threshold
float
default:"0.8"
Similarity threshold (0-1) for including candidate synonymy edges.
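The four synonymy parameters trade edge recall against index size: a higher similarity threshold and a lower top-k produce fewer, higher-precision synonymy edges. A sketch of a stricter-than-default setup:

```python
from remem.utils.config_utils import BaseConfig

# Stricter synonymy edges: fewer neighbor candidates, higher similarity bar
config = BaseConfig(
    synonymy_edge_topk=100,           # consider fewer nearest neighbors per entity
    synonymy_edge_sim_threshold=0.9,  # keep only near-duplicate entity pairs
    synonymy_edge_query_batch_size=1000,
    synonymy_edge_key_batch_size=10000,
)
```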

is_directed_graph

is_directed_graph
bool
default:"False"
Whether to construct a directed or undirected knowledge graph.

graph_type

graph_type
Literal["dpr_only", "facts_and_sim", "facts_and_sim_passage_node_unidirectional"]
default:"facts_and_sim_passage_node_unidirectional"
Type of graph to construct:
  • "dpr_only": Dense passage retrieval only (no graph)
  • "facts_and_sim": Graph with facts and similarity edges
  • "facts_and_sim_passage_node_unidirectional": Facts, similarities, and unidirectional passage edges

Retrieval Parameters

linking_top_k

linking_top_k
int
default:"5"
Number of linked nodes to consider at each retrieval step.

retrieval_top_k

retrieval_top_k
int
default:"200"
Number of documents to retrieve for each query.
config = BaseConfig(
    retrieval_top_k=100,  # Retrieve 100 documents
    qa_top_k=5            # Use top 5 for QA
)

damping

damping
float
default:"0.5"
Damping factor for Personalized PageRank algorithm.

passage_node_weight

passage_node_weight
float
default:"0.05"
Multiplicative weight factor for passage nodes in PageRank.
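damping and passage_node_weight jointly shape the Personalized PageRank scores: in standard PPR, lower damping keeps probability mass closer to the query's seed nodes, and a small passage weight keeps passage nodes from dominating entity nodes. For example:

```python
from remem.utils.config_utils import BaseConfig

# Keep PageRank mass close to the query's seed nodes
config = BaseConfig(
    damping=0.5,               # lower values stay nearer the seeds
    passage_node_weight=0.05,  # down-weight passage nodes vs. entity nodes
    linking_top_k=5,           # linked nodes considered per step
    retrieval_top_k=200,       # documents returned per query
)
```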

rerank_dspy_file_path

rerank_dspy_file_path
Optional[str]
default:"None"
Path to a DSPy reranker model file for fact filtering.

Question Answering Parameters

qa_top_k

qa_top_k
int
default:"5"
Number of top-ranked documents to feed to the QA model.

qa_passage_prefix

qa_passage_prefix
str
default:"Wikipedia Title: "
Prefix to add before each passage in the QA context.

qa_prompt_template

qa_prompt_template
Optional[str]
default:"None"
Name of the prompt template to use for QA tasks.

qa_reader

qa_reader
Literal["remem", "tiser"]
default:"remem"
QA reader implementation to use.

Agent Parameters

For episodic and temporal extraction methods with agentic reasoning.

agent_fixed_tools

agent_fixed_tools
bool
default:"False"
Controls the agent's toolset:
  • True: Agent uses only semantic_retrieve + output_answer
  • False: Agent can select from the full toolset

agent_max_steps

agent_max_steps
int
default:"5"
Maximum reasoning steps for the agent. For fixed_tools mode:
  • 1 = semantic_retrieve only
  • 2 = semantic_retrieve + output_answer

agent_fixed_retrieval_tool

agent_fixed_retrieval_tool
str
default:"semantic_retrieve"
Which retrieval tool to use in fixed_tools mode: "semantic_retrieve" or "lexical_retrieve".
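The three agent parameters combine as follows in fixed-tools mode, where two steps cover one retrieval call plus the final answer:

```python
from remem.utils.config_utils import BaseConfig

# Restrict the agent to a single retrieve-then-answer loop
config = BaseConfig(
    extract_method="episodic",                     # agentic methods only
    agent_fixed_tools=True,                        # no free tool selection
    agent_max_steps=2,                             # retrieve, then answer
    agent_fixed_retrieval_tool="lexical_retrieve", # swap in lexical retrieval
)
```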

Evaluation Parameters

do_eval_retrieval

do_eval_retrieval
bool
default:"True"
Whether to perform evaluation on retrieval results.

do_eval_qa

do_eval_qa
bool
default:"True"
Whether to perform evaluation on QA results.
Evaluation requires gold-standard data (gold_docs for retrieval, gold_answers for QA).

Dataset Parameters

dataset

dataset
Optional[str]
default:"None"
Name of the dataset being used. If specified, customizes the save directory and potentially the prompt templates.
config = BaseConfig(dataset="musique")
# save_dir will be: outputs/musique/

corpus_len

corpus_len
Optional[int]
default:"None"
Length of the corpus to use (for testing with subsets).
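corpus_len is useful for smoke tests on a small slice of a dataset, often combined with a cheaper LLM. A sketch:

```python
from remem.utils.config_utils import BaseConfig

# Quick smoke test on the first 100 corpus documents
config = BaseConfig(
    dataset="musique",       # save_dir becomes outputs/musique/
    corpus_len=100,          # index only a subset of the corpus
    llm_name="gpt-4o-mini",  # cheaper model for iteration
)
```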

Complete Configuration Example

from remem.utils.config_utils import BaseConfig

# Production configuration for conversation data
config = BaseConfig(
    # LLM settings
    llm_name="gpt-4o",
    temperature=0,
    max_new_tokens=2048,
    
    # Extraction settings
    extract_method="episodic_gist",
    llm_infer_mode="online",
    
    # Embedding settings
    embedding_model_name="nvidia/NV-Embed-v2",
    embedding_batch_size=32,
    
    # Preprocessing
    preprocess_chunk_max_token_size=512,
    preprocess_chunk_overlap_token_size=64,
    
    # Graph construction
    synonymy_edge_sim_threshold=0.85,
    is_directed_graph=False,
    concatenate_gists_per_chunk=True,
    split_verbatim_per_chunk=True,
    
    # Retrieval
    retrieval_top_k=100,
    linking_top_k=10,
    damping=0.5,
    passage_node_weight=0.05,
    
    # QA
    qa_top_k=5,
    qa_passage_prefix="Context: ",
    
    # Agent (for episodic methods)
    agent_fixed_tools=False,
    agent_max_steps=5,
    
    # Evaluation
    do_eval_retrieval=True,
    do_eval_qa=True,
    
    # Storage
    save_dir="outputs/my_experiment",
    force_index_from_scratch=False
)

print(config.save_dir)  # outputs/my_experiment

Configuration Best Practices

For Document QA

BaseConfig(
    extract_method="openie",
    preprocess_chunk_max_token_size=512,
    retrieval_top_k=200,
    qa_top_k=5
)

For Conversations

BaseConfig(
    extract_method="episodic_gist",
    split_verbatim_per_chunk=True,
    concatenate_gists_per_chunk=True
)

For Fast Experimentation

BaseConfig(
    llm_name="gpt-4o-mini",
    embedding_batch_size=64,
    retrieval_top_k=50
)

For Production

BaseConfig(
    llm_name="gpt-4o",
    max_retries=10,
    do_eval_retrieval=True,
    do_eval_qa=True,
    save_openie=True
)
