
Quickstart

This guide will help you build your first REMem application in just a few minutes. You’ll learn how to index documents and query them using REMem’s hybrid memory graph.

Prerequisites

Before you begin, make sure you have:
  • Python 3.10 or higher installed
  • An OpenAI API key (for the LLM and embeddings)

Installation

1. Install REMem

Install REMem using pip:
pip install remem
For development or to run from source:
git clone https://github.com/intuit-ai-research/REMem.git
cd REMem
pip install -e .
2. Set up API keys

Set your OpenAI API key as an environment variable:
export OPENAI_API_KEY="your-api-key-here"
You can also configure Azure OpenAI or use local models with vLLM. See the Installation guide for more options.
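Before going further, it can help to verify the key is actually visible to your Python process. This is a generic sanity check, not part of the REMem API:

```python
import os

def check_openai_key() -> bool:
    """Return True if OPENAI_API_KEY is set and non-empty."""
    return bool(os.environ.get("OPENAI_API_KEY"))

if not check_openai_key():
    print("OPENAI_API_KEY is not set; export it before running REMem.")
```

Note that environment variables set with export only persist for the current shell session.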

Your first REMem application

1. Import and configure

Start by importing REMem and creating a configuration:
from remem.remem import ReMem
from remem.utils.config_utils import BaseConfig

config = BaseConfig(
    dataset="sample",
    extract_method="episodic_gist",
    llm_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2",
)
The extract_method parameter determines how REMem processes your documents:
  • openie — Fast entity and triple extraction
  • episodic — Episodic fact extraction with context
  • episodic_gist — Adds gist memories for associative recall (recommended)
  • temporal — Best for time-sensitive questions
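Switching methods only changes one field of the config. For example, a sketch of a configuration tuned for time-sensitive questions (the method names come from the list above; the other values match the earlier example):

```python
from remem.utils.config_utils import BaseConfig

# Same configuration as above, but using temporal extraction
# for questions where dates and ordering matter.
config = BaseConfig(
    dataset="sample",
    extract_method="temporal",
    llm_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2",
)
```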
2. Initialize REMem

Create a REMem instance with your configuration:
rag = ReMem(global_config=config)
This initializes the hybrid memory graph, embedding stores, and extraction pipeline.
3. Index documents

Add documents to REMem’s memory graph:
docs = [
    "Alan Turing proposed the Turing Test in 1950.",
    "Grace Hopper pioneered COBOL and popularized the term 'debugging'.",
]
rag.index(docs)
During indexing, REMem:
  1. Chunks and normalizes your documents
  2. Extracts entities, facts, and gist traces
  3. Generates embeddings for semantic search
  4. Builds the hybrid memory graph with connections
Indexing can take a few moments depending on the number of documents and the extraction method. The results are cached for future use.
4. Query the system

Ask questions and get answers:
solutions, responses, meta = rag.rag_for_qa(
    ["Who proposed the Turing Test?", "Who worked on COBOL?"]
)

for s in solutions:
    print(s.question, "->", s.answer)
Output:
Who proposed the Turing Test? -> Alan Turing
Who worked on COBOL? -> Grace Hopper

Complete example

Here’s the complete code:
from remem.remem import ReMem
from remem.utils.config_utils import BaseConfig

# Configure REMem
config = BaseConfig(
    dataset="sample",
    extract_method="episodic_gist",
    llm_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2",
)

# Initialize
rag = ReMem(global_config=config)

# Index documents
docs = [
    "Alan Turing proposed the Turing Test in 1950.",
    "Grace Hopper pioneered COBOL and popularized the term 'debugging'.",
]
rag.index(docs)

# Query
solutions, responses, meta = rag.rag_for_qa(
    ["Who proposed the Turing Test?", "Who worked on COBOL?"]
)

# Print results
for s in solutions:
    print(s.question, "->", s.answer)

Running benchmarks

REMem includes support for several research benchmarks. To run a benchmark:
python main.py --dataset musique --llm_name gpt-4o-mini --embedding_name nvidia/NV-Embed-v2
Benchmark datasets are available in the reproduce/dataset/ directory. See the examples/ folder for dataset-specific scripts.

What’s in the response?

The rag_for_qa method returns three objects:
  • solutions: List of QuerySolution objects containing questions, answers, and reasoning traces
  • responses: Raw LLM responses
  • meta: Metadata including retrieval scores, graph traversal paths, and timing information
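The example above only reads question and answer; any other fields on QuerySolution depend on the version you have installed. As a mental model only, here is an illustrative stand-in, not REMem's actual class definition:

```python
from dataclasses import dataclass, field

# Illustrative stand-in; inspect remem's QuerySolution for the real fields.
@dataclass
class QuerySolution:
    question: str
    answer: str
    docs: list = field(default_factory=list)  # hypothetical: retrieved context

s = QuerySolution("Who proposed the Turing Test?", "Alan Turing")
print(s.question, "->", s.answer)
```

In practice, run dir(solutions[0]) or read the source to see which fields your installed version exposes.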

Configuration options

The quickstart above uses the four core parameters:
config = BaseConfig(
    dataset="sample",                           # dataset identifier (a benchmark name or your own label)
    extract_method="episodic_gist",             # how documents are processed (see the list above)
    llm_name="gpt-4o-mini",                     # LLM used for extraction and answering
    embedding_model_name="nvidia/NV-Embed-v2",  # embedding model for semantic search
)
See the Configuration guide for the full list of options.

Next steps

  • Installation: Learn about different installation options and embedding models
  • Configuration: Deep dive into configuration options and extraction methods
  • Architecture: Understand how REMem’s hybrid memory graph works
  • Examples: Browse complete examples and benchmark scripts
