
Installation

This guide covers everything you need to install and configure REMem for your use case.

Requirements

Python version

REMem requires Python 3.10 or higher. You can check your Python version:
python --version
REMem is tested with Python 3.10, 3.11, and 3.12.

System requirements

  • Memory: At least 8GB RAM (16GB+ recommended for large datasets)
  • GPU: Optional but recommended for faster embedding generation and offline LLM inference
  • Storage: Varies based on dataset size and caching
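The requirements above can be sanity-checked with a short stdlib-only script before installing. This is a sketch: the 10 GB free-disk threshold is an illustrative assumption, not a documented REMem minimum.

```python
import shutil
import sys


def check_environment(min_python=(3, 10), min_free_gb=10):
    """Report whether the interpreter and disk meet REMem's minimums."""
    python_ok = sys.version_info >= min_python
    print(f"Python {sys.version.split()[0]}: {'OK' if python_ok else 'too old, need 3.10+'}")

    # Free space on the current volume; models and caches can be large.
    free_gb = shutil.disk_usage(".").free / 1e9
    print(f"Free disk: {free_gb:.1f} GB ({'OK' if free_gb >= min_free_gb else 'low'})")

    return python_ok and free_gb >= min_free_gb


check_environment()
```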

Installation methods

1. Install from PyPI

REMem is not yet published to PyPI. Use the source installation method below.
2. Install from source (recommended)

Clone the repository and install in editable mode:
# Clone the repository
git clone https://github.com/intuit-ai-research/REMem.git
cd REMem

# Install in editable mode
pip install -e .
This installs REMem along with all required dependencies from pyproject.toml.
3. Install from requirements.txt

Alternatively, install dependencies from the requirements file:
pip install -r requirements.txt
pip install -e .

Core dependencies

REMem automatically installs these core dependencies:

Graph and numerical computation

  • networkx (3.4.2) — Graph algorithms and structures
  • python_igraph (0.11.8) — Fast graph operations
  • numpy (1.26.4) — Numerical computing
  • scipy (1.14.1) — Scientific computing

Machine learning and embeddings

  • torch (2.6.0) — PyTorch for deep learning
  • sentence_transformers (3.3.1) — Embedding model interface
  • transformers (4.51.1) — Hugging Face transformers
  • nano_vectordb (0.0.4.3) — Lightweight vector database

LLM integration

  • openai (≥1.0.0) — OpenAI API client
  • vllm (0.8.5.post1) — Offline LLM inference
  • dspy (2.5.29) — DSPy for prompt optimization
  • tiktoken (0.7.0) — Token counting

Utilities

  • tqdm (4.66.6) — Progress bars
  • tenacity (8.5.0) — Retry logic
  • pydantic (2.10.4) — Data validation
  • pandas — Data manipulation
  • nltk — Natural language processing
See pyproject.toml in the repository for the complete list of dependencies and version constraints.
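To confirm which versions actually got installed, you can query package metadata directly with the standard library. The package list below is a sample drawn from this page; adjust it to match pyproject.toml.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return installed versions, or 'not installed' for missing packages."""
    report = {}
    for pkg in packages:
        try:
            report[pkg] = version(pkg)
        except PackageNotFoundError:
            report[pkg] = "not installed"
    return report


for pkg, ver in report_versions(["networkx", "numpy", "torch", "openai"]).items():
    print(f"{pkg:20s} {ver}")
```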

API keys and configuration

OpenAI API

For online mode with OpenAI models:
export OPENAI_API_KEY="your-api-key-here"
Optionally set a custom base URL:
export OPENAI_BASE_URL="https://api.openai.com/v1"
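Before running REMem in online mode, it can help to fail fast if the key is missing. A minimal sketch; the variable names match the exports above, and the fallback base URL is the public OpenAI endpoint.

```python
import os


def get_openai_settings():
    """Read OpenAI settings from the environment, raising early if the key is absent."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running REMem.")
    # The base URL is optional and falls back to the public endpoint.
    base_url = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    return api_key, base_url
```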

Azure OpenAI

For Azure OpenAI deployments:
export AZURE_OPENAI_API_KEY="your-azure-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"

Environment variables

REMem respects these environment variables:
# CUDA configuration
export CUDA_DEVICE_ORDER="PCI_BUS_ID"
export CUDA_VISIBLE_DEVICES="0"  # Specify GPU

# Tokenizer settings
export TOKENIZERS_PARALLELISM="false"  # Disable tokenizer warnings

# Logging
export LOG_LEVEL="INFO"  # DEBUG, INFO, WARNING, ERROR
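The same variables can be set from Python, provided this happens before the libraries that read them (torch, vllm, tokenizers) are imported. The values here mirror the examples above.

```python
import os

# CUDA device selection must be set before torch/vllm are imported.
os.environ.setdefault("CUDA_DEVICE_ORDER", "PCI_BUS_ID")
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")

# Silence Hugging Face tokenizer fork warnings.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")

os.environ.setdefault("LOG_LEVEL", "INFO")
```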

Embedding models

REMem supports multiple embedding models.

NV-Embed-v2 (default)
config = BaseConfig(
    embedding_model_name="nvidia/NV-Embed-v2",
)
NV-Embed-v2 provides state-of-the-art retrieval performance and is the default choice for most use cases.

OpenAI embeddings

config = BaseConfig(
    embedding_model_name="text-embedding-3-large",
)

GritLM

config = BaseConfig(
    embedding_model_name="gritlm/gritlm-7b",
)

Qwen3

config = BaseConfig(
    embedding_model_name="Qwen/Qwen3-Embedding-7B",
)
Local embedding models (NV-Embed-v2, GritLM, Qwen3) will be downloaded from Hugging Face on first use. Make sure you have sufficient disk space.

LLM backends

Online mode (OpenAI API)

Default mode for development and smaller workloads:
config = BaseConfig(
    llm_name="gpt-4o-mini",
    llm_base_url="https://api.openai.com/v1",
    llm_infer_mode="online",
)
Supported models:
  • gpt-4o-mini (recommended for cost-effectiveness)
  • gpt-4o
  • gpt-3.5-turbo

Offline mode (vLLM)

For batch processing and local inference:
config = BaseConfig(
    llm_name="meta-llama/Llama-3.1-8B-Instruct",
    llm_infer_mode="offline",
)
Offline mode is more efficient for large-scale indexing and benchmarking, as it batches requests and runs locally.

Verify installation

Run this simple script to verify your installation:
from remem.remem import ReMem
from remem.utils.config_utils import BaseConfig

print("REMem imported successfully!")

# Test basic initialization
config = BaseConfig(
    dataset="test",
    extract_method="openie",
    llm_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2",
)

rag = ReMem(global_config=config)
print(f"REMem initialized with working directory: {rag.working_dir}")
print("Installation verified!")
Expected output:
REMem imported successfully!
REMem initialized with working directory: outputs/test/ReMem_2026-03-03-12:00:00
Installation verified!

Troubleshooting

Import errors

If you encounter import errors, make sure all dependencies are installed:
pip install -e . --force-reinstall

CUDA/GPU issues

If you have GPU issues, ensure PyTorch is installed correctly:
# For CUDA 11.8
pip install torch --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch --index-url https://download.pytorch.org/whl/cu121
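Before debugging PyTorch itself, a quick stdlib-only check tells you whether the NVIDIA driver is visible at all (it returns the path to nvidia-smi, or None on CPU-only machines):

```python
import shutil


def nvidia_smi_path():
    """Locate the nvidia-smi binary; None means no NVIDIA driver on PATH."""
    return shutil.which("nvidia-smi")


path = nvidia_smi_path()
print(f"nvidia-smi: {path or 'not found (CPU-only or driver missing)'}")
```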

Embedding model download

If embedding models fail to download, you can pre-download them:
from sentence_transformers import SentenceTransformer

# Pre-download NV-Embed-v2
model = SentenceTransformer("nvidia/NV-Embed-v2")

Out of memory errors

For large datasets, reduce batch sizes:
config = BaseConfig(
    embedding_batch_size=4,  # Reduce from default 8
    retrieval_top_k=100,     # Reduce candidates
)
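A back-of-the-envelope estimate of the raw embedding footprint can tell you whether a dataset will fit before indexing. This sketch assumes float32 vectors (4 bytes per dimension) and uses 4096 dimensions as a default, which matches NV-Embed-v2's output size; it excludes index and runtime overhead.

```python
def embedding_memory_gb(num_chunks, dim=4096, bytes_per_float=4):
    """Estimate raw vector storage in GB (excludes index overhead)."""
    return num_chunks * dim * bytes_per_float / 1e9


# e.g. 1M chunks at 4096 dims is roughly 16.4 GB of raw vectors
print(f"{embedding_memory_gb(1_000_000):.1f} GB")
```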

Development setup

For development, install additional tools:
# Install development dependencies
pip install black ruff pytest

# Format code
black src/

# Lint code
ruff check src/
REMem uses:
  • Black for code formatting (line length: 120)
  • Ruff for linting (compatible with Python 3.10+)
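Those settings would typically live in pyproject.toml. A minimal sketch using the standard Black and Ruff configuration tables; whether REMem's own pyproject.toml includes these sections is an assumption.

```toml
[tool.black]
line-length = 120
target-version = ["py310"]

[tool.ruff]
line-length = 120
target-version = "py310"
```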

Next steps

  • Quickstart — Build your first REMem application
  • Configuration — Learn about advanced configuration options
  • Benchmarks — Run REMem on research benchmarks
  • Examples — Browse complete code examples
