Overview
The memory system provides:

- Fact Extraction: automatically extracts key facts from conversations using LLM analysis
- Persistent Storage: stores facts in JSON format with confidence scores and timestamps
- Context Injection: intelligently injects relevant facts into agent system prompts
- Debounced Updates: batches updates to reduce LLM calls and improve performance
Configuration
Memory is configured in the `memory` section of `config.yaml`:
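A sketch of the `memory` section. The key names below are inferred from the option descriptions that follow, and the values are illustrative, so check them against the actual schema:

```yaml
# config.yaml — memory section (key names and values are assumptions)
memory:
  enabled: true                    # enable extraction and injection globally
  storage_path: ""                 # "" → {DEER_FLOW_HOME}/memory.json
  debounce_seconds: 60             # quiet period before processing queued updates
  model: null                      # null → default model (first in models list)
  max_facts: 200                   # oldest facts are pruned beyond this limit
  fact_confidence_threshold: 0.7   # facts scored below this are discarded
  inject_into_context: true        # inject facts into agent system prompts
  max_injection_tokens: 2000       # cap on injected memory size
```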
Configuration Options
Whether to enable the memory system globally. Set to `false` to disable memory extraction and injection.

storage_path

Path to store memory data.

Path Resolution:
- Empty string (`""`) → `{DEER_FLOW_HOME}/memory.json` (default)
- Relative path → `{DEER_FLOW_HOME}/{storage_path}`
- Absolute path → Used as-is

`DEER_FLOW_HOME` is:
- the `DEER_FLOW_HOME` environment variable, or
- `.deer-flow/` in the backend directory (dev mode), or
- `~/.deer-flow/` (default)
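The resolution rules above can be sketched in Python. This is an illustrative reimplementation, not deer-flow's actual code, and the function name is invented:

```python
import os
from pathlib import Path
from typing import Optional

def resolve_storage_path(storage_path: str, backend_dir: Optional[Path] = None) -> Path:
    """Resolve the memory storage path per the rules above (illustrative sketch)."""
    # DEER_FLOW_HOME: env var, else .deer-flow/ in the backend dir (dev mode), else ~/.deer-flow/
    env_home = os.environ.get("DEER_FLOW_HOME")
    if env_home:
        home = Path(env_home)
    elif backend_dir is not None:
        home = backend_dir / ".deer-flow"
    else:
        home = Path.home() / ".deer-flow"

    if not storage_path:  # "" → default file under DEER_FLOW_HOME
        return home / "memory.json"
    path = Path(storage_path)
    # Relative paths resolve under DEER_FLOW_HOME; absolute paths are used as-is
    return path if path.is_absolute() else home / path
```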
Migration Note: If you previously set `storage_path: .deer-flow/memory.json`, it will now resolve to `{DEER_FLOW_HOME}/.deer-flow/memory.json`. Use an absolute path to preserve the old location.

debounce_seconds

Seconds to wait before processing queued memory updates.

How it works:
- Memory updates are queued during conversation
- After `debounce_seconds` of inactivity, updates are batched and processed
- This reduces LLM calls and API costs
- Lower values (10-30s) → More frequent updates, higher costs
- Higher values (60-300s) → Less frequent updates, lower costs
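The queue-then-batch behavior can be sketched with a standard debounce pattern. This is a generic illustration of the technique, not deer-flow's implementation:

```python
import threading

class DebouncedQueue:
    """Queue updates and flush them as one batch after a quiet period (illustrative sketch)."""

    def __init__(self, debounce_seconds: float, flush):
        self.debounce_seconds = debounce_seconds
        self.flush = flush        # called once with the whole batch
        self._pending = []
        self._timer = None
        self._lock = threading.Lock()

    def add(self, update):
        with self._lock:
            self._pending.append(update)
            if self._timer:       # new activity: restart the quiet-period timer
                self._timer.cancel()
            self._timer = threading.Timer(self.debounce_seconds, self._flush)
            self._timer.start()

    def _flush(self):
        with self._lock:
            batch, self._pending = self._pending, []
        if batch:
            self.flush(batch)     # one LLM call for the whole batch
```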
Model to use for memory extraction and updates.

- `null` → Uses the default model (first in the `models` list)
- Specify a model name → Uses that configured model

Use a lightweight model such as `gpt-4o-mini` for memory operations.

max_facts

Maximum number of facts to store in memory.

When the limit is reached:
- Oldest facts (by timestamp) are removed first
- Or lowest confidence facts if timestamps are equal
fact_confidence_threshold

Minimum confidence score (0.0-1.0) required to store a fact. Facts with confidence below this threshold are discarded.

Tuning:
- Higher values (0.8-1.0) → Only high-confidence facts stored
- Lower values (0.5-0.7) → More facts stored, potentially less accurate
Whether to inject memory facts into agent system prompts. Set to `false` to store facts without injecting them (passive mode).

max_injection_tokens

Maximum tokens to use for memory injection in system prompts. Facts are prioritized by confidence and recency, then truncated to fit this limit.

Tuning:
- Lower values (500-1000) → Only highest priority facts injected
- Higher values (2000-4000) → More comprehensive context
Storage Format
Memory is stored as JSON with the following structure:
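A sketch of `memory.json`. The per-fact fields are documented below; the top-level `facts` wrapper and the sample values are assumptions:

```json
{
  "facts": [
    {
      "content": "The user prefers Python for scripting.",
      "confidence": 0.95,
      "timestamp": "2024-05-01T12:34:56Z",
      "source": "conversation"
    }
  ]
}
```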
Fact Fields
- content: The extracted fact as a natural language statement
- confidence: Confidence score (0.0-1.0) assigned by the LLM
- timestamp: ISO 8601 timestamp when the fact was extracted
- source: Source of the fact (typically "conversation")
How Memory Works
Conversation Analysis
As the user interacts with the agent, conversation messages are analyzed for extractable facts.
Fact Extraction
The memory system uses an LLM to extract key facts:
- User preferences and habits
- Project information
- Technical context
- Personal details (when relevant)
Confidence Scoring
Each extracted fact is assigned a confidence score:
- 0.9-1.0: Explicit statements (“I prefer X”)
- 0.7-0.9: Strong inference (“I always use X”)
- 0.5-0.7: Weak inference (“I might use X”)
- Below 0.5: Discarded (below threshold)
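The scoring bands above feed directly into the threshold check; a minimal sketch, with an illustrative threshold and invented sample facts:

```python
def filter_facts(facts, threshold=0.7):
    """Keep only facts whose LLM-assigned confidence meets the threshold."""
    return [f for f in facts if f["confidence"] >= threshold]

facts = [
    {"content": "I prefer X", "confidence": 0.95},     # explicit statement
    {"content": "I might use X", "confidence": 0.45},  # weak inference: discarded
]
kept = filter_facts(facts)
```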
Debounced Storage
Facts are queued and stored after `debounce_seconds` of inactivity to batch updates.

Fact Pruning
If `max_facts` is exceeded:
- Sort facts by timestamp (oldest first)
- Remove oldest facts until within limit
- Optionally consider confidence scores
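The pruning steps above can be sketched as follows (an illustrative implementation, relying on ISO 8601 timestamps sorting lexicographically):

```python
def prune_facts(facts, max_facts):
    """Drop the oldest facts (lowest confidence first on timestamp ties) until within limit."""
    if len(facts) <= max_facts:
        return list(facts)
    # Sort so the facts to drop come first: oldest timestamps, then lowest confidence.
    # ISO 8601 timestamps compare correctly as plain strings.
    ordered = sorted(facts, key=lambda f: (f["timestamp"], f["confidence"]))
    return ordered[len(facts) - max_facts:]
```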
Memory Injection Format
When memory is injected into the agent’s system prompt, the stored facts are rendered into the prompt text. The exact injection format is determined by the agent’s prompt template.
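One plausible rendering, purely illustrative (the facts and wording here are invented; the template controls the real format):

```text
## Known facts about the user
- Prefers Python for scripting (confidence: 0.95)
- Works on the deer-flow project (confidence: 0.90)
```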
Configuration Examples
Minimal Memory (Cost-Optimized)
For minimal API usage:

Comprehensive Memory
For maximum context retention:

Memory Without Injection
Store facts but don’t inject them (for analysis only):

Custom Storage Location
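The variants above can be sketched as commented fragments. All key names are assumptions, and the values are chosen from the tuning ranges given earlier:

```yaml
# Minimal memory (cost-optimized)
memory:
  debounce_seconds: 300          # batch aggressively
  max_facts: 50
  fact_confidence_threshold: 0.9 # only high-confidence facts

# Comprehensive memory (maximum context retention)
# memory:
#   debounce_seconds: 30
#   max_facts: 500
#   fact_confidence_threshold: 0.5
#   max_injection_tokens: 4000

# Memory without injection (passive mode, for analysis only)
# memory:
#   inject_into_context: false   # assumed key name

# Custom storage location (absolute paths are used as-is)
# memory:
#   storage_path: /var/lib/deer-flow/memory.json
```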
Per-User Memory
For multi-tenant setups, use environment variables:

Programmatic Access
Access memory configuration in Python:

Update Configuration at Runtime
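A sketch of reading and updating the memory settings at runtime. This uses a plain dict stand-in; deer-flow's actual configuration API may differ, and the key names are assumptions:

```python
# Illustrative only: deer-flow's actual configuration API may differ.
config = {
    "memory": {
        "enabled": True,          # assumed key name
        "debounce_seconds": 60,   # assumed key name
        "max_facts": 200,         # assumed key name
    },
}

# Read the memory settings
memory_cfg = config["memory"]

# Update at runtime, e.g. tighten the debounce for an interactive session
memory_cfg["debounce_seconds"] = 30
```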
Best Practices
Use a Lightweight Model
Memory operations don’t need powerful models. Use a cost-effective model:
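For example, assuming the memory section accepts a `model` key (an assumption, as is the exact model identifier):

```yaml
memory:
  model: gpt-4o-mini   # cost-effective model for extraction and updates
```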
Tune Debounce for Your Use Case
- Interactive applications: 30-60 seconds
- Long-running tasks: 120-300 seconds
- Cost-sensitive: Higher values
Set Appropriate Fact Limits
- Personal assistant: 100-200 facts
- Project-specific agent: 200-500 facts
- Multi-user system: Separate memory files per user
Monitor Storage Size
Regularly check the memory file size. If it is too large, reduce `max_facts` or increase `fact_confidence_threshold`.

Backup Memory Data
Memory files contain valuable context. Back them up regularly:
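For example, assuming the default storage location described above:

```shell
MEMORY_FILE="${DEER_FLOW_HOME:-$HOME/.deer-flow}/memory.json"

if [ -f "$MEMORY_FILE" ]; then
  du -h "$MEMORY_FILE"                  # current size
  cp "$MEMORY_FILE" "$MEMORY_FILE.bak"  # simple same-directory backup
else
  echo "no memory file at $MEMORY_FILE"
fi
```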
Memory Lifecycle
Troubleshooting
Memory not persisting
Check storage path and permissions:
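For example, assuming the default resolution rules described earlier:

```shell
MEMORY_FILE="${DEER_FLOW_HOME:-$HOME/.deer-flow}/memory.json"

# Does the file exist, and is its directory writable by the current user?
ls -l "$MEMORY_FILE" 2>/dev/null || echo "missing: $MEMORY_FILE"
if [ -w "$(dirname "$MEMORY_FILE")" ]; then
  echo "directory is writable"
else
  echo "directory is missing or not writable"
fi
```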
Too many/few facts extracted
Adjust `fact_confidence_threshold`: raise it (toward 0.8-1.0) if too many low-quality facts are extracted, or lower it (toward 0.5-0.7) if too few facts are stored.

Memory updates too frequent/infrequent
Tune `debounce_seconds`: lower it for more frequent updates, raise it to batch updates less often.

Context injection too large
Reduce `max_injection_tokens` so that only the highest-priority facts are injected.

Next Steps
Environment Variables
Configure environment variables
Agent Customization
Customize agent behavior