Episodes - Graphiti

What are Episodes?

Episodes are the fundamental units of information in Graphiti. An episode is any piece of input data—structured or unstructured—that you want to integrate into your knowledge graph. Think of episodes as the raw material that Graphiti processes to extract entities and relationships.

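from datetime import datetime, timezone

from graphiti_core.nodes import EpisodeType

# Assumes `graphiti` is an initialized Graphiti client and this runs inside an async function.
await graphiti.add_episode(
    name="User Message",
    episode_body="Alice started working at Acme Corp as a Senior Engineer.",
    source=EpisodeType.text,
    source_description="chat message",
    reference_time=datetime.now(timezone.utc),
)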

When you add an episode, Graphiti:

Extracts entities (Alice, Acme Corp, Senior Engineer)
Identifies relationships (Alice —[works at]—> Acme Corp)
Creates an episodic node to preserve the original context
Links entities to the episode via MENTIONS edges
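
As a rough mental model of the four steps above, the sketch below builds the resulting structure by hand with plain dataclasses. It does not call Graphiti: `ToyGraph`, `ingest`, and their field names are hypothetical stand-ins, and the entities and the relationship are hard-coded where Graphiti would rely on an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical in-memory stand-in for the graph; not a Graphiti class."""
    episodes: dict[str, str] = field(default_factory=dict)   # episode name -> raw body
    entities: set[str] = field(default_factory=set)          # entity names
    relations: list[tuple[str, str, str]] = field(default_factory=list)  # (source, label, target)
    mentions: list[tuple[str, str]] = field(default_factory=list)        # (episode name, entity)


def ingest(graph, name, body, entities, relations):
    """Record an episode together with pre-extracted entities and relations."""
    graph.episodes[name] = body                          # episodic node keeps the original context
    graph.entities.update(entities)                      # entity nodes
    graph.relations.extend(relations)                    # entity-to-entity edges
    graph.mentions.extend((name, e) for e in entities)   # MENTIONS edges


graph = ToyGraph()
ingest(
    graph,
    name="User Message",
    body="Alice started working at Acme Corp as a Senior Engineer.",
    entities=["Alice", "Acme Corp", "Senior Engineer"],
    relations=[("Alice", "works at", "Acme Corp")],
)
```

After `ingest`, `graph.mentions` holds one (episode, entity) pair per extracted entity, which is the provenance link that MENTIONS edges provide.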

Episode Types

Graphiti supports three episode types defined in EpisodeType:

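from enum import Enum

# Shown for reference: this mirrors the definition in graphiti_core.nodes.
# In application code, import it instead of redefining it:
#   from graphiti_core.nodes import EpisodeType
class EpisodeType(Enum):
    message = 'message'  # Chat messages, conversations
    json = 'json'        # Structured data
    text = 'text'        # Plain text documents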

EpisodeType.message

For conversational data with actor-content format:

await graphiti.add_episode(
    name="Customer Support Chat",
    episode_body="user: My order hasn't arrived yet.\nassistant: I'll check on that for you.",
    source=EpisodeType.message,
    source_description="support ticket #1234",
    reference_time=datetime.now(timezone.utc),
)

For EpisodeType.message, format content as "actor: content" on each line. Example: "user: Hello\nassistant: Hi there"
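
If your conversation is stored as structured turns, a small helper can produce that format. This is a minimal sketch; `format_message_body` is a hypothetical helper, not part of Graphiti, and it assumes each turn is an (actor, content) pair.

```python
def format_message_body(turns):
    """Join (actor, content) pairs into one "actor: content" line per turn."""
    lines = []
    for actor, content in turns:
        # Collapse whitespace runs (including newlines) so each turn stays on one line.
        flat = " ".join(content.split())
        lines.append(f"{actor}: {flat}")
    return "\n".join(lines)


body = format_message_body([
    ("user", "My order hasn't arrived yet."),
    ("assistant", "I'll check on that for you."),
])
```

The resulting `body` string can be passed as `episode_body` with `source=EpisodeType.message`.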

EpisodeType.json

For structured data objects:

import json

data = {
    "employee_name": "Alice Chen",
    "department": "Engineering",
    "start_date": "2024-01-15",
    "role": "Senior Software Engineer"
}

await graphiti.add_episode(
    name="HR System Update",
    episode_body=json.dumps(data),
    source=EpisodeType.json,
    source_description="HRIS database",
    reference_time=datetime(2024, 1, 15, tzinfo=timezone.utc),
)

Graphiti’s LLM extracts entities and relationships from the structured fields.

EpisodeType.text

For unstructured text documents:

await graphiti.add_episode(
    name="Company Blog Post",
    # Adjacent string literals are concatenated, so no indentation leaks into the body.
    episode_body=(
        "We're excited to announce our new product launch. "
        "The team, led by Sarah Johnson, has been working on this for 18 months. "
        "The product will be available in Q2 2024."
    ),
    source=EpisodeType.text,
    source_description="company blog",
    reference_time=datetime(2024, 3, 1, tzinfo=timezone.utc),
)

EpisodicNode Schema

When you add an episode, Graphiti creates an EpisodicNode in the graph:

class EpisodicNode(Node):
    name: str                 # Episode identifier
    content: str              # Raw episode data
    source: EpisodeType       # Type of episode
    source_description: str   # Description of data source
    valid_at: datetime        # When the content occurred
    created_at: datetime      # When ingested into Graphiti
    entity_edges: list[str]   # UUIDs of extracted EntityEdges
    group_id: str             # Partition identifier

Key Fields

name: A human-readable identifier for the episode

name="Customer Conversation 2024-03-15"

content: The raw episode body (can be empty if store_raw_episode_content=False)

content="user: What's the status of my order?\nassistant: Let me check that."

source: The episode type

source=EpisodeType.message

source_description: Metadata about where the episode came from

source_description="support_chat_session_789"

valid_at: When the episode content was created or occurred in the real world

valid_at=datetime(2024, 3, 15, 10, 30, tzinfo=timezone.utc)

created_at: When the episode was ingested into Graphiti

created_at=datetime(2024, 3, 20, 14, 45, tzinfo=timezone.utc)

entity_edges: List of EntityEdge UUIDs extracted from this episode

entity_edges=["uuid-123", "uuid-456", "uuid-789"]

The Episode Processing Pipeline

When you call add_episode(), Graphiti executes a multi-step pipeline:

1. Episode Creation

episode = EpisodicNode(
    name=name,
    content=episode_body,
    source=source,
    source_description=source_description,
    group_id=group_id,
    valid_at=reference_time,    # From your input
    created_at=utc_now(),       # Current time
)

2. Context Retrieval

Graphiti retrieves recent episodes for context:

# From graphiti_core/graphiti.py:895
previous_episodes = await self.retrieve_episodes(
    reference_time,
    last_n=RELEVANT_SCHEMA_LIMIT,  # Maximum number of recent episodes to fetch
    group_ids=[group_id],
    source=source,
)

Previous episodes provide context for entity resolution. If “Alice” was mentioned in a prior episode, Graphiti recognizes it’s the same person.
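
The idea of a context window can be sketched in memory. The code below is an illustration only: Graphiti runs this as a database query, and the rule used here (same group, dated at or before the reference time, most recent `last_n`, returned oldest first) is an assumption about its behavior. `last_n_before` and the dictionary layout are hypothetical.

```python
from datetime import datetime, timezone


def last_n_before(episodes, reference_time, last_n, group_id):
    """Return up to last_n episodes from group_id dated at or before reference_time, oldest first."""
    eligible = [
        ep for ep in episodes
        if ep["group_id"] == group_id and ep["valid_at"] <= reference_time
    ]
    eligible.sort(key=lambda ep: ep["valid_at"])
    return eligible[-last_n:] if last_n > 0 else []


episodes = [
    {"name": "e1", "group_id": "g", "valid_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"name": "e2", "group_id": "g", "valid_at": datetime(2024, 2, 1, tzinfo=timezone.utc)},
    {"name": "e3", "group_id": "other", "valid_at": datetime(2024, 2, 15, tzinfo=timezone.utc)},
    {"name": "e4", "group_id": "g", "valid_at": datetime(2024, 4, 1, tzinfo=timezone.utc)},
]
context = last_n_before(
    episodes, datetime(2024, 3, 1, tzinfo=timezone.utc), last_n=2, group_id="g"
)
```

Here `e3` is skipped because it belongs to another group, and `e4` because it is dated after the reference time.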

3. Entity Extraction

An LLM analyzes the episode to extract entities:

extracted_nodes = await extract_nodes(
    self.clients,
    episode,
    previous_episodes,
    entity_types,       # Optional custom types
    excluded_entity_types,
    custom_extraction_instructions,
)

From the episode “Alice started working at Acme Corp”, this extracts:

EntityNode(name="Alice", labels=["Person"])
EntityNode(name="Acme Corp", labels=["Organization"])

4. Node Deduplication

Resolves extracted nodes against existing graph entities:

nodes, uuid_map, duplicates = await resolve_extracted_nodes(
    self.clients,
    extracted_nodes,
    episode,
    previous_episodes,
    entity_types,
)

If “Alice” already exists in the graph, uuid_map maps the new extraction to the existing UUID.
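
To make the role of uuid_map concrete, here is a deliberately simplified sketch that resolves entities by exact, case-insensitive name match. Graphiti's actual resolution is more involved and is not reproduced here; `resolve_by_name` and the `existing` lookup table are hypothetical.

```python
import uuid


def resolve_by_name(extracted_names, existing):
    """Map each extracted entity to an existing UUID when the normalized name matches.

    `existing` maps normalized name -> UUID and is updated in place with new entities.
    Returns (resolved, uuid_map), where uuid_map maps each provisional UUID assigned
    at extraction time to the UUID that should be used in the graph.
    """
    resolved, uuid_map = {}, {}
    for name in extracted_names:
        provisional = str(uuid.uuid4())             # UUID assigned at extraction time
        key = name.strip().lower()
        canonical = existing.get(key, provisional)  # reuse the existing node if there is one
        existing.setdefault(key, canonical)
        uuid_map[provisional] = canonical
        resolved[name] = canonical
    return resolved, uuid_map


existing = {"alice": "uuid-alice-001"}
resolved, uuid_map = resolve_by_name(["Alice", "Acme Corp"], existing)
```

"Alice" resolves to the pre-existing UUID, while "Acme Corp" keeps its provisional UUID and is added to `existing`.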

5. Edge Extraction

An LLM identifies relationships between entities:

extracted_edges = await extract_edges(
    self.clients,
    episode,
    extracted_nodes,
    previous_episodes,
    edge_type_map,
    group_id,
    edge_types,
    custom_extraction_instructions,
)

Extracts:

EntityEdge(
    source_node_uuid=alice_uuid,
    target_node_uuid=acme_uuid,
    name="works_at",
    fact="Alice started working at Acme Corp as a Senior Engineer",
)

6. Edge Resolution and Invalidation

Checks for duplicate or contradictory edges:

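# Assumption: `edges` are the extracted edges after their endpoints
# have been remapped to resolved node UUIDs via uuid_map.
resolved_edges, invalidated_edges, new_edges = await resolve_extracted_edges(
    self.clients,
    edges,
    episode,
    nodes,
    edge_types,
    edge_type_map,
)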

If Alice previously worked somewhere else, the old edge is temporally invalidated.
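
Temporal invalidation can be illustrated with a toy rule: when a new edge has the same source and relationship name as a still-valid edge but a different target, the old edge is closed at the new edge's valid_at. This is a sketch under that assumption, not Graphiti's contradiction logic; `ToyEdge`, `invalidate_contradicted`, and the employer "Initech" are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ToyEdge:
    """Hypothetical edge record; field names are illustrative."""
    source: str
    name: str
    target: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None


def invalidate_contradicted(existing, new_edge):
    """Close any still-valid edge with the same source and name but a different target."""
    invalidated = []
    for edge in existing:
        same_slot = edge.source == new_edge.source and edge.name == new_edge.name
        if same_slot and edge.target != new_edge.target and edge.invalid_at is None:
            edge.invalid_at = new_edge.valid_at  # the old fact stops holding when the new one starts
            invalidated.append(edge)
    return invalidated


old = ToyEdge("Alice", "works_at", "Initech", datetime(2021, 6, 1, tzinfo=timezone.utc))
new = ToyEdge("Alice", "works_at", "Acme Corp", datetime(2024, 1, 15, tzinfo=timezone.utc))
invalidated = invalidate_contradicted([old], new)
```

The old edge is kept rather than deleted, so the graph can still answer where Alice worked before 2024-01-15.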

7. Attribute Extraction

Enriches entity nodes with summaries:

hydrated_nodes = await extract_attributes_from_nodes(
    self.clients,
    nodes,
    episode,
    previous_episodes,
    entity_types,
    edges=new_edges,
)

Adds a summary field to each EntityNode based on its edges.

8. Graph Persistence

Saves everything to the graph database:

await add_nodes_and_edges_bulk(
    self.driver,
    [episode],           # Episodic nodes
    episodic_edges,      # MENTIONS edges
    hydrated_nodes,      # Entity nodes
    entity_edges,        # RELATES_TO edges
    self.embedder,
)

Episodic Edges (MENTIONS)

Episodic edges connect episodes to the entities they mention:

class EpisodicEdge(Edge):
    source_node_uuid: str  # EpisodicNode UUID
    target_node_uuid: str  # EntityNode UUID
    created_at: datetime

These edges preserve provenance—you can always trace back which episodes mentioned which entities.

Example Query

# Find all episodes that mention Alice
episodes = await EpisodicNode.get_by_entity_node_uuid(
    driver,
    entity_node_uuid=alice.uuid
)

Reference Time vs Created Time

Understanding the temporal distinction is crucial:

Field	What It Represents	Example
`reference_time`	When the episode content occurred	Email sent on Jan 15, 2024
`valid_at`	Same as `reference_time` (stored in node)	2024-01-15
`created_at`	When Graphiti ingested the episode	Processed on Mar 20, 2024

# You find an old email from 2020
await graphiti.add_episode(
    name="Old Email",
    episode_body="Meeting notes from project kickoff.",
    source=EpisodeType.text,
    source_description="email archive",
    reference_time=datetime(2020, 3, 15, tzinfo=timezone.utc),  # Email date
    # created_at is set at ingestion time (e.g. 2024-03-20), not from reference_time
)

Graphiti uses reference_time to set the valid_at timestamp on extracted edges, ensuring facts are temporally grounded to when they occurred, not when they were processed.
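
For email sources, one way to obtain an accurate reference_time is to parse the message's Date header with the standard library. The helper below is a sketch; `reference_time_from_header` is a hypothetical name, and treating zone-less results as UTC is an assumption you may want to change.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def reference_time_from_header(date_header):
    """Parse an RFC 2822 Date header and normalize it to UTC."""
    sent_at = parsedate_to_datetime(date_header)
    if sent_at.tzinfo is None:
        # Headers with a "-0000" zone parse as naive; treat them as UTC.
        sent_at = sent_at.replace(tzinfo=timezone.utc)
    return sent_at.astimezone(timezone.utc)


reference_time = reference_time_from_header("Sun, 15 Mar 2020 09:30:00 -0500")
created_at = datetime.now(timezone.utc)  # ingestion time, independent of the email date
```

The two timestamps stay separate: `reference_time` describes when the email was sent, and `created_at` describes when it was processed.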

Episode Retrieval

Graphiti provides methods to retrieve episodes:

Get Recent Episodes

episodes = await graphiti.retrieve_episodes(
    reference_time=datetime.now(timezone.utc),
    last_n=10,
    group_ids=["user_123"],
    source=EpisodeType.message,
)

Get by UUID

episode = await EpisodicNode.get_by_uuid(
    driver=graphiti.driver,
    uuid="episode-uuid-here"
)

Get by Group IDs

episodes = await EpisodicNode.get_by_group_ids(
    driver=graphiti.driver,
    group_ids=["group1", "group2"],
    limit=50
)

Bulk Episode Processing

For efficient batch ingestion, use add_episode_bulk():

from graphiti_core.utils.bulk_utils import RawEpisode

bulk_episodes = [
    RawEpisode(
        name=f"Email {i}",
        content=text,
        source=EpisodeType.text,
        source_description="email archive",
        reference_time=sent_at,
    )
    # email_texts and email_dates are parallel lists of the same length
    for i, (text, sent_at) in enumerate(zip(email_texts, email_dates))
]

result = await graphiti.add_episode_bulk(
    bulk_episodes=bulk_episodes,
    group_id="email_archive",
)

add_episode_bulk() processes multiple episodes in parallel for better performance but does not perform edge invalidation. Use add_episode() for incremental updates with contradiction handling.

Sagas: Organizing Episode Sequences

Sagas group related episodes into a narrative sequence:

# Add episodes to a saga, one message at a time and in order
for i, message in enumerate(conversation_messages):
    result = await graphiti.add_episode(
        name=f"Message {i}",
        episode_body=message,
        source=EpisodeType.message,
        source_description="customer support conversation",
        reference_time=message_timestamps[i],
        saga="customer_support_conversation_123",  # Saga name
    )

Graphiti automatically:

Creates a SagaNode (or reuses existing)
Creates HAS_EPISODE edges from saga to episodes
Creates NEXT_EPISODE edges to chain episodes in order
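
The edge structure described above can be sketched as plain tuples. This only illustrates the shape of the result; `saga_edges` is a hypothetical helper, and it assumes the episodes are supplied in the order they should be chained.

```python
def saga_edges(saga_name, episode_names):
    """Build HAS_EPISODE and NEXT_EPISODE edge tuples for episodes given in order."""
    has_episode = [(saga_name, "HAS_EPISODE", ep) for ep in episode_names]
    # Pair each episode with its successor: (e0, e1), (e1, e2), ...
    next_episode = [
        (prev, "NEXT_EPISODE", nxt)
        for prev, nxt in zip(episode_names, episode_names[1:])
    ]
    return has_episode, next_episode


has_episode, next_episode = saga_edges(
    "customer_support_conversation_123",
    ["Message 0", "Message 1", "Message 2"],
)
```

A saga with n episodes yields n HAS_EPISODE edges and n - 1 NEXT_EPISODE edges.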

Querying Saga Episodes

# Retrieve episodes from a specific saga
episodes = await graphiti.retrieve_episodes(
    reference_time=datetime.now(timezone.utc),
    saga="customer_support_conversation_123"
)
# Returns episodes in chronological order

Custom Extraction Instructions

You can guide the LLM’s extraction process:

await graphiti.add_episode(
    name="Technical Doc",
    episode_body=documentation_text,
    source=EpisodeType.text,
    source_description="technical documentation",
    reference_time=datetime.now(timezone.utc),
    custom_extraction_instructions="""
        Focus on extracting:
        - Software components and their versions
        - Dependencies between components
        - Configuration parameters
    """
)

Storing Raw Content

Control whether to preserve episode content:

# Store raw content (default)
graphiti = Graphiti(
    uri, user, password,
    store_raw_episode_content=True  # Episodes keep .content field
)

# Discard raw content after processing
graphiti = Graphiti(
    uri, user, password,
    store_raw_episode_content=False  # Episodes have .content = ""
)

Set store_raw_episode_content=False to save database storage when you don’t need to retrieve the original episode text later.

Episode Best Practices

1. Use Descriptive Names

# Good: Descriptive and unique
name="Customer_Support_Chat_2024-03-15_14:30_Ticket_5678"

# Bad: Generic
name="episode"

2. Set Accurate Reference Times

# Good: Use the actual event time
reference_time=email_metadata['sent_date']

# Bad: Using current time for historical data
reference_time=datetime.now(timezone.utc)  # Wrong for old data!

3. Include Source Context

source_description="[email protected]"

4. Choose the Right Episode Type

# For chat messages
source=EpisodeType.message

# For API responses, database records
source=EpisodeType.json

# For documents, articles, emails
source=EpisodeType.text

5. Use Group IDs for Multi-Tenancy

# Separate graphs per user
group_id=f"user_{user_id}"

# Or per organization
group_id=f"org_{org_id}"


