Graphiti allows you to define custom entity types and relationships to extract domain-specific information from your content. By default, Graphiti extracts generic Entity nodes, but you can define structured types with specific attributes.

Why Custom Entities?

Custom entity definitions enable:
  • Structured Extraction - Extract specific attributes (e.g., first_name, occupation)
  • Type Safety - Ensure entities have required fields
  • Domain Modeling - Model your specific domain (e.g., Products, Locations, Events)
  • Better Search - Filter and query by entity type and attributes
  • Schema Enforcement - Validate extracted data against your schema

Defining Entity Types

Entity types are defined using Pydantic BaseModel classes:
from pydantic import BaseModel, Field

class Person(BaseModel):
    """A human person, fictional or nonfictional."""
    
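    # default=None makes each field truly optional; without it Pydantic v2 treats the field as required
    first_name: str | None = Field(default=None, description="First name")
    last_name: str | None = Field(default=None, description="Last name")
    occupation: str | None = Field(default=None, description="The person's work occupation")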

class City(BaseModel):
    """A city or town."""
    
    country: str | None = Field(default=None, description="The country the city is in")
Entity type names (class names) become labels in the graph. The class docstring helps the LLM understand when to use this type.

Using Custom Entities

Pass your entity types when adding episodes:
from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType
from datetime import datetime, timezone

graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password"
)

result = await graphiti.add_episode(
    name="Team Introduction",
    episode_body="Sarah Johnson is a software engineer. She works in San Francisco.",
    source=EpisodeType.text,
    source_description="Team directory",
    reference_time=datetime.now(timezone.utc),
    entity_types={"Person": Person, "City": City}
)

# Extracted entities will have structured attributes
for node in result.nodes:
    print(f"Entity: {node.name}")
    print(f"Type: {node.labels}")
    print(f"Attributes: {node.attributes}")
    # Example output for the Person node (actual values depend on the LLM):
    # {'first_name': 'Sarah', 'last_name': 'Johnson', 'occupation': 'software engineer'}

Entity Type Dictionary

The entity_types parameter is a dictionary mapping type names to Pydantic models:
# Company and Product are Pydantic models defined the same way as Person and City
entity_types = {
    "Person": Person,
    "City": City,
    "Company": Company,
    "Product": Product
}

await graphiti.add_episode(
    name="Episode",
    episode_body="...",
    source=EpisodeType.text,
    source_description="...",
    reference_time=datetime.now(timezone.utc),
    entity_types=entity_types
)
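
One way to keep dictionary keys and class names from drifting apart is to derive the dictionary from the classes themselves. The sketch below uses plain stand-in classes so that it runs without Graphiti or Pydantic installed; build_entity_types is a hypothetical helper, not part of the library.

```python
# Stand-in classes; in real code these would be Pydantic BaseModel subclasses.
class Person:
    """A human person."""


class City:
    """A city or town."""


def build_entity_types(*models: type) -> dict[str, type]:
    """Map each class name to its class, rejecting duplicate names."""
    mapping: dict[str, type] = {}
    for model in models:
        if model.__name__ in mapping:
            raise ValueError(f"duplicate entity type name: {model.__name__}")
        mapping[model.__name__] = model
    return mapping


entity_types = build_entity_types(Person, City)
print(sorted(entity_types))
```

The resulting dictionary has the same shape as the hand-written one above and could be passed as entity_types.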

Defining Relationship Types

Define custom edge types to model specific relationships:
class IsPresidentOf(BaseModel):
    """Relationship between a person and the entity they are president of."""
    pass

class InterpersonalRelationship(BaseModel):
    """A relationship between two people (e.g., knows, works with, interviewed)."""
    pass

class LocatedIn(BaseModel):
    """A relationship indicating something is located in or associated with a place."""
    pass
Relationship types can be empty classes. The class name and docstring guide the LLM on when to use each type.
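
The edge type dictionaries on this page key each relationship class by an UPPER_SNAKE_CASE name (for example, IsPresidentOf is registered as IS_PRESIDENT_OF). As a minimal sketch, assuming you want to follow that convention mechanically, a hypothetical edge_key helper can derive the key from the class name:

```python
import re


def edge_key(class_name: str) -> str:
    """Convert a CamelCase class name to an UPPER_SNAKE_CASE key."""
    # Insert an underscore before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).upper()


for name in ("IsPresidentOf", "InterpersonalRelationship", "LocatedIn"):
    print(name, "->", edge_key(name))
```

This simple rule splits on every capital letter, so a class name containing an acronym would need a hand-written key instead.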

Edge Type Mapping

Define which relationships can exist between entity types using an edge type map:
edge_types = {
    "IS_PRESIDENT_OF": IsPresidentOf,
    "INTERPERSONAL_RELATIONSHIP": InterpersonalRelationship,
    "LOCATED_IN": LocatedIn
}

edge_type_map = {
    ("Person", "Entity"): ["IS_PRESIDENT_OF", "INTERPERSONAL_RELATIONSHIP"],
    ("Person", "Person"): ["INTERPERSONAL_RELATIONSHIP"],
    ("Person", "City"): ["LOCATED_IN"],
    ("Entity", "City"): ["LOCATED_IN"]
}

await graphiti.add_episode(
    name="Political Episode",
    episode_body="President Biden works closely with Vice President Harris.",
    source=EpisodeType.text,
    source_description="News article",
    reference_time=datetime.now(timezone.utc),
    entity_types={"Person": Person, "City": City},
    edge_types=edge_types,
    edge_type_map=edge_type_map
)
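
Because the edge type map refers to entity and edge types by string, a typo can go unnoticed. The following sketch is a hypothetical pre-flight check (check_edge_type_map is not a Graphiti function) that reports names in the map that are not declared anywhere. It assumes "Entity" is always available as the default type, as described above.

```python
def check_edge_type_map(entity_names, edge_names, edge_type_map):
    """Return a list of problems found in an edge type map (empty if none)."""
    known_nodes = set(entity_names) | {"Entity"}  # "Entity" is the default type
    known_edges = set(edge_names)
    problems = []
    for (source, target), allowed in edge_type_map.items():
        for node in (source, target):
            if node not in known_nodes:
                problems.append(f"unknown entity type {node!r} in {(source, target)}")
        for edge in allowed:
            if edge not in known_edges:
                problems.append(f"unknown edge type {edge!r} in {(source, target)}")
    return problems


edge_type_map = {
    ("Person", "Entity"): ["IS_PRESIDENT_OF", "INTERPERSONAL_RELATIONSHIP"],
    ("Person", "City"): ["LOCATED_IN"],
    ("Person", "Company"): ["WORKS_FOR"],  # deliberately inconsistent entry
}
problems = check_edge_type_map(
    entity_names=["Person", "City"],
    edge_names=["IS_PRESIDENT_OF", "INTERPERSONAL_RELATIONSHIP", "LOCATED_IN"],
    edge_type_map=edge_type_map,
)
for problem in problems:
    print(problem)
```

Running a check like this before calling add_episode surfaces mismatches between the three dictionaries early.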

Complete Example

Here’s a full example with custom entities and relationships:
from pydantic import BaseModel, Field
from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType
from datetime import datetime, timezone

# Define entity types
class Person(BaseModel):
    """A human person."""
    first_name: str | None = Field(default=None, description="First name")
    last_name: str | None = Field(default=None, description="Last name")
    occupation: str | None = Field(default=None, description="Work occupation")

class City(BaseModel):
    """A city."""
    country: str | None = Field(default=None, description="The country the city is in")

class Company(BaseModel):
    """A business organization."""
    industry: str | None = Field(default=None, description="Industry sector")
    founded_year: int | None = Field(default=None, description="Year founded")

# Define relationship types
class WorksFor(BaseModel):
    """Employment relationship."""
    pass

class LocatedIn(BaseModel):
    """Location relationship."""
    pass

# Create mappings
entity_types = {
    "Person": Person,
    "City": City,
    "Company": Company
}

edge_types = {
    "WORKS_FOR": WorksFor,
    "LOCATED_IN": LocatedIn
}

edge_type_map = {
    ("Person", "Company"): ["WORKS_FOR"],
    ("Company", "City"): ["LOCATED_IN"],
    ("Person", "City"): ["LOCATED_IN"]
}

# Use in an episode. add_episode is a coroutine, so it runs inside an event loop.
import asyncio

async def main():
    graphiti = Graphiti(
        uri="bolt://localhost:7687",
        user="neo4j",
        password="password"
    )

    try:
        result = await graphiti.add_episode(
            name="Company Profile",
            episode_body="""
            Alice Chen is a software engineer at TechCorp.
            TechCorp is a technology company founded in 2015.
            TechCorp is headquartered in San Francisco, California.
            """,
            source=EpisodeType.text,
            source_description="Company directory",
            reference_time=datetime.now(timezone.utc),
            entity_types=entity_types,
            edge_types=edge_types,
            edge_type_map=edge_type_map
        )

        # Examine results
        for node in result.nodes:
            print(f"\nEntity: {node.name}")
            print(f"Labels: {node.labels}")
            print(f"Attributes: {node.attributes}")

        for edge in result.edges:
            print(f"\nRelationship: {edge.name}")
            print(f"Fact: {edge.fact}")
    finally:
        # Release the database connection
        await graphiti.close()

asyncio.run(main())

Excluding Entity Types

Exclude certain entity types from extraction:
await graphiti.add_episode(
    name="Example",
    episode_body="Content with various entities",
    source=EpisodeType.text,
    source_description="Source",
    reference_time=datetime.now(timezone.utc),
    entity_types={"Person": Person, "City": City},
    excluded_entity_types=["City"]  # Don't extract cities
)
To exclude the default Entity type:
excluded_entity_types=["Entity"]  # Only extract custom types
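
As a mental model only (this is not Graphiti's implementation), exclusion can be pictured as set subtraction over the custom types plus the default Entity type. The helper name eligible_types is hypothetical.

```python
def eligible_types(entity_types, excluded_entity_types=None):
    """Return the set of type names still eligible after exclusions."""
    custom = set(entity_types)
    excluded = set(excluded_entity_types or [])
    # Excluding a name that was never declared is most likely a typo.
    unknown = excluded - custom - {"Entity"}
    if unknown:
        raise ValueError(f"cannot exclude unknown types: {sorted(unknown)}")
    return (custom | {"Entity"}) - excluded


print(eligible_types(["Person", "City"], ["City"]))
print(eligible_types(["Person", "City"], ["Entity"]))
```

The first call leaves Person and the generic Entity type; the second leaves only the custom types.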

Field Descriptions

Provide clear field descriptions to guide extraction:
class Product(BaseModel):
    """A commercial product."""
    
    # Called product_name because every node already carries its own name
    product_name: str | None = Field(
        default=None,
        description="The product's commercial name or model number"
    )
    category: str | None = Field(
        default=None,
        description="Product category (e.g., electronics, clothing, food)"
    )
    price: float | None = Field(
        default=None,
        description="Price in USD"
    )
    manufacturer: str | None = Field(
        default=None,
        description="Name of the company that manufactures this product"
    )
Good descriptions improve extraction accuracy. Be specific about format, units, and expected values.

Optional vs Required Fields

All fields should be optional, meaning typed as | None and given a default of None, because the LLM may not extract every attribute:
class Person(BaseModel):
    # Good: Optional fields with defaults
    first_name: str | None = Field(default=None, description="First name")
    last_name: str | None = Field(default=None, description="Last name")
    
    # Avoid: Required fields
    # first_name: str = Field(description="First name")  # May fail extraction

Attribute Access

Extracted attributes are stored in the attributes dict:
for node in result.nodes:
    if "Person" in node.labels:
        first_name = node.attributes.get("first_name")
        last_name = node.attributes.get("last_name")
        occupation = node.attributes.get("occupation")
        
        print(f"{first_name} {last_name} - {occupation}")
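
Since any attribute may be missing or None, formatting code benefits from a fallback. This is a minimal sketch using SimpleNamespace stand-ins in place of real result nodes; describe_person is a hypothetical helper and the sample data is invented for illustration.

```python
from types import SimpleNamespace


def describe_person(node) -> str:
    """Build a display string, falling back to node.name when parts are missing."""
    attrs = node.attributes
    full_name = " ".join(
        part for part in (attrs.get("first_name"), attrs.get("last_name")) if part
    )
    label = full_name or node.name
    occupation = attrs.get("occupation")
    return f"{label} - {occupation}" if occupation else label


# Stand-ins for extracted nodes (illustrative data only).
nodes = [
    SimpleNamespace(
        name="Sarah Johnson",
        labels=["Entity", "Person"],
        attributes={"first_name": "Sarah", "last_name": "Johnson",
                    "occupation": "software engineer"},
    ),
    SimpleNamespace(name="Bob", labels=["Entity", "Person"],
                    attributes={"first_name": None}),
    SimpleNamespace(name="San Francisco", labels=["Entity", "City"], attributes={}),
]

people = [describe_person(n) for n in nodes if "Person" in n.labels]
print(people)
```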

Bulk Operations with Custom Types

Custom types work with bulk episode ingestion:
from graphiti_core.utils.bulk_utils import RawEpisode

episodes = [
    RawEpisode(
        name="Episode 1",
        content="Alice is an engineer at Google.",
        source=EpisodeType.text,
        source_description="Bio",
        reference_time=datetime.now(timezone.utc)
    ),
    RawEpisode(
        name="Episode 2",
        content="Bob is a designer at Apple.",
        source=EpisodeType.text,
        source_description="Bio",
        reference_time=datetime.now(timezone.utc)
    )
]

result = await graphiti.add_episode_bulk(
    bulk_episodes=episodes,
    entity_types={"Person": Person, "Company": Company},
    edge_types=edge_types,
    edge_type_map=edge_type_map
)
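
When episodes come from tabular data, it can help to build the keyword arguments in one place. The sketch below is a hypothetical helper (episode_payloads is not part of Graphiti) that produces plain dictionaries using the field names from the example above. It leaves out source so that it runs without Graphiti installed; in real code you would add source=EpisodeType.text before constructing each RawEpisode.

```python
from datetime import datetime, timezone


def episode_payloads(rows, source_description, reference_time=None):
    """Turn (name, content) rows into keyword dicts for episode construction."""
    # Use one shared, timezone-aware timestamp for the whole batch.
    reference_time = reference_time or datetime.now(timezone.utc)
    return [
        {
            "name": name,
            "content": content,
            "source_description": source_description,
            "reference_time": reference_time,
        }
        for name, content in rows
    ]


rows = [
    ("Episode 1", "Alice is an engineer at Google."),
    ("Episode 2", "Bob is a designer at Apple."),
]
payloads = episode_payloads(rows, source_description="Bio")
for payload in payloads:
    print(payload["name"], "|", payload["content"])
```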

Best Practices

  • Clear Docstrings - Write descriptive docstrings to help the LLM identify when to use each type.
  • Specific Descriptions - Provide detailed field descriptions, including format and units.
  • Optional Fields - Make all fields optional to handle incomplete extractions.
  • Consistent Naming - Use consistent naming conventions for types and fields.

Common Patterns

Temporal Entities

class Event(BaseModel):
    """A notable event or occurrence."""
    event_type: str | None = Field(default=None, description="Type of event (meeting, conference, etc.)")
    start_date: str | None = Field(default=None, description="When the event started (ISO format)")
    end_date: str | None = Field(default=None, description="When the event ended (ISO format)")
    location: str | None = Field(default=None, description="Where the event took place")
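
Because the dates are stored as strings, downstream code has to parse them and cope with values that are missing or not in ISO format. A minimal sketch, with hypothetical helper names and plain dictionaries standing in for node attributes:

```python
from datetime import datetime, timedelta


def parse_iso(value):
    """Parse an ISO 8601 string, returning None if it is missing or malformed."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def event_duration(attributes):
    """Return end minus start as a timedelta, or None if it cannot be computed."""
    start = parse_iso(attributes.get("start_date"))
    end = parse_iso(attributes.get("end_date"))
    if start is None or end is None:
        return None
    try:
        return end - start
    except TypeError:  # one value carries a timezone and the other does not
        return None


print(event_duration({"start_date": "2024-03-01", "end_date": "2024-03-03"}))
print(event_duration({"start_date": "next week", "end_date": "2024-03-03"}))
```

The second call returns None because the LLM-style value "next week" is not an ISO date.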

Hierarchical Relationships

class Organization(BaseModel):
    """A formal organization."""
    org_type: str | None = Field(default=None, description="Type (company, nonprofit, government)")

class ReportsTo(BaseModel):
    """Organizational reporting relationship."""
    pass

# Register the edge type under the name used in the map
edge_types = {
    "REPORTS_TO": ReportsTo
}

edge_type_map = {
    ("Person", "Person"): ["REPORTS_TO"],
    ("Organization", "Organization"): ["REPORTS_TO"]  # Org hierarchy
}
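
Once REPORTS_TO edges have been extracted, a hierarchy can be walked upward from any node. The sketch below works on plain (source, relationship, target) tuples rather than Graphiti edge objects and assumes each node reports to at most one other node; reporting_chain is a hypothetical helper and the data is invented.

```python
def reporting_chain(edges, start):
    """Follow REPORTS_TO edges upward from start, stopping if a cycle appears."""
    manager_of = {src: dst for src, name, dst in edges if name == "REPORTS_TO"}
    chain = [start]
    seen = {start}
    current = start
    while current in manager_of:
        current = manager_of[current]
        if current in seen:  # guard against circular reporting data
            break
        chain.append(current)
        seen.add(current)
    return chain


edges = [
    ("Alice", "REPORTS_TO", "Bob"),
    ("Bob", "REPORTS_TO", "Carol"),
    ("Alice", "WORKS_FOR", "TechCorp"),  # ignored: not a reporting edge
]
print(reporting_chain(edges, "Alice"))
```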

Validation

Custom entity types are validated when provided:
try:
    await graphiti.add_episode(
        name="Test",
        episode_body="Content",
        source=EpisodeType.text,
        source_description="Test",
        reference_time=datetime.now(timezone.utc),
        entity_types={"InvalidType": "not a BaseModel"}  # Will raise error
    )
except Exception as e:  # the exact exception class may differ between versions
    print(f"Validation error: {e}")

Next Steps

  • Adding Episodes - Use custom entities when adding content.
  • Searching - Search for entities by type and attributes.
  • Bulk Operations - Use custom types in bulk ingestion.
