Graphiti allows you to define custom entity types and relationships to extract domain-specific information from your content. By default, Graphiti extracts generic Entity nodes, but you can define structured types with specific attributes.
Why Custom Entities?
Custom entity definitions enable:
Structured Extraction - Extract specific attributes (e.g., first_name, occupation)
Type Safety - Ensure entities have required fields
Domain Modeling - Model your specific domain (e.g., Products, Locations, Events)
Better Search - Filter and query by entity type and attributes
Schema Enforcement - Validate extracted data against your schema
Defining Entity Types
Entity types are defined using Pydantic BaseModel classes:
from pydantic import BaseModel, Field


class Person(BaseModel):
    """A human person, fictional or nonfictional."""

    first_name: str | None = Field(default=None, description="First name")
    last_name: str | None = Field(default=None, description="Last name")
    occupation: str | None = Field(default=None, description="The person's work occupation")


class City(BaseModel):
    """A city or town."""

    country: str | None = Field(default=None, description="The country the city is in")
Entity type names (class names) become labels in the graph. The class docstring helps the LLM understand when to use this type.
Using Custom Entities
Pass your entity types when adding episodes:
from datetime import datetime, timezone

from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType

graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
)

result = await graphiti.add_episode(
    name="Team Introduction",
    episode_body="Sarah Johnson is a software engineer. She works in San Francisco.",
    source=EpisodeType.text,
    source_description="Team directory",
    reference_time=datetime.now(timezone.utc),
    entity_types={"Person": Person, "City": City},
)

# Extracted entities will have structured attributes
for node in result.nodes:
    print(f"Entity: {node.name}")
    print(f"Type: {node.labels}")
    print(f"Attributes: {node.attributes}")
    # Example attributes for the Person node (LLM output may vary):
    # {'first_name': 'Sarah', 'last_name': 'Johnson', 'occupation': 'software engineer'}
Entity Type Dictionary
The entity_types parameter is a dictionary mapping type names to Pydantic models:
# Company and Product are Pydantic models defined in the same way as Person and City
entity_types = {
    "Person": Person,
    "City": City,
    "Company": Company,
    "Product": Product,
}

await graphiti.add_episode(
    name="Episode",
    episode_body="...",
    source=EpisodeType.text,
    source_description="...",
    reference_time=datetime.now(timezone.utc),
    entity_types=entity_types,
)
Defining Relationship Types
Define custom edge types to model specific relationships:
class IsPresidentOf(BaseModel):
    """Relationship between a person and the entity they are president of."""

    pass


class InterpersonalRelationship(BaseModel):
    """A relationship between two people (e.g., knows, works with, interviewed)."""

    pass


class LocatedIn(BaseModel):
    """A relationship indicating something is located in or associated with a place."""

    pass
Relationship types can be empty classes. The class name and docstring guide the LLM on when to use each type.
Edge Type Mapping
Define which relationships can exist between entity types using an edge type map:
edge_types = {
    "IS_PRESIDENT_OF": IsPresidentOf,
    "INTERPERSONAL_RELATIONSHIP": InterpersonalRelationship,
    "LOCATED_IN": LocatedIn,
}

edge_type_map = {
    ("Person", "Entity"): ["IS_PRESIDENT_OF", "INTERPERSONAL_RELATIONSHIP"],
    ("Person", "Person"): ["INTERPERSONAL_RELATIONSHIP"],
    ("Person", "City"): ["LOCATED_IN"],
    ("Entity", "City"): ["LOCATED_IN"],
}

await graphiti.add_episode(
    name="Political Episode",
    episode_body="President Biden works closely with Vice President Harris.",
    source=EpisodeType.text,
    source_description="News article",
    reference_time=datetime.now(timezone.utc),
    entity_types={"Person": Person, "City": City},
    edge_types=edge_types,
    edge_type_map=edge_type_map,
)
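A typo in the map (a misspelled entity label or an edge name missing from edge_types) is easy to overlook. The helper below is a minimal sketch of a consistency check you could run before ingestion. check_edge_type_map is a hypothetical name, not part of Graphiti, and the placeholder classes stand in for real Pydantic models so the example runs on its own.

```python
def check_edge_type_map(entity_types, edge_types, edge_type_map):
    """Return a list of problems found; an empty list means the map is consistent."""
    # "Entity" is the default label, so it is always allowed in a pair.
    known_labels = set(entity_types) | {"Entity"}
    problems = []
    for pair, edge_names in edge_type_map.items():
        for label in sorted(set(pair) - known_labels):
            problems.append(f"{pair}: unknown entity type {label!r}")
        for edge_name in edge_names:
            if edge_name not in edge_types:
                problems.append(f"{pair}: undefined edge type {edge_name!r}")
    return problems


# Placeholder classes stand in for the Pydantic models (assumption for this sketch).
class Person: ...
class City: ...
class LocatedIn: ...
class InterpersonalRelationship: ...


entity_types = {"Person": Person, "City": City}
edge_types = {
    "LOCATED_IN": LocatedIn,
    "INTERPERSONAL_RELATIONSHIP": InterpersonalRelationship,
}
edge_type_map = {
    ("Person", "Person"): ["INTERPERSONAL_RELATIONSHIP"],
    ("Person", "City"): ["LOCATED_IN"],
    ("Entity", "City"): ["LOCATED_IN"],
}

print(check_edge_type_map(entity_types, edge_types, edge_type_map))  # []

# A map with a misspelled label and an unregistered edge name
typo_map = {("Person", "Town"): ["LIVES_IN"]}
for problem in check_edge_type_map(entity_types, edge_types, typo_map):
    print(problem)
```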
Complete Example
Here’s a full example with custom entities and relationships:
import asyncio
from datetime import datetime, timezone

from pydantic import BaseModel, Field

from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType


# Define entity types
class Person(BaseModel):
    """A human person."""

    first_name: str | None = Field(default=None, description="First name")
    last_name: str | None = Field(default=None, description="Last name")
    occupation: str | None = Field(default=None, description="Work occupation")


class City(BaseModel):
    """A city."""

    country: str | None = Field(default=None, description="The country the city is in")


class Company(BaseModel):
    """A business organization."""

    industry: str | None = Field(default=None, description="Industry sector")
    founded_year: int | None = Field(default=None, description="Year founded")


# Define relationship types
class WorksFor(BaseModel):
    """Employment relationship."""

    pass


class LocatedIn(BaseModel):
    """Location relationship."""

    pass


# Create mappings
entity_types = {
    "Person": Person,
    "City": City,
    "Company": Company,
}

edge_types = {
    "WORKS_FOR": WorksFor,
    "LOCATED_IN": LocatedIn,
}

edge_type_map = {
    ("Person", "Company"): ["WORKS_FOR"],
    ("Company", "City"): ["LOCATED_IN"],
    ("Person", "City"): ["LOCATED_IN"],
}


async def main():
    # Use in episode
    graphiti = Graphiti(
        uri="bolt://localhost:7687",
        user="neo4j",
        password="password",
    )

    result = await graphiti.add_episode(
        name="Company Profile",
        episode_body="""
        Alice Chen is a software engineer at TechCorp.
        TechCorp is a technology company founded in 2015.
        TechCorp is headquartered in San Francisco, California.
        """,
        source=EpisodeType.text,
        source_description="Company directory",
        reference_time=datetime.now(timezone.utc),
        entity_types=entity_types,
        edge_types=edge_types,
        edge_type_map=edge_type_map,
    )

    # Examine results
    for node in result.nodes:
        print(f"\nEntity: {node.name}")
        print(f"Labels: {node.labels}")
        print(f"Attributes: {node.attributes}")

    for edge in result.edges:
        print(f"\nRelationship: {edge.name}")
        print(f"Fact: {edge.fact}")


# add_episode is a coroutine, so run it inside an event loop
asyncio.run(main())
Excluding Entity Types
Exclude certain entity types from extraction:
await graphiti.add_episode(
    name="Example",
    episode_body="Content with various entities",
    source=EpisodeType.text,
    source_description="Source",
    reference_time=datetime.now(timezone.utc),
    entity_types={"Person": Person, "City": City},
    excluded_entity_types=["City"],  # Don't extract cities
)
To exclude the default Entity type:
excluded_entity_types = ["Entity"]  # Only extract custom types
Field Descriptions
Provide clear field descriptions to guide extraction:
class Product(BaseModel):
    """A commercial product."""

    # Named product_name rather than name, which may clash with the node's built-in name field
    product_name: str | None = Field(
        default=None,
        description="The product's commercial name or model number",
    )
    category: str | None = Field(
        default=None,
        description="Product category (e.g., electronics, clothing, food)",
    )
    price: float | None = Field(
        default=None,
        description="Price in USD",
    )
    manufacturer: str | None = Field(
        default=None,
        description="Name of the company that manufactures this product",
    )
Good descriptions improve extraction accuracy. Be specific about format, units, and expected values.
Optional vs Required Fields
All fields should be optional (| None) and default to None, since the LLM may not always extract every attribute. The | None annotation alone is not enough in Pydantic v2: a field without a default is still required.
class Person(BaseModel):
    # Good: optional fields with an explicit default of None
    first_name: str | None = Field(default=None, description="First name")
    last_name: str | None = Field(default=None, description="Last name")

    # Avoid: required fields
    # first_name: str = Field(description="First name")  # May fail extraction
Attribute Access
Extracted attributes are stored in the attributes dict:
for node in result.nodes:
    if "Person" in node.labels:
        first_name = node.attributes.get("first_name")
        last_name = node.attributes.get("last_name")
        occupation = node.attributes.get("occupation")
        print(f"{first_name} {last_name} - {occupation}")
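Since any attribute may be missing, printing values directly can produce output such as "None None - None". The helper below is a minimal sketch of a more defensive summary. people_summaries is a hypothetical function, and SimpleNamespace objects stand in for Graphiti nodes, assuming only the name, labels, and attributes fields used above.

```python
from types import SimpleNamespace


def people_summaries(nodes):
    """Build one 'Full Name - occupation' line per Person node."""
    lines = []
    for node in nodes:
        if "Person" not in node.labels:
            continue
        attrs = node.attributes
        # Join only the name parts that were actually extracted.
        full_name = " ".join(
            part for part in (attrs.get("first_name"), attrs.get("last_name")) if part
        )
        occupation = attrs.get("occupation") or "unknown"
        # Fall back to the node's own name when no name attributes exist.
        lines.append(f"{full_name or node.name} - {occupation}")
    return lines


# Stand-in nodes for illustration
nodes = [
    SimpleNamespace(
        name="Sarah Johnson",
        labels=["Entity", "Person"],
        attributes={
            "first_name": "Sarah",
            "last_name": "Johnson",
            "occupation": "software engineer",
        },
    ),
    SimpleNamespace(name="San Francisco", labels=["Entity", "City"], attributes={}),
    SimpleNamespace(name="Bob", labels=["Entity", "Person"], attributes={}),
]

for line in people_summaries(nodes):
    print(line)
```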
Bulk Operations with Custom Types
Custom types work with bulk episode ingestion:
from graphiti_core.utils.bulk_utils import RawEpisode

episodes = [
    RawEpisode(
        name="Episode 1",
        content="Alice is an engineer at Google.",
        source=EpisodeType.text,
        source_description="Bio",
        reference_time=datetime.now(timezone.utc),
    ),
    RawEpisode(
        name="Episode 2",
        content="Bob is a designer at Apple.",
        source=EpisodeType.text,
        source_description="Bio",
        reference_time=datetime.now(timezone.utc),
    ),
]

result = await graphiti.add_episode_bulk(
    bulk_episodes=episodes,
    entity_types={"Person": Person, "Company": Company},
    edge_types=edge_types,
    edge_type_map=edge_type_map,
)
Best Practices
Clear Docstrings - Write descriptive docstrings to help the LLM identify when to use each type
Specific Descriptions - Provide detailed field descriptions, including format and units
Optional Fields - Make all fields optional to handle incomplete extractions
Consistent Naming - Use consistent naming conventions for types and fields
Common Patterns
Temporal Entities
class Event(BaseModel):
    """A notable event or occurrence."""

    event_type: str | None = Field(default=None, description="Type of event (meeting, conference, etc.)")
    start_date: str | None = Field(default=None, description="When the event started (ISO format)")
    end_date: str | None = Field(default=None, description="When the event ended (ISO format)")
    location: str | None = Field(default=None, description="Where the event took place")
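Because the dates are stored as strings, the LLM may return values that are missing or not valid ISO 8601. The sketch below shows one way to parse them defensively after extraction. parse_iso and event_duration_days are hypothetical helpers, and the attribute dictionaries are made-up examples rather than real extraction output.

```python
from datetime import datetime


def parse_iso(value):
    """Return a datetime for an ISO 8601 string, or None if absent or malformed."""
    if not isinstance(value, str) or not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def event_duration_days(attributes):
    """Whole days between start_date and end_date, or None if unavailable."""
    start = parse_iso(attributes.get("start_date"))
    end = parse_iso(attributes.get("end_date"))
    if start is None or end is None:
        return None
    try:
        return (end - start).days
    except TypeError:
        # One value carried a timezone offset and the other did not.
        return None


print(event_duration_days({"start_date": "2024-03-04", "end_date": "2024-03-06"}))  # 2
print(event_duration_days({"start_date": "next week", "end_date": "2024-03-06"}))  # None
```

Note that datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 onward, so on older versions such values are treated as malformed here.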
Hierarchical Relationships
class Organization(BaseModel):
    """A formal organization."""

    org_type: str | None = Field(default=None, description="Type (company, nonprofit, government)")


class ReportsTo(BaseModel):
    """Organizational reporting relationship."""

    pass


# Register the edge type under the name used in the map below
edge_types = {"REPORTS_TO": ReportsTo}

edge_type_map = {
    ("Person", "Person"): ["REPORTS_TO"],
    ("Organization", "Organization"): ["REPORTS_TO"],  # Org hierarchy
}
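Once reporting edges have been extracted, a hierarchy can be followed upward by walking them. The sketch below uses stand-in objects rather than real Graphiti edges. It assumes edges expose name, source_node_uuid, and target_node_uuid, that the edge name matches the "REPORTS_TO" key, and that each node reports to at most one other node. The helper names are hypothetical.

```python
from types import SimpleNamespace


def build_reports_to(edges):
    """Map each source node uuid to the uuid it reports to."""
    return {
        edge.source_node_uuid: edge.target_node_uuid
        for edge in edges
        if edge.name == "REPORTS_TO"
    }


def reporting_chain(reports_to, start):
    """Follow REPORTS_TO links upward from start, stopping if a cycle appears."""
    chain = [start]
    while chain[-1] in reports_to:
        manager = reports_to[chain[-1]]
        if manager in chain:  # guard against cyclic data
            break
        chain.append(manager)
    return chain


# Stand-in edges for illustration; the uuids are shortened to readable names.
edges = [
    SimpleNamespace(name="REPORTS_TO", source_node_uuid="alice", target_node_uuid="bob"),
    SimpleNamespace(name="REPORTS_TO", source_node_uuid="bob", target_node_uuid="carol"),
    SimpleNamespace(name="WORKS_FOR", source_node_uuid="alice", target_node_uuid="techcorp"),
]

print(reporting_chain(build_reports_to(edges), "alice"))  # ['alice', 'bob', 'carol']
```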
Validation
Custom entity types are validated when provided:
try:
    await graphiti.add_episode(
        name="Test",
        episode_body="Content",
        source=EpisodeType.text,
        source_description="Test",
        reference_time=datetime.now(timezone.utc),
        entity_types={"InvalidType": "not a BaseModel"},  # Will raise error
    )
except ValueError as e:
    print(f"Validation error: {e}")
Next Steps
Adding Episodes - Use custom entities when adding content
Searching - Search for entities by type and attributes
Bulk Operations - Use custom types in bulk ingestion