
What are Communities?

Communities in Graphiti are automatically detected clusters of related entities discovered through graph analysis. Think of them as topics or themes that emerge organically from your knowledge graph without manual categorization. For example, in a graph containing information about California politics, Graphiti might detect:
  • A community of “California State Government Officials”
  • A community of “San Francisco Political Figures”
  • A community of “Technology Industry Leaders”

How Communities Work

Label Propagation Algorithm

Graphiti uses label propagation, a graph clustering algorithm that groups entities based on their connectivity:
  1. Initialize: Each entity starts in its own community
  2. Propagate: Each entity adopts the most common community label among its neighbors
  3. Iterate: Repeat until community assignments stabilize
  4. Summarize: Generate natural language descriptions for each community
# From graphiti_core/utils/maintenance/community_operations.py:92
def label_propagation(projection: dict[str, list[Neighbor]]) -> list[list[str]]:
    """Implement the label propagation community detection algorithm.
    
    1. Start with each node being assigned its own community
    2. Each node will take on the community of the plurality of its neighbors
    3. Ties are broken by going to the largest community
    4. Continue until no communities change during propagation
    """

Example

Consider these entities and relationships:
Kamala Harris --[worked_with]--> Gavin Newsom
Kamala Harris --[succeeded]--> Jerry Brown
Gavin Newsom --[worked_with]--> Jerry Brown
Gavin Newsom --[appointed_by]--> Willie Brown
Label propagation will cluster these into a single community because they’re densely connected.
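The four steps can be tried on this example with a small, self-contained sketch. It is a simplification for illustration, not Graphiti's implementation: the `Neighbor` dataclass here is a local stand-in, every edge is given a count of 1, updates are applied in place in sorted order, and ties are broken by keeping the current label or else taking the lowest winning label.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbor:
    """Local stand-in for a projection entry (assumed shape)."""
    node_uuid: str
    edge_count: int


def label_propagation_sketch(
    projection: dict[str, list[Neighbor]], max_iterations: int = 100
) -> list[list[str]]:
    # 1. Every node starts in its own community, labeled by an integer.
    community_map = {uuid: i for i, uuid in enumerate(sorted(projection))}

    for _ in range(max_iterations):
        changed = False
        for uuid in sorted(projection):
            # 2. Neighbors vote for their current label, weighted by edge count.
            votes: dict[int, int] = defaultdict(int)
            for neighbor in projection[uuid]:
                if neighbor.node_uuid in community_map:
                    votes[community_map[neighbor.node_uuid]] += neighbor.edge_count
            if not votes:
                continue  # an isolated node keeps its own label
            best = max(votes.values())
            winners = sorted(label for label, count in votes.items() if count == best)
            current = community_map[uuid]
            # 3. Simplified tie-break: keep the current label if it is tied for
            #    first place, otherwise take the lowest winning label.
            new_label = current if current in winners else winners[0]
            if new_label != current:
                community_map[uuid] = new_label
                changed = True
        # 4. Stop once a full pass changes nothing.
        if not changed:
            break

    clusters: dict[int, list[str]] = defaultdict(list)
    for uuid, label in community_map.items():
        clusters[label].append(uuid)
    return sorted(sorted(members) for members in clusters.values())


example = {
    "Kamala Harris": [Neighbor("Gavin Newsom", 1), Neighbor("Jerry Brown", 1)],
    "Gavin Newsom": [
        Neighbor("Kamala Harris", 1),
        Neighbor("Jerry Brown", 1),
        Neighbor("Willie Brown", 1),
    ],
    "Jerry Brown": [Neighbor("Kamala Harris", 1), Neighbor("Gavin Newsom", 1)],
    "Willie Brown": [Neighbor("Gavin Newsom", 1)],
}
clusters = label_propagation_sketch(example)
```

Under these assumptions, all four entities end up in one cluster after the first pass, and the second pass confirms that nothing changes.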

Community Schema

CommunityNode

class CommunityNode(Node):
    uuid: str
    name: str                         # Auto-generated description
    name_embedding: list[float] | None # Vector embedding for search
    summary: str                      # Aggregate summary of members
    group_id: str
    created_at: datetime

Example

CommunityNode(
    uuid="community-uuid-123",
    name="California State Government Leadership",
    summary="This community includes political figures who have held senior positions in California state government, including Governor, Attorney General, and Lieutenant Governor. Key members have worked together in various capacities and many have connections to San Francisco politics.",
    group_id="default",
    created_at=datetime(2024, 3, 15, tzinfo=timezone.utc)
)

CommunityEdge (HAS_MEMBER)

class CommunityEdge(Edge):
    uuid: str
    source_node_uuid: str  # CommunityNode UUID
    target_node_uuid: str  # EntityNode UUID
    group_id: str
    created_at: datetime
These edges link communities to their member entities.
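Because each edge points from a community to one member, a membership map falls out of a simple group-by. The sketch below uses a hypothetical `MemberEdge` dataclass that mirrors only the two UUID fields of the schema above; it does not touch the database or the real `CommunityEdge` class.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberEdge:
    """Hypothetical stand-in for a HAS_MEMBER edge."""
    source_node_uuid: str  # community UUID
    target_node_uuid: str  # entity UUID


def members_by_community(edges: list[MemberEdge]) -> dict[str, list[str]]:
    # Group entity UUIDs under the community they belong to.
    grouped: dict[str, list[str]] = defaultdict(list)
    for edge in edges:
        grouped[edge.source_node_uuid].append(edge.target_node_uuid)
    return dict(grouped)


edges = [
    MemberEdge("community-1", "kamala-uuid"),
    MemberEdge("community-1", "gavin-uuid"),
    MemberEdge("community-2", "willie-uuid"),
]
membership = members_by_community(edges)
```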

Building Communities

Bulk Community Detection

Generate communities for all entities in your graph:
from graphiti_core.utils.maintenance.community_operations import build_communities

# Build communities for all groups
community_nodes, community_edges = await build_communities(
    driver=graphiti.driver,
    llm_client=graphiti.llm_client,
    group_ids=None  # Process all groups
)

# Build communities for specific groups
community_nodes, community_edges = await build_communities(
    driver=graphiti.driver,
    llm_client=graphiti.llm_client,
    group_ids=["user_123", "user_456"]
)

print(f"Detected {len(community_nodes)} communities")
Community building is a compute-intensive operation. For large graphs, consider running it as a background job or on a schedule rather than after every episode.

Incremental Community Updates

Update communities as you add new entities:
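from datetime import datetime, timezone

from graphiti_core.nodes import EpisodeType

# Add episode with community updates enabled
result = await graphiti.add_episode(
    name="New Information",
    episode_body="Alice Chen joined the Platform Engineering team.",
    source=EpisodeType.text,
    source_description="Team announcement",  # illustrative value; describes where the episode came from
    reference_time=datetime.now(timezone.utc),
    update_communities=True  # Incrementally update communities
)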

# Access updated communities
communities = result.communities
community_edges = result.community_edges
Setting update_communities=True adds latency to add_episode(). Use it when you need real-time community updates; otherwise, run build_communities() periodically.

Community Generation Process

1. Graph Projection

Graphiti builds a weighted graph projection:
# For each entity, get connected entities and edge counts
projection: dict[str, list[Neighbor]] = {}

for node in nodes:
    # Find neighbors and count connections
    neighbors = await get_neighbors(node)
    projection[node.uuid] = neighbors
Example projection:
{
    "kamala-uuid": [
        Neighbor(node_uuid="gavin-uuid", edge_count=3),
        Neighbor(node_uuid="jerry-uuid", edge_count=2)
    ],
    "gavin-uuid": [
        Neighbor(node_uuid="kamala-uuid", edge_count=3),
        Neighbor(node_uuid="willie-uuid", edge_count=1)
    ]
}
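One way to arrive at a projection of this shape is to count the edges between each pair of entities. The sketch below assumes edges arrive as plain `(source, target)` tuples and treats them as undirected; `build_projection` and the local `Neighbor` dataclass are illustrative names, not part of Graphiti's API.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbor:
    """Local stand-in for a projection entry (assumed shape)."""
    node_uuid: str
    edge_count: int


def build_projection(
    node_uuids: list[str], edges: list[tuple[str, str]]
) -> dict[str, list[Neighbor]]:
    # Count edges per unordered pair so direction does not matter;
    # self-loops are skipped.
    pair_counts = Counter(frozenset(edge) for edge in edges if edge[0] != edge[1])
    adjacency: dict[str, dict[str, int]] = defaultdict(dict)
    for pair, count in pair_counts.items():
        a, b = sorted(pair)
        adjacency[a][b] = count
        adjacency[b][a] = count
    # Every node gets an entry, even if it has no neighbors.
    return {
        uuid: [Neighbor(n, c) for n, c in sorted(adjacency[uuid].items())]
        for uuid in node_uuids
    }


projection = build_projection(
    ["kamala", "gavin", "jerry", "willie"],
    [
        ("kamala", "gavin"),
        ("gavin", "kamala"),
        ("kamala", "gavin"),
        ("kamala", "jerry"),
        ("gavin", "willie"),
    ],
)
```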

2. Cluster Detection

cluster_uuids = label_propagation(projection)
# Returns: [["kamala-uuid", "gavin-uuid", "jerry-uuid"], ["willie-uuid", ...]]

3. Summary Generation

For each cluster, Graphiti:
  1. Collects entity summaries:
    summaries = [entity.summary for entity in community_cluster]
    
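  2. Hierarchically merges summaries using the LLM:
    # Pair-wise summarization; an unpaired summary carries over to the next round
    while len(summaries) > 1:
        leftover = summaries[-1:] if len(summaries) % 2 else []
        summary_pairs = [(summaries[i], summaries[i+1])
                         for i in range(0, len(summaries)-1, 2)]
        summaries = [await summarize_pair(llm_client, pair)
                     for pair in summary_pairs] + leftover
    final_summary = summaries[0]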
    
  3. Generates community name:
    name = await generate_summary_description(llm_client, final_summary)
    # Returns: "California State Government Leadership"
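To see how the pairwise merge behaves without calling an LLM, the sketch below swaps in a stub summarizer that just joins its inputs. `merge_summaries` and `fake_summarize_pair` are hypothetical helpers written for this illustration; the point is the shape of the reduction and the number of merge calls it makes.

```python
import asyncio


async def merge_summaries(summaries: list[str], summarize_pair) -> tuple[str, int]:
    """Reduce summaries to one by merging adjacent pairs round by round."""
    if not summaries:
        raise ValueError("at least one summary is required")
    calls = 0
    current = list(summaries)
    while len(current) > 1:
        # An unpaired last element is carried into the next round unchanged.
        leftover = [current[-1]] if len(current) % 2 == 1 else []
        pairs = [(current[i], current[i + 1]) for i in range(0, len(current) - 1, 2)]
        # Pairs within one round are independent, so they can run concurrently.
        merged = await asyncio.gather(*(summarize_pair(pair) for pair in pairs))
        calls += len(pairs)
        current = list(merged) + leftover
    return current[0], calls


async def fake_summarize_pair(pair: tuple[str, str]) -> str:
    # Stand-in for an LLM call: joins the two inputs.
    return f"({pair[0]}+{pair[1]})"


final_summary, llm_calls = asyncio.run(
    merge_summaries(["a", "b", "c", "d", "e"], fake_summarize_pair)
)
```

With five inputs the stub is called four times over three rounds, which reflects the general pattern: M summaries need M - 1 merges.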
    

4. Embedding Generation

await community_node.generate_name_embedding(embedder)
This enables semantic search over communities.
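As a rough picture of what searching over name embeddings involves, the sketch below ranks communities by cosine similarity to a query vector. The three-dimensional vectors are made up for illustration, and `rank_communities` is a hypothetical helper; real embeddings come from your embedder and Graphiti's own search handles the ranking.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_communities(
    query_embedding: list[float], name_embeddings: dict[str, list[float]]
) -> list[str]:
    # Highest similarity first.
    return sorted(
        name_embeddings,
        key=lambda name: cosine_similarity(query_embedding, name_embeddings[name]),
        reverse=True,
    )


# Toy vectors, not real embeddings.
name_embeddings = {
    "Technology Industry Leaders": [0.1, 0.9, 0.2],
    "California State Government Leadership": [0.9, 0.1, 0.0],
}
ranked = rank_communities([1.0, 0.0, 0.0], name_embeddings)
```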

Querying Communities

Find All Communities

communities = await CommunityNode.get_by_group_ids(
    driver=graphiti.driver,
    group_ids=["default"],
    limit=50
)

for community in communities:
    print(f"{community.name}: {community.summary}")

Get Community Members

# Query members of a specific community
records, _, _ = await graphiti.driver.execute_query("""
    MATCH (c:Community {uuid: $community_uuid})-[:HAS_MEMBER]->(e:Entity)
    RETURN e.uuid AS uuid, e.name AS name, e.summary AS summary
""", community_uuid=community.uuid)

members = [record['name'] for record in records]
print(f"Community '{community.name}' has {len(members)} members: {members}")

Find Entity’s Community

# Find which community an entity belongs to
records, _, _ = await graphiti.driver.execute_query("""
    MATCH (c:Community)-[:HAS_MEMBER]->(e:Entity {uuid: $entity_uuid})
    RETURN c.uuid AS uuid, c.name AS name, c.summary AS summary
""", entity_uuid=alice.uuid)

if records:
    community_name = records[0]['name']
    print(f"Alice belongs to: {community_name}")

Updating Communities

Update Specific Community

When a new entity joins a community:
from graphiti_core.utils.maintenance.community_operations import update_community

# Update community to include new entity
communities, edges = await update_community(
    driver=graphiti.driver,
    llm_client=graphiti.llm_client,
    embedder=graphiti.embedder,
    entity=new_entity_node
)
This:
  1. Determines which community the entity belongs to (based on neighbors)
  2. Merges the entity’s summary with the community summary
  3. Regenerates the community name
  4. Updates embeddings
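Step 1 can be pictured as a vote among the new entity's neighbors. The sketch below is one plausible reading of "based on neighbors", not a copy of Graphiti's logic: `pick_community` is a hypothetical helper that returns the most common community among neighbors that already have one.

```python
from collections import Counter
from typing import Optional


def pick_community(
    neighbor_uuids: list[str], community_of: dict[str, str]
) -> Optional[str]:
    # Count the communities of neighbors that already belong to one.
    counts = Counter(
        community_of[uuid] for uuid in neighbor_uuids if uuid in community_of
    )
    if not counts:
        return None  # no neighbor is in a community yet
    # most_common(1) returns the most frequent community; equal counts
    # resolve to the one encountered first.
    return counts.most_common(1)[0][0]


community_of = {"bob": "engineering", "carol": "engineering", "dave": "sales"}
chosen = pick_community(["bob", "carol", "dave", "erin"], community_of)
```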

Rebuild All Communities

Remove old communities and regenerate:
from graphiti_core.utils.maintenance.community_operations import remove_communities

# Remove all existing communities
await remove_communities(driver=graphiti.driver)

# Rebuild from scratch
community_nodes, community_edges = await build_communities(
    driver=graphiti.driver,
    llm_client=graphiti.llm_client,
    group_ids=None
)
Rebuilding communities from scratch is useful after major changes to your graph structure or when community quality degrades over time.

Community-Based Search

Communities enable high-level topic retrieval:
# Search for communities (not individual entities)
results = await graphiti.search(
    query="California political leadership",
    # Search will match against community names and summaries
)

# Results may include CommunityNodes
for result in results:
    if isinstance(result, CommunityNode):
        print(f"Found community: {result.name}")
        print(f"Summary: {result.summary}")

Expand Community to Members

# Find a relevant community
community_results = await graphiti.search("tech industry leaders")
community_uuid = community_results[0].uuid

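# Get all members (execute_query returns a tuple; the first item holds the records)
records, _, _ = await graphiti.driver.execute_query("""
    MATCH (c:Community {uuid: $uuid})-[:HAS_MEMBER]->(e:Entity)
    RETURN e
""", uuid=community_uuid)
members = [record["e"] for record in records]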

Use Cases

1. Topic Discovery

Identify themes in your knowledge graph:
communities = await CommunityNode.get_by_group_ids(
    driver=graphiti.driver,
    group_ids=["research_papers"]
)

print("Discovered research topics:")
for community in communities:
    print(f"- {community.name}")

2. Hierarchical Navigation

Browse from communities down to entities:
User asks: "What do you know about California politics?"
→ Show communities: "California State Government Leadership", "San Francisco Political Figures"
→ User selects: "California State Government Leadership"
→ Show members: Kamala Harris, Gavin Newsom, Jerry Brown
→ User selects: Kamala Harris
→ Show relationships: worked with Gavin Newsom, succeeded Jerry Brown

3. Contextual Retrieval

Use communities to scope search:
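# Find the "Engineering" community
eng_community = await CommunityNode.get_by_uuid(graphiti.driver, uuid="eng-comm-uuid")

# Get the UUIDs of all entities in that community via HAS_MEMBER edges
records, _, _ = await graphiti.driver.execute_query("""
    MATCH (c:Community {uuid: $uuid})-[:HAS_MEMBER]->(e:Entity)
    RETURN e.uuid AS uuid
""", uuid=eng_community.uuid)
member_uuids = {record["uuid"] for record in records}

# Search, then keep only edges connected to engineering community members
# (assumes each result exposes source_node_uuid and target_node_uuid)
results = await graphiti.search(query="who is working on the platform?")
scoped_results = [
    edge for edge in results
    if edge.source_node_uuid in member_uuids or edge.target_node_uuid in member_uuids
]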

4. Graph Summarization

Provide high-level overviews:
communities = await CommunityNode.get_by_group_ids(
    driver=graphiti.driver,
    group_ids=["customer_data"]
)

print(f"Your graph contains {len(communities)} main topics:")
for community in communities[:5]:  # Top 5
    print(f"\n{community.name}")
    print(f"{community.summary}")

Performance Considerations

Concurrency Control

Community building is parallelized but rate-limited:
# From community_operations.py:19
MAX_COMMUNITY_BUILD_CONCURRENCY = 10

# Limits concurrent LLM calls for summarization
semaphore = asyncio.Semaphore(MAX_COMMUNITY_BUILD_CONCURRENCY)
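The effect of the semaphore can be observed with a stub in place of the LLM call. In the sketch below, `run_bounded` and `fake_summarize` are hypothetical helpers, and the limit of 3 is chosen only to keep the demo small; it tracks the peak number of jobs running at once to show that the cap holds.

```python
import asyncio

MAX_CONCURRENCY = 3  # illustrative value for the demo


async def run_bounded(jobs, limit: int = MAX_CONCURRENCY) -> tuple[list, int]:
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def guarded(job):
        nonlocal active, peak
        async with semaphore:  # at most `limit` jobs pass this point at once
            active += 1
            peak = max(peak, active)
            try:
                return await job()
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return list(results), peak


async def fake_summarize(index: int) -> str:
    await asyncio.sleep(0.01)  # stand-in for LLM latency
    return f"summary-{index}"


async def main() -> tuple[list, int]:
    jobs = [lambda i=i: fake_summarize(i) for i in range(10)]
    return await run_bounded(jobs)


results, peak = asyncio.run(main())
```

All ten jobs complete and results keep their input order, while `peak` never exceeds the limit.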

Graph Size Impact

For large graphs (>10,000 entities):
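  • Label propagation costs O(E) per iteration, where E = number of edges
  • Summary generation makes about M - 1 LLM calls per community (roughly C * M in total), spread over about log2(M) sequential merge rounds, where C = communities, M = avg members per community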
For graphs with >50,000 entities, consider:
  • Running community detection on subgraphs (per group_id)
  • Using scheduled batch jobs instead of real-time updates
  • Increasing MAX_COMMUNITY_BUILD_CONCURRENCY if your LLM provider allows higher throughput

Algorithm Details

Edge Weight Consideration

The label propagation algorithm weights neighbor votes by edge count:
community_candidates: dict[int, int] = defaultdict(int)
for neighbor in neighbors:
    community_candidates[community_map[neighbor.node_uuid]] += neighbor.edge_count
A community that shares more edges with an entity casts a proportionally stronger vote for that entity's label.

Tie Breaking

When multiple communities have equal votes:
community_lst.sort(reverse=True)  # Sort by count descending
candidate_rank, community_candidate = community_lst[0]
The largest community wins ties, favoring consolidation.
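It can help to see what a descending sort does when counts are equal. The sketch below assumes `community_lst` holds `(vote_count, community_id)` tuples, which the unpacking in the snippet above suggests but does not show; `pick_candidate` is a hypothetical wrapper. Python compares tuples element by element, so equal counts fall through to the second element.

```python
def pick_candidate(community_candidates: dict[int, int]) -> tuple[int, int]:
    # Build (vote_count, community_id) tuples so the count is compared first.
    community_lst = [
        (count, community) for community, count in community_candidates.items()
    ]
    community_lst.sort(reverse=True)  # highest count first, then highest id
    return community_lst[0]


tied = pick_candidate({1: 4, 2: 4, 3: 1})
clear_winner = pick_candidate({7: 2, 9: 5})
```

Under this assumption a tie between communities 1 and 2 goes to community 2, because the second tuple element decides once the counts match.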

Convergence

The algorithm terminates when assignments stop changing:
while True:
    no_change = True
    # ... propagation logic ...
    if no_change:
        break
For most graphs, the algorithm converges in 5-10 iterations.

Best Practices

1. Run Community Detection Periodically

# In a scheduled job (e.g., daily)
async def rebuild_communities_job(driver, llm_client):
    # Drop stale communities first so the rebuild starts clean
    await remove_communities(driver)
    communities, edges = await build_communities(
        driver, llm_client, group_ids=None
    )
    print(f"Rebuilt {len(communities)} communities")

2. Use Group IDs for Isolation

# Build communities per user
for user_id in user_ids:
    await build_communities(
        driver,
        llm_client,
        group_ids=[f"user_{user_id}"]
    )

3. Monitor Community Quality

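communities = await CommunityNode.get_by_group_ids(driver, ["default"])

# get_members is a helper you define, e.g. wrapping the HAS_MEMBER query shown earlier.
# Look up each community's size once instead of querying twice.
sizes = {c.uuid: len(await get_members(c)) for c in communities}

# Check for overly large communities (may need rebalancing)
large_communities = [c for c in communities if sizes[c.uuid] > 100]

# Check for singleton communities (may be noise)
singleton_communities = [c for c in communities if sizes[c.uuid] == 1]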
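The same checks can be expressed over a plain membership map, which is handy for testing thresholds offline. In the sketch below, `classify_communities` is a hypothetical helper and the `max_size` values are arbitrary; the 100-member threshold above is a starting point, not a rule.

```python
def classify_communities(
    members_by_community: dict[str, list[str]], max_size: int = 100
) -> dict[str, list[str]]:
    report: dict[str, list[str]] = {"oversized": [], "singleton": [], "ok": []}
    for name, members in members_by_community.items():
        if len(members) > max_size:
            report["oversized"].append(name)  # may need rebalancing
        elif len(members) == 1:
            report["singleton"].append(name)  # may be noise
        else:
            report["ok"].append(name)
    return report


report = classify_communities(
    {
        "Engineering": ["alice", "bob", "carol", "dave"],
        "Finance": ["erin", "frank"],
        "Misc": ["grace"],
    },
    max_size=3,  # small threshold so the toy data triggers every bucket
)
```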

4. Use Community Membership as a Search Signal

# Use community membership as a search signal
results = await graphiti.search(
    "software engineers",
    # Boost results that belong to "Engineering" community
)

Removing Communities

Delete all communities (keeps entities intact):
from graphiti_core.utils.maintenance.community_operations import remove_communities

await remove_communities(driver=graphiti.driver)
# Deletes all CommunityNodes and CommunityEdges
# EntityNodes and EntityEdges are preserved

Source Code Reference

Key implementation files:
  • Label propagation: graphiti_core/utils/maintenance/community_operations.py:92
  • Community building: graphiti_core/utils/maintenance/community_operations.py:217
  • Summary generation: graphiti_core/utils/maintenance/community_operations.py:140
  • Community updates: graphiti_core/utils/maintenance/community_operations.py:326

Next Steps

Build Communities

Step-by-step guide to building and using communities

Nodes and Edges

Understand the complete graph schema

Search

Use communities in search queries

Knowledge Graphs

Back to knowledge graph fundamentals
