The graph at a glance
3,205 nodes
Site origins, content categories, leaf-level tags, and 18 super-concept archetypes — all in a single authoritative JSON file.
12,880+ edges
Contains-edges, co-occurrence edges, and Hebbian-weighted activation edges that strengthen with use.
Automatic classification
Every post, idea, and media asset is tagged on ingest. No manual labeling required.
Self-improving weights
Nightly Hebbian decay strengthens recently activated nodes and allows dormant ones to fade, keeping the graph current without manual curation.
The canonical graph file lives at Nodes/Universe/taxonomy_graph.json (~5.1MB). This is the single source of truth. All other copies are derivatives. Never edit this file manually; use the scripts in scripts/taxonomy/ to regenerate it.
Node types
The taxonomy organizes knowledge across four node types arranged in a hierarchy:
| Node type | Description | Example |
|---|---|---|
| super_concept | Top-level archetypes (18 in total) | Aesthetic_Lifestyle, Intimacy_Connection |
| category | Mid-level groupings | Outdoor, Fitness, Cosplay |
| tag | Leaf-level content tags | beach, yoga, latex |
| site | Platform origin markers | onlyfans, fansly, reddit |
Super-concepts
The 18 super-concepts are the highest-level archetypes in the taxonomy. They act as anchor nodes that pull semantic weight from everything beneath them. When content is tagged with yoga, the system activates not just the yoga tag but propagates activation upward through Fitness and outward toward adjacent super-concepts like Aesthetic_Lifestyle.
The super-concepts are not editorial categories chosen by hand. They emerged from co-occurrence mining across real platform content data and represent the actual semantic clusters present in adult creator content.
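The co-occurrence mining mentioned above can be illustrated with a minimal sketch. It is not the project's mining pipeline: the input shape (an array of tag arrays), the function names, the minSupport threshold, and the sunset tag are all assumptions made for illustration. It only shows the basic idea of counting how often two tags appear on the same piece of content.

```javascript
// Count how often each unordered pair of tags appears on the same item.
// The input shape (array of tag arrays) is an assumption for illustration.
function countCoOccurrences(taggedItems) {
  const counts = new Map();
  for (const tags of taggedItems) {
    // Deduplicate and sort so (beach, yoga) and (yoga, beach) share one key.
    const unique = [...new Set(tags)].sort();
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = `${unique[i]}|${unique[j]}`;
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}

// Keep only pairs seen at least minSupport times (hypothetical threshold),
// most frequent first.
function frequentPairs(counts, minSupport) {
  return [...counts.entries()]
    .filter(([, n]) => n >= minSupport)
    .sort((a, b) => b[1] - a[1]);
}

// Made-up sample data: three tagged items.
const items = [
  ["beach", "yoga", "sunset"],
  ["yoga", "beach"],
  ["latex"],
];
console.log(frequentPairs(countCoOccurrences(items), 2));
```

Pairs that clear the support threshold are candidates for co-occurrence edges. Clustering those pairs into super-concepts is a separate step that this sketch does not attempt.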
The Directus collections
The taxonomy is reflected in two Directus collections that the MCP plugin reads and writes:
taxonomy_dimensions: 6 super-concept dimensions
Stores the six top-level classification dimensions used for structured tagging. Each dimension maps to a cluster of related super-concepts and provides the primary axis along which content is classified.
This collection is read-only at runtime. You modify it by running scripts/taxonomy/seed_taxonomy.mjs after updating the source data.
taxonomy_mappings: 3,208 classified tags
taxonomy.core MCP plugin
All taxonomy operations are exposed through the taxonomy.core plugin, part of the unified genie-mcp-server. The plugin has 7 tools:
search
Semantic search across the taxonomy graph. Returns ranked nodes matching a query string, including related super-concepts and co-occurrence neighbors.
tag-content
Classifies a piece of content — post, idea, or media asset — against the taxonomy. Writes tag assignments back to Directus and activates the corresponding graph nodes.
map-term
Resolves a raw term or creator-specific phrase to its canonical taxonomy node. Handles synonyms, abbreviations, and platform-specific slang.
ingest-source
Ingests a new data source (scraped content, CSV, or URL) into the taxonomy pipeline. Tags are extracted, classified, and written to taxonomy_mappings.
rebuild-graph
Triggers a full taxonomy graph rebuild from the current state of taxonomy_mappings in Directus. Equivalent to running seed_taxonomy.mjs but available as an MCP tool call at runtime.
prune
Removes low-weight, low-frequency nodes from the graph. Runs after Hebbian decay to evict nodes that have not been activated recently and fall below the retention threshold.
strengthen
Manually reinforces an edge between two nodes — equivalent to a Hebbian activation without content input. Used by the nightly consolidation cycle when promoting cross-user pattern candidates.
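The interplay between strengthen, nightly decay, and prune can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's code: the update rule, the three constants, and the { id, weight } record shape are all hypothetical, and the same arithmetic is applied here to any weighted item, whether node or edge.

```javascript
// All constants are illustrative assumptions, not the project's values.
const LEARNING_RATE = 0.1;        // how far one reinforcement moves a weight toward 1
const DECAY_FACTOR = 0.9;         // nightly multiplier for items that were not activated
const RETENTION_THRESHOLD = 0.05; // items below this weight are pruned

// Hebbian-style reinforcement: move the weight toward 1 without exceeding it.
function strengthen(weight) {
  return weight + LEARNING_RATE * (1 - weight);
}

// One nightly pass: reinforce recently activated items, decay the rest.
function nightlyPass(items, activatedIds) {
  return items.map((item) => ({
    ...item,
    weight: activatedIds.has(item.id)
      ? strengthen(item.weight)
      : item.weight * DECAY_FACTOR,
  }));
}

// Evict items whose weight has fallen below the retention threshold.
function prune(items) {
  return items.filter((item) => item.weight >= RETENTION_THRESHOLD);
}

// Made-up example: one recently activated item, one dormant item.
let weighted = [
  { id: "yoga", weight: 0.5 },
  { id: "latex", weight: 0.05 },
];
weighted = prune(nightlyPass(weighted, new Set(["yoga"])));
console.log(weighted); // only the activated item survives this pass
```

Running prune after the decay pass, as the tool description states, means a dormant item gets one more night of decay before it is evaluated against the threshold.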
Synaptic propagation and retrieval
Tagging content is only the first step. The real value of the taxonomy graph is how it feeds the retrieval system. When a user query arrives, the retrieved seed nodes are used as starting points for synaptic propagation, a Leaky Integrate-and-Fire (LIF) neuron model that walks the graph outward from the seeds, activating adjacent nodes weighted by edge strength. A query about “beach yoga” will activate:
- The beach and yoga leaf nodes directly
- Their parent categories: Outdoor, Fitness
- Related super-concepts: Aesthetic_Lifestyle, Body_Expression
- Co-occurrence neighbors: nodes that frequently appear alongside yoga or outdoor content in the graph
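The propagation described above can be sketched as a simplified leaky integrate-and-fire walk. This is an illustration only, not the project's implementation: the adjacency-map shape, the LEAK and THRESHOLD constants, the fire-once refractory rule, the hiking tag, and every edge weight in the example are assumptions.

```javascript
// Illustrative constants; the real model's parameters are not documented here.
const LEAK = 0.8;      // fraction of potential a node retains each step
const THRESHOLD = 0.5; // potential at which a node fires

// graph: Map<nodeId, Array<{ to: string, weight: number }>> (assumed shape)
function propagateFromSeeds(graph, seeds, maxSteps = 3) {
  const potential = new Map();  // accumulated potential per not-yet-fired node
  const fired = new Set(seeds); // every node that has fired so far
  let firing = new Set(seeds);  // nodes firing in the current step

  for (let step = 0; step < maxSteps && firing.size > 0; step++) {
    // Leak: stored potential decays before new input arrives.
    for (const [id, v] of potential) potential.set(id, v * LEAK);

    // Integrate: each firing node adds its edge weight to its neighbors.
    for (const id of firing) {
      for (const { to, weight } of graph.get(id) ?? []) {
        if (fired.has(to)) continue; // simple refractory rule: fire only once
        potential.set(to, (potential.get(to) ?? 0) + weight);
      }
    }

    // Fire: nodes at or above threshold fire in the next step and reset.
    firing = new Set();
    for (const [id, v] of potential) {
      if (v >= THRESHOLD) {
        firing.add(id);
        fired.add(id);
        potential.delete(id);
      }
    }
  }
  return fired;
}

// Made-up weights for the "beach yoga" example.
const graph = new Map([
  ["yoga", [{ to: "Fitness", weight: 0.9 }, { to: "hiking", weight: 0.2 }]],
  ["beach", [{ to: "Outdoor", weight: 0.9 }]],
  ["Fitness", [{ to: "Body_Expression", weight: 0.6 }, { to: "Aesthetic_Lifestyle", weight: 0.3 }]],
  ["Outdoor", [{ to: "Aesthetic_Lifestyle", weight: 0.6 }]],
]);
const activated = propagateFromSeeds(graph, ["beach", "yoga"]);
console.log([...activated]);
```

In this toy graph the seeds activate their parent categories and then the super-concepts, while the weakly connected hiking node never reaches the threshold and its potential leaks away over successive steps.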
The synaptic propagation implementation lives in memory/retrieval/synaptic/, specifically propagate_from_seeds, strengthen_edge, and lif_neurons. The taxonomy graph in Nodes/Universe/taxonomy_graph.json is what it walks.
The graph format
The taxonomy graph is stored as a plain JSON file, backed by pgvector embeddings for dense similarity search. The JSON schema defines these edge types:
- contains: parent-to-child structural relationship
- co_occurrence: two nodes appear together frequently in real content
- hebbian: edges added or strengthened by Hebbian reinforcement over time
Node lifecycle
Nodes move through a three-tier hierarchy (User, Transitional, Universe) as they gain confidence:
1. Tag appears in content
A new tag surfaces in a creator’s content. The taxonomy.core plugin classifies it against taxonomy_mappings and activates the corresponding node in the creator’s user_nodes record in Directus.
2. Session promotion
After the session, confirmed activations are written to Nodes/User/{creator-uuid}/. The node now exists at the per-creator level with an initial weight.
3. Hebbian decay
Every night, memory/consolidation/hebbian/node-decay.mjs runs across all user nodes. Nodes that were activated recently have their weights increased. Nodes that have not been activated decay toward zero.
4. Cross-user promotion
When the same node pattern appears across multiple creator profiles, memory/consolidation/cross_user/fp_growth.mjs promotes the pattern to Nodes/Transitional/. This is currently in progress (sprint B7-3).
5. Universe promotion
After consolidation review, transitional nodes are merged into Nodes/Universe/taxonomy_graph.json, becoming part of the canonical taxonomy. The knowledge accretes.
Scripts for taxonomy management
All taxonomy build and maintenance tools live in scripts/taxonomy/:
| Script | Purpose |
|---|---|
| seed_taxonomy.mjs | Full rebuild from Directus taxonomy_mappings data. Regenerates taxonomy_graph.json. |
| process_dataset.py | Original extractor: processes staging CSV data, classifies tags, writes to Directus. |
| enforce_taxonomy.mjs | Validates the graph against the schema, flags orphaned nodes and broken edges. |
| reclassify.mjs | Reclassifies existing content against an updated taxonomy. Run after adding new super-concepts or restructuring categories. |
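To make the orphaned-node and broken-edge checks concrete, here is a minimal sketch of that kind of validation. It is not the contents of enforce_taxonomy.mjs. The graph shape (nodes with id, edges with source, target, and type) is a hypothetical stand-in for the real schema, and "orphaned" is assumed to mean a node that no valid edge touches.

```javascript
// Hypothetical graph shape: { nodes: [{ id }], edges: [{ source, target, type }] }.
// The real taxonomy_graph.json schema may differ.
function validateGraph(graph) {
  const ids = new Set(graph.nodes.map((node) => node.id));
  const connected = new Set();
  const brokenEdges = [];

  for (const edge of graph.edges) {
    if (ids.has(edge.source) && ids.has(edge.target)) {
      // Both endpoints exist: mark them as connected.
      connected.add(edge.source);
      connected.add(edge.target);
    } else {
      // At least one endpoint refers to a node that is not in the graph.
      brokenEdges.push(edge);
    }
  }

  // Assumed definition: an orphan is a node touched by no valid edge.
  const orphanedNodes = graph.nodes
    .map((node) => node.id)
    .filter((id) => !connected.has(id));

  return { orphanedNodes, brokenEdges };
}

// Made-up sample: one valid edge, one edge to a missing node, one orphan.
const sample = {
  nodes: [{ id: "Fitness" }, { id: "yoga" }, { id: "beach" }],
  edges: [
    { source: "Fitness", target: "yoga", type: "contains" },
    { source: "yoga", target: "missing_tag", type: "co_occurrence" },
  ],
};
console.log(validateGraph(sample));
```

Collecting problems into a report instead of throwing on the first one lets a single run surface every issue in the graph.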
Related
Synaptic propagation
How LIF neurons walk the taxonomy graph during retrieval
Hebbian consolidation
Nightly decay and cross-user pattern promotion
taxonomy.core MCP plugin
Full reference for all 7 taxonomy tools
JIT skill graph
The DuckDB skill graph and how skills are surfaced just-in-time