Athena uses Triple-Path Retrieval to ensure no relevant context is missed. Each search method catches what the others miss.

The Architecture

                               USER QUERY


               ┌───────────────────┴───────────────────┐
               │        TRIPLE-PATH RETRIEVAL          │
               └───────────────────┬───────────────────┘

         ┌─────────────────────────┼─────────────────────────┐
         │                         │                         │
         ▼                         ▼                         ▼
 ┌───────────────┐        ┌───────────────┐        ┌───────────────┐
 │    PATH 1     │        │    PATH 2     │        │    PATH 3     │
 │               │        │               │        │               │
 │  🔮 VECTOR    │        │  🏷️ TAG      │        │  🔎 KEYWORD   │
 │   SEARCH      │        │   INDEX       │        │    GREP       │
 │               │        │               │        │               │
 │  (Semantic)   │        │  (Hashtags)   │        │  (Exact)      │
 └───────┬───────┘        └───────┬───────┘        └───────┬───────┘
         │                         │                         │
         ▼                         ▼                         ▼
  "decentralized"          "#leadership"           "Protocol 139"
  → finds related           → finds tagged         → finds exact
    concepts                   entities               matches
         │                         │                         │
         └─────────────────────────┼─────────────────────────┘


                         ┌─────────────────┐
                         │  MERGED CONTEXT │
                         └─────────────────┘

Why Three Paths?

Each retrieval method has strengths and blind spots:
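| Path         | Catches                         | Misses                |
|--------------|---------------------------------|-----------------------|
| Vector       | Synonyms, paraphrases, concepts | Exact names, entities |
| TAG_INDEX    | Explicitly tagged entities      | Untagged content      |
| Keyword Grep | Exact string matches            | Semantic variations   |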

Example: Searching for an Entity

Query: “Find information about Protocol 139”
  • Vector search returns:
    • Documents about “decentralized leadership” (semantically related)
    • Files discussing “command structure” (conceptually similar)
  • TAG_INDEX returns:
    • #leadership → protocols/139-decentralized-command.md (exact entity match)
    • Files explicitly tagged with #protocol-139
  • Keyword grep returns:
    • Any file containing the literal string “Protocol 139”
    • Recent uncommitted files not yet in Supabase
Result: The combination finds the protocol file directly (TAG_INDEX), related concepts (Vector), and any recent mentions (Grep).
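
The page does not describe how the three result sets are combined, so the following is only a minimal sketch of one plausible merge step. It assumes each path returns a list of file paths, de-duplicates them, and sorts files surfaced by more paths first. The `Hit` class, the `merge_results` function, and the file names `docs/command-structure.md` and `notes/draft.md` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    """One retrieved file plus the paths that surfaced it (hypothetical structure)."""
    path: str
    sources: list[str] = field(default_factory=list)


def merge_results(results_by_path: dict[str, list[str]]) -> list[Hit]:
    """Merge per-path file lists into one de-duplicated list.

    Files found by more retrieval paths sort first; ties keep first-seen order.
    """
    merged: dict[str, Hit] = {}
    for source, files in results_by_path.items():
        for file_path in files:
            hit = merged.setdefault(file_path, Hit(path=file_path))
            if source not in hit.sources:
                hit.sources.append(source)
    # sorted() is stable, so first-seen order breaks ties.
    return sorted(merged.values(), key=lambda h: -len(h.sources))


if __name__ == "__main__":
    hits = merge_results({
        "vector": ["docs/command-structure.md", "protocols/139-decentralized-command.md"],
        "tag_index": ["protocols/139-decentralized-command.md"],
        "grep": ["protocols/139-decentralized-command.md", "notes/draft.md"],
    })
    for hit in hits:
        print(f"{hit.path}  <- {', '.join(hit.sources)}")
```

Sorting by the number of agreeing paths is a design choice of this sketch, not something the source specifies; a real merge might weight the vector similarity score instead.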

Path 1: Vector Search

How It Works

1. Query embedding

Your search query is converted to a 3072-dimension embedding using the Gemini API.

2. Similarity search

Cosine similarity search runs across 11 Supabase tables containing your workspace content.

3. Ranked results

The top matches are returned, ranked by semantic similarity score.
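
To make the similarity and ranking steps concrete, here is a small pure-Python sketch of cosine similarity and top-N ranking. It is not the code in `supabase_search.py`: in Athena the comparison runs inside Supabase, and the function names, file names, and toy 3-dimension vectors below are assumptions standing in for the real 3072-dimension embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def top_matches(query_vec, documents, limit=5):
    """Rank (name, vector) pairs by similarity to query_vec, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in documents]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    # Toy vectors; real embeddings would come from the embedding API.
    docs = [
        ("health-monitoring.md", [0.9, 0.1, 0.0]),
        ("tax-notes.md", [0.0, 0.2, 0.9]),
    ]
    for name, score in top_matches([1.0, 0.0, 0.0], docs, limit=2):
        print(f"{score:.3f}  {name}")
```

The `limit` parameter mirrors the `--limit` flag shown under Usage.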

Usage

python3 scripts/supabase_search.py "<query>" --limit 5

Strengths

Conceptual matching

Finds “fitness tracking” when you search for “health monitoring”

Synonym handling

Understands “automobile” and “car” are related

Paraphrase detection

Matches different phrasings of the same idea

Context awareness

Ranks by relevance to your query context

Limitations

  • May miss exact entity names (“John Smith” vs “J. Smith”)
  • Requires content to be synced to Supabase
  • Can’t find content created seconds ago

Path 2: TAG_INDEX Lookup

How It Works

1. Tag extraction

The generate_tag_index.py script scans all workspace files for:
  • Inline #hashtags in Markdown
  • YAML frontmatter tags
  • Protocol IDs and entity names

2. Reverse index creation

Creates a lookup table: #tag → [file1, file2, ...]

3. Instant lookup

When you search for a tagged entity, grep the TAG_INDEX for immediate results.
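
The extraction and reverse-index steps can be sketched as follows. This is an illustrative approximation, not the contents of `generate_tag_index.py`: it works on in-memory text instead of scanning directories, handles only the one-line `tags: [a, b]` frontmatter form, and does not attempt protocol ID or entity-name extraction. The regular expressions and function names are assumptions.

```python
import re
from collections import defaultdict

# "#" at line start or after whitespace, followed by a letter,
# so "# Heading" and URL fragments like ".../#frag" are skipped.
HASHTAG_RE = re.compile(r"(?<!\S)#([A-Za-z][\w-]*)")
# Simplification: only the one-line form `tags: [a, b]` is recognized.
FRONTMATTER_TAGS_RE = re.compile(r"^tags:\s*\[(.*?)\]\s*$", re.MULTILINE)


def extract_tags(text: str) -> set[str]:
    """Return lowercase tags from inline hashtags and a `tags: [...]` frontmatter line."""
    tags = {m.lower() for m in HASHTAG_RE.findall(text)}
    if text.startswith("---"):
        end = text.find("\n---", 3)
        if end != -1:
            front = text[3:end]
            for match in FRONTMATTER_TAGS_RE.findall(front):
                for raw in match.split(","):
                    tag = raw.strip().strip("'\"").lstrip("#").lower()
                    if tag:
                        tags.add(tag)
    return tags


def build_tag_index(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each tag to the sorted list of file paths that mention it."""
    index: dict[str, set[str]] = defaultdict(set)
    for path, text in files.items():
        for tag in extract_tags(text):
            index[tag].add(path)
    return {tag: sorted(paths) for tag, paths in sorted(index.items())}


if __name__ == "__main__":
    sample = {
        "protocols/139-decentralized-command.md": "---\ntags: [leadership]\n---\n# Protocol 139\n",
        "docs/VECTORRAG.md": "Notes on #vectorrag and #leadership.",
    }
    for tag, paths in build_tag_index(sample).items():
        print(f"| #{tag} | {', '.join(paths)} |")
```

The demo prints one `| #tag | files |` row per tag, the same row shape as the TAG_INDEX example output.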

Example Output

| #leadership | protocols/139-decentralized-command.md |
| #archetype  | user_profile/Archetype_Example.md |
| #vectorrag  | docs/VECTORRAG.md, docs/SEMANTIC_SEARCH.md |

Usage

grep -i "<entity>" .context/TAG_INDEX.md

Generating the Index

python3 scripts/generate_tag_index.py
Current stats (as of Feb 2026):
  • 1000+ tags indexed
  • 4 directories scanned (.context/, .agent/, examples/protocols/, user_profile/)
  • Extraction methods: YAML frontmatter + inline hashtags

Strengths

Instant lookup

No network round trip for tagged entities: a single grep over a local file

Exact matches

No false positives for entity names

Manual curation

You control what gets tagged and how

No API costs

Pure filesystem operation

Limitations

  • Only finds explicitly tagged content
  • Requires manual tagging discipline
  • Misses untagged but relevant files

Path 3: Keyword Grep

How It Works

Simple string matching across all files:
grep -ri "<keyword>" .context/ .agent/
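
For callers that want the same behavior from Python rather than the shell, the following is a rough equivalent of the case-insensitive recursive grep above. It is a sketch under stated assumptions: the function name `grep_literal` is hypothetical, files are read as UTF-8, and files that cannot be decoded are skipped silently.

```python
from pathlib import Path


def grep_literal(keyword: str, roots: list[str]) -> list[tuple[str, int, str]]:
    """Case-insensitive literal search; returns (path, line_number, line) tuples."""
    needle = keyword.casefold()
    matches = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # Skip roots that do not exist.
        for path in sorted(root_path.rglob("*")):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # Skip binary or unreadable files.
            for number, line in enumerate(text.splitlines(), start=1):
                if needle in line.casefold():
                    matches.append((str(path), number, line.strip()))
    return matches


if __name__ == "__main__":
    for path, number, line in grep_literal("Protocol 139", [".context", ".agent"]):
        print(f"{path}:{number}: {line}")
```

Because it reads the filesystem directly, this finds files that have not been synced yet, which is the property Path 3 relies on.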

Strengths

Zero false negatives

If the literal string exists in the searched directories, grep finds it

Finds new files

Catches content not yet synced to Supabase

Exact phrases

Finds literal matches like “Protocol 139”

No dependencies

Works offline, no API or database needed

Limitations

  • No semantic understanding
  • High false positive rate for common words
  • Case-sensitive by default (use the -i flag)

When to Use Each Path

Use: Vector Search (primary)
Examples:
  • “What did we discuss about X?”
  • “Show me files related to leadership”
  • “Find documents about decentralized systems”
Why: Semantic matching finds conceptually related content even with different wording.

The Search Protocol (§0.7.1)

Per Core Identity, every query triggers semantic context retrieval:
1. Vector Search (Always)

python3 scripts/supabase_search.py "<query>" --limit 5
Runs first to capture semantic context.

2. Entity Lookup (Conditional)

grep -i "<entity_name>" .context/TAG_INDEX.md
Runs if named entities are detected in the query.

3. Fallback Grep (As Needed)

grep -ri "<keyword>" .context/ .agent/
Runs if the methods above return sparse results.
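
The control flow of the three steps above might be sketched like this. Everything here is an assumption for illustration: the three search backends are passed in as plain callables, the entity detector is a crude regular expression (a hashtag, or a capitalized word followed by a number), and "sparse" is interpreted as fewer than `min_results` unique files. The source does not define any of these.

```python
import re
from typing import Callable

Search = Callable[[str], list[str]]

# Hypothetical heuristic: a #hashtag, or a capitalized word plus a number ("Protocol 139").
ENTITY_RE = re.compile(r"#[\w-]+|\b[A-Z][a-z]+ \d+\b")


def detect_entities(query: str) -> list[str]:
    """Return substrings of the query that look like named entities."""
    return ENTITY_RE.findall(query)


def run_search_protocol(
    query: str,
    vector_search: Search,
    tag_lookup: Search,
    keyword_grep: Search,
    min_results: int = 3,
) -> list[str]:
    """Run the three steps in order and return de-duplicated file paths."""
    results = list(vector_search(query))      # Step 1: always runs.
    for entity in detect_entities(query):     # Step 2: only when entities are detected.
        results.extend(tag_lookup(entity))
    if len(set(results)) < min_results:       # Step 3: fallback when results are sparse.
        results.extend(keyword_grep(query))
    return list(dict.fromkeys(results))       # De-duplicate, keeping first-seen order.


if __name__ == "__main__":
    found = run_search_protocol(
        "Find information about Protocol 139",
        vector_search=lambda q: ["docs/command-structure.md"],
        tag_lookup=lambda e: ["protocols/139-decentralized-command.md"],
        keyword_grep=lambda q: ["notes/draft.md"],
    )
    print(found)
```

Injecting the backends as callables keeps the sketch independent of Supabase and the filesystem; in practice they would wrap the three commands shown in the steps.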

Performance Comparison

Before Triple-Path (Vector Only)

| Scenario               | Result                     |
|------------------------|----------------------------|
| Search entity name     | ❌ Missed related protocol |
| Search archetype       | ❌ Missed profile file     |
| Search “decentralized” | ✅ Found semantically      |
| New unsynced file      | ❌ Not in Supabase yet     |

After Triple-Path

| Scenario               | Result                   |
|------------------------|--------------------------|
| Search entity name     | ✅ Found via TAG_INDEX   |
| Search archetype       | ✅ Found via TAG_INDEX   |
| Search “decentralized” | ✅ Still works (Vector)  |
| New unsynced file      | ✅ Found via grep        |

Best Practices

Tag important entities

Add #tags to protocols, workflows, and key documents for instant lookup.

Sync regularly

Run supabase_sync.py to keep your vector embeddings current.

Regenerate TAG_INDEX

Run generate_tag_index.py after adding new files or tags.

Use all three paths

For important searches, don’t rely on just one method.

Next Steps

Multi-Model Strategy

Learn how to route different tasks to different AI models

Importing Data

Bring existing knowledge into your Athena workspace
