Athena’s memory is just Markdown files. Any text you can export becomes part of your memory.

Overview

Importing data into Athena is as simple as copying Markdown files into your .context/ directory. The next time you run /start, Athena automatically:
  • Scans for new or changed files
  • Generates embeddings for semantic search
  • Updates the TAG_INDEX for entity lookup
  • Makes the content available for retrieval
The import process is non-destructive — your original files are preserved as-is.
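The scan step can be pictured as a content-hash comparison: hash every Markdown file under `.context/` and flag anything whose hash differs from the last scan. A minimal sketch, assuming only the `.context/` layout described on this page (the real boot orchestrator's logic is not shown here and may differ):

```python
import hashlib
from pathlib import Path

def changed_files(root, previous_hashes):
    """Return {path: sha256} for Markdown files that are new or changed
    since the last scan. `previous_hashes` is the result of the prior call."""
    changed = {}
    for path in Path(root).rglob("*.md"):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if previous_hashes.get(str(path)) != digest:
            changed[str(path)] = digest
    return changed
```

Calling it twice in a row with the first result returns an empty dict, which is why re-running `/start` with no new files is cheap.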

Supported Sources

Exporting from ChatGPT

1. **Access Settings**: Go to Settings → Data Controls.

2. **Request Export**: Click “Export data” and wait for the download link (usually emailed within 24 hours).

3. **Extract Conversations**: Download the ZIP file and extract it. Look for conversations.json.

4. **Convert to Markdown**: Convert the JSON to Markdown files (one per conversation):

   ```bash
   # Use a converter script, or format the files manually
   python3 scripts/convert_chatgpt.py conversations.json --output .context/memories/imports/
   ```

5. **Clean up**: Remove system messages, timestamps, and formatting artifacts that don’t add meaningful context.

What to Keep

  • ✅ Core conversation content
  • ✅ Insights and decisions
  • ✅ Code examples and solutions
  • ❌ System metadata
  • ❌ Timestamps and user IDs
  • ❌ Formatting artifacts

Import Process

How It Works

1. **File Placement**: Copy files to .context/memories/imports/ or any subdirectory under .context/.

2. **Session Boot**: Run /start to trigger the boot orchestrator.

3. **Automatic Indexing**: Athena detects new/changed files and:
   - Generates vector embeddings (for semantic search)
   - Extracts tags (for TAG_INDEX)
   - Updates metadata

4. **Verification**: Verify that the files were detected:

   ```bash
   athena check
   ```
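What “generates vector embeddings” means in practice: each file’s text is mapped to a fixed-length numeric vector so that semantically similar text produces nearby vectors. A toy sketch using hashed bag-of-words counts (real systems, Athena presumably included, use a neural embedding model; this only illustrates the shape of the data):

```python
import hashlib

def toy_embedding(text, dims=64):
    """Toy embedding: hash each token into one of `dims` buckets, count
    occurrences, then L2-normalize. Illustrative only, not a real model."""
    vec = [0.0] * dims
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dims
        vec[idx] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]
```

Semantic search then reduces to comparing vectors (e.g., by cosine similarity) instead of matching keywords.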

Directory Structure

Organize imported content logically:
```
.context/
├── memories/
│   ├── imports/          ← Drop zone for new imports
│   ├── chatgpt/          ← ChatGPT conversations
│   ├── gemini/           ← Gemini chats
│   └── claude/           ← Claude transcripts
├── projects/             ← Project-specific context
├── research/             ← Research notes and papers
└── archives/             ← Historical data
```
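The layout can be scaffolded in one call; a small sketch using only the directory names shown in the tree above:

```python
from pathlib import Path

# Subdirectory names taken from the suggested tree above
SUBDIRS = ("memories/imports", "memories/chatgpt", "memories/gemini",
           "memories/claude", "projects", "research", "archives")

def scaffold_context(root=".context"):
    """Create the suggested .context/ layout (idempotent)."""
    for sub in SUBDIRS:
        Path(root, sub).mkdir(parents=True, exist_ok=True)
```

Because of `exist_ok=True`, re-running it against an existing `.context/` is harmless.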

Best Practices

- **Clean before import**: Remove timestamps, system messages, and formatting artifacts. Only import meaningful content.
- **Add tags**: Use #hashtags in imported files for better TAG_INDEX coverage.
- **Batch by topic**: Group related conversations in subdirectories (e.g., .context/memories/coding/).
- **Verify privacy**: Don’t import files containing API keys, credentials, or sensitive personal data.
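The tagging practice above assumes TAG_INDEX recognizes #hashtags in file bodies. How Athena actually parses them isn’t documented on this page, but hashtag extraction is commonly a single regex pass; a hedged sketch (the exact pattern is an assumption):

```python
import re

# Hashtag = '#' not preceded by a word character, followed by letters/digits/hyphens
TAG_RE = re.compile(r"(?<!\w)#([A-Za-z0-9][\w-]*)")

def extract_tags(markdown_text):
    """Return unique #hashtags found in a Markdown string, order preserved."""
    return list(dict.fromkeys(TAG_RE.findall(markdown_text)))
```

The negative lookbehind avoids false positives such as the `#` in `C#`.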

Cleaning Imported Data

For best results, clean up exported data before importing. This improves search quality and reduces noise.

What to Remove

❌ System metadata:
   "Assistant: I'm Claude, an AI assistant..."
   "User ID: 12345"
   
❌ Timestamps:
   "2024-01-15 14:23:45 UTC"
   
❌ Formatting artifacts:
   "<div class='message'>..."
   "[REDACTED]"
   
❌ Repetitive greetings:
   "How can I help you today?"
   "Is there anything else?"
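These removals can be automated with a line filter; a minimal sketch using regexes for the noise categories listed above (the patterns are illustrative and should be tuned to your export, since formats vary by provider):

```python
import re

# One pattern per noise category above; illustrative, not exhaustive
NOISE_PATTERNS = [
    re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ?(UTC)?$"),  # timestamps
    re.compile(r"^User ID: \d+$"),                                  # system metadata
    re.compile(r"^<[^>]+>.*$"),                                     # HTML artifacts
    re.compile(r"^(How can I help you today\?|Is there anything else\?)$"),
]

def clean_lines(text):
    """Drop lines matching any noise pattern; keep everything else."""
    kept = [line for line in text.splitlines()
            if not any(p.match(line.strip()) for p in NOISE_PATTERNS)]
    return "\n".join(kept)
```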

What to Keep

✅ Core insights:
   "The key difference between X and Y is..."
   
✅ Decisions made:
   "We decided to use PostgreSQL because..."
   
✅ Code examples:

   ```python
   def process_data():
       ...
   ```

✅ Useful references: “According to the docs at https://…”

---

## Verification

### Check Files Were Indexed

After running `/start`, verify your imports:

**Test Semantic Search**

Test that vector search finds your content:

```bash
python3 scripts/supabase_search.py "topic from imported file" --limit 5
```

You should see results from your imported files.

**Check TAG_INDEX**

Regenerate and check the index:

```bash
python3 scripts/generate_tag_index.py
grep -i "#your-tag" .context/TAG_INDEX.md
```

**Run Athena Check**

Run the diagnostic tool:

```bash
athena check
```

Look for:
- Files detected: X new files
- Embeddings generated: X files
- Tags extracted: X tags

Common Issues

**“Files not detected on /start”**

Cause: Files are not under .context/, or are excluded by .gitignore.

Fix:

```bash
# Check file location
ls -la .context/memories/imports/

# Check that .gitignore doesn't exclude .md files
grep -F "*.md" .gitignore
```

**“Search returns no results”**

Cause: Embeddings were not generated, or the Supabase sync is still pending.

Fix:

```bash
# Force sync to Supabase
python3 scripts/supabase_sync.py

# Wait a few minutes for indexing, then test the search again
```

**“Tags not appearing in TAG_INDEX”**

Cause: The index was not regenerated after import.

Fix:

```bash
python3 scripts/generate_tag_index.py
```

Advanced: Programmatic Import

For large-scale imports, automate the process:
```python
#!/usr/bin/env python3
"""Convert a ChatGPT export (conversations.json) to Markdown files."""
import json
import re
import sys
from pathlib import Path

def convert_chatgpt_export(json_path, output_dir):
    """Convert ChatGPT export JSON to one Markdown file per conversation."""
    with open(json_path) as f:
        data = json.load(f)

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    for conversation in data:
        title = conversation.get('title') or 'Untitled'
        messages = conversation.get('mapping', {})

        # Convert to Markdown
        md_content = f"# {title}\n\n"
        for msg_id, msg_data in messages.items():
            msg = msg_data.get('message') or {}
            role = msg.get('author', {}).get('role', '')
            # 'content' may be absent, and 'parts' may be missing or empty
            content_obj = msg.get('content')
            parts = content_obj.get('parts', []) if isinstance(content_obj, dict) else []
            content = parts[0] if parts and isinstance(parts[0], str) else ''
            if role and content:
                md_content += f"## {role.capitalize()}\n\n{content}\n\n"

        # Build a filesystem-safe filename from the title
        safe = re.sub(r'[^\w\- ]', '', title).strip().replace(' ', '-')[:50]
        (output_dir / f"{safe or 'untitled'}.md").write_text(md_content)

if __name__ == '__main__':
    convert_chatgpt_export(sys.argv[1], '.context/memories/imports/')
```

Usage:

```bash
python3 scripts/convert_chatgpt.py ~/Downloads/conversations.json
```

Security Considerations

Never import files containing:
  • API keys or credentials
  • .env files
  • Personal identifiable information (PII) you don’t want searchable
  • Proprietary code or trade secrets (unless in a private repo)

Privacy Filter

Before importing to a public repo, run the privacy scanner:

```bash
python3 .github/scripts/privacy_scan.py FILE_TO_IMPORT
```

If it flags issues, scrub the content or keep it in your private workspace.
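The scanner’s rules aren’t documented on this page, but a first-pass secret check is typically a regex sweep over the file. A hedged sketch with a few common credential shapes (illustrative patterns, not necessarily what privacy_scan.py uses):

```python
import re

# Common credential shapes (illustrative; real scanners use far larger rule sets)
SECRET_PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Generic API key assignment": re.compile(r"(?i)api[_-]?key\s*[=:]\s*\S+"),
    "Private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def find_secrets(text):
    """Return the names of secret patterns that match anywhere in `text`."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]
```

A non-empty result means the file should be scrubbed before it lands anywhere public.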

Next Steps

- **Semantic Search**: Learn how Athena retrieves your imported content using triple-path search.
- **Best Practices**: Operational discipline for maintaining your knowledge base.
