Introduction

What is TypeAgent?

TypeAgent is a Python library that implements Structured RAG (Retrieval Augmented Generation) for conversation-based knowledge processing. It enables you to ingest messages, emails, transcripts, and other conversational content, then query that content using natural language with high precision and recall.

TypeAgent is a Pythonic port of TypeAgent KnowPro, which was originally written in TypeScript and developed by Microsoft Research.

Why TypeAgent?

Traditional RAG systems rely primarily on semantic similarity search, which can miss important context or return irrelevant results. TypeAgent takes a different approach:

Structured Knowledge Extraction: Automatically extracts entities, topics, actions, and relationships from your content
Multi-Index Architecture: Uses six specialized indexes for different query patterns (semantic refs, properties, timestamps, embeddings, related terms, and conversation threads)
Parallel Search: Executes queries across multiple indexes simultaneously, then fuses and ranks the results
Graceful Degradation: Works with basic extraction when AI models are unavailable
Provider Agnostic: Supports OpenAI, Azure OpenAI, and other providers via pydantic-ai

Key Benefits

High Precision

Structured knowledge extraction lets queries match on extracted entities, topics, actions, and relationships rather than on text similarity alone

Flexible Storage

Choose between in-memory storage for fast prototyping or SQLite for persistent data

Easy Integration

Simple async API with just two main operations: add_messages_with_indexing() and query()

Production Ready

Built on proven TypeChat and pydantic-ai foundations with comprehensive type annotations

How It Works

Ingest Content

Add messages, emails, or transcripts to a conversation using add_messages_with_indexing()

Extract Knowledge

TypeAgent automatically extracts entities, topics, actions, and relationships using AI models

Build Indexes

Content is indexed across six specialized indexes for fast, accurate retrieval

Query Naturally

Ask questions in natural language using the query() method

Get Answers

Receive AI-generated answers grounded in your indexed content with proper citations
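
The five steps above can be sketched with a toy stand-in. ToyConversation below is hypothetical and is not TypeAgent's class: only the method names add_messages_with_indexing() and query() come from this page, and their real signatures and return types may differ. Extraction is reduced to word indexing instead of an AI model, and the "answer" is simply the best-matching messages, each followed by its ordinal as a citation.

```python
import asyncio
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ToyConversation:
    """Hypothetical stand-in that mimics the ingest/query shape, not the real class."""

    messages: list[str] = field(default_factory=list)
    # term -> ordinals of the messages that mention it (a single toy index)
    term_index: dict[str, set[int]] = field(default_factory=lambda: defaultdict(set))

    async def add_messages_with_indexing(self, new_messages: list[str]) -> int:
        """Store messages and index every word; returns how many were added."""
        for text in new_messages:
            ordinal = len(self.messages)
            self.messages.append(text)
            for term in re.findall(r"[a-z0-9]+", text.lower()):
                self.term_index[term].add(ordinal)
        return len(new_messages)

    async def query(self, question: str) -> str:
        """Return the best-matching messages, each tagged with its ordinal."""
        hits: dict[int, int] = defaultdict(int)
        for term in re.findall(r"[a-z0-9]+", question.lower()):
            for ordinal in self.term_index.get(term, ()):
                hits[ordinal] += 1
        if not hits:
            return "No answer found."
        best = max(hits.values())
        cited = sorted(o for o, count in hits.items() if count == best)
        return " ".join(f"{self.messages[o]} [{o}]" for o in cited)


async def main() -> str:
    conversation = ToyConversation()
    await conversation.add_messages_with_indexing(
        [
            "Alice approved the launch budget.",
            "Bob will draft the release notes.",
        ]
    )
    return await conversation.query("Who approved the budget?")


answer = asyncio.run(main())
print(answer)
```

Both operations are coroutines, so the caller awaits them inside an event loop; that much matches the async API this page describes.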

Core Architecture

TypeAgent’s architecture consists of four main components:

1. Storage Layer

Dual-implementation pattern with identical APIs:

MemoryStorageProvider: Fast in-memory storage for prototyping
SqliteStorageProvider: Persistent SQLite-backed storage for production
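
As a minimal sketch of the dual-implementation pattern: two backends satisfy one interface, so calling code does not care which one it receives. The ToyStorage protocol and its append/get/size methods are assumptions made for illustration; the real MemoryStorageProvider and SqliteStorageProvider interfaces are not shown on this page and are likely richer.

```python
import sqlite3
from typing import Protocol


class ToyStorage(Protocol):
    """Hypothetical minimal interface shared by both backends."""

    def append(self, text: str) -> int: ...
    def get(self, ordinal: int) -> str: ...
    def size(self) -> int: ...


class ToyMemoryStorage:
    """In-memory backend: a plain list, lost when the process exits."""

    def __init__(self) -> None:
        self._rows: list[str] = []

    def append(self, text: str) -> int:
        self._rows.append(text)
        return len(self._rows) - 1

    def get(self, ordinal: int) -> str:
        return self._rows[ordinal]

    def size(self) -> int:
        return len(self._rows)


class ToySqliteStorage:
    """SQLite backend: same methods, rows persisted in a database file."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(ordinal INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )

    def append(self, text: str) -> int:
        ordinal = self.size()
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT INTO messages (ordinal, text) VALUES (?, ?)", (ordinal, text)
            )
        return ordinal

    def get(self, ordinal: int) -> str:
        row = self._db.execute(
            "SELECT text FROM messages WHERE ordinal = ?", (ordinal,)
        ).fetchone()
        if row is None:
            raise IndexError(ordinal)
        return row[0]

    def size(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM messages").fetchone()[0]


def ingest(storage: ToyStorage, texts: list[str]) -> list[int]:
    """Works with either backend because both expose the same methods."""
    return [storage.append(text) for text in texts]


memory_ordinals = ingest(ToyMemoryStorage(), ["hello", "world"])
sqlite_store = ToySqliteStorage(":memory:")  # pass a file path to persist
sqlite_ordinals = ingest(sqlite_store, ["hello", "world"])
```

Swapping the backend changes only the constructor call, which is what makes prototyping in memory and later moving to SQLite inexpensive.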

2. Knowledge Extractor

Multi-mode content processing:

Basic Mode: Rule-based extraction from metadata and structure
AI Mode: LLM-powered entity and relationship extraction
Hybrid Enhancement: Combines both approaches for best results
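
The three modes can be illustrated with a small sketch of graceful degradation. Everything here is hypothetical: the ToyMessage shape, the function names, and the idea that "hybrid" means the union of both result sets are assumptions for illustration, and fake_ai is a hard-coded stand-in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ToyMessage:
    """Hypothetical message shape with a little metadata."""

    sender: str
    recipients: tuple[str, ...]
    text: str


def basic_extract(message: ToyMessage) -> set[str]:
    """Basic mode: entities come only from metadata, no model involved."""
    return {message.sender, *message.recipients}


def extract(
    message: ToyMessage,
    ai_extract: Optional[Callable[[ToyMessage], set[str]]] = None,
) -> set[str]:
    """Hybrid when a model is available, basic otherwise."""
    entities = basic_extract(message)
    if ai_extract is None:
        return entities
    try:
        return entities | ai_extract(message)
    except Exception:
        # Degrade to basic results; a real system would catch narrower errors.
        return entities


def fake_ai(message: ToyMessage) -> set[str]:
    """Stand-in for an LLM call: looks for two hard-coded names in the text."""
    return {name for name in ("Contoso", "Q4") if name in message.text}


def broken_ai(message: ToyMessage) -> set[str]:
    """Simulates an unreachable model endpoint."""
    raise ConnectionError("model endpoint unavailable")


msg = ToyMessage("alice", ("bob",), "Please send Contoso the Q4 report.")
hybrid = extract(msg, fake_ai)
degraded = extract(msg, broken_ai)
basic_only = extract(msg)
```

When the model call fails, the caller still receives the metadata-derived entities, so ingestion can continue with reduced detail instead of stopping.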

3. Query Pipeline

Structured RAG system with:

Natural language query translation
Parallel multi-index search
Result fusion and ranking
Context-aware answer generation
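
The parallel search and fusion stages can be sketched as follows. This page does not say which fusion method TypeAgent uses, so the sketch swaps in reciprocal rank fusion (RRF), a common general-purpose technique, purely as a stand-in. The index contents, message ids, and function names are made up for illustration.

```python
import asyncio
from collections import defaultdict

# Hypothetical per-index rankings: message ids, best match first.
TOY_INDEXES: dict[str, list[str]] = {
    "properties": ["m2", "m1"],
    "timestamps": ["m2", "m3"],
    "embeddings": ["m1", "m2", "m3"],
}


async def search_index(name: str, query: str) -> list[str]:
    """Pretend lookup; the toy ignores the query and returns a fixed ranking."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return TOY_INDEXES[name]


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Score each id by the sum of 1 / (k + rank) over all rankings."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


async def search_all(query: str) -> list[str]:
    """Query every index concurrently, then fuse the rankings into one list."""
    rankings = await asyncio.gather(
        *(search_index(name, query) for name in TOY_INDEXES)
    )
    return reciprocal_rank_fusion(list(rankings))


fused = asyncio.run(search_all("budget decisions"))
print(fused)
```

Here m2 ranks first because it appears near the top of all three toy indexes, which is the behavior a fusion step is meant to reward.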

4. Integration Layer

Unified conversation interface:

ConversationBase: Main entry point for all operations
create_conversation(): Factory function for easy setup
Async-first API design

Use Cases

Email Analysis

Ingest your email archive and ask questions like “What decisions were made about the product launch?” or “Who discussed the budget in Q4?”

Meeting Transcripts

Process meeting transcripts and query action items, decisions, or topics discussed by specific participants

Customer Support

Index support conversations and quickly find similar issues, resolutions, or expertise by team members

Research & Analysis

Analyze podcast transcripts, interviews, or research conversations to extract insights and themes

What’s Next?

Installation

Get TypeAgent installed with pip, uv, or poetry

Quickstart

Build your first TypeAgent application in minutes

TypeAgent sends input to LLM providers for processing. Do not use it to index confidential information unless you control the LLM endpoint.

Project Status

TypeAgent is an experimental prototype under active development. The API is stabilizing but may change in future releases. It is suitable for:

Research and experimentation
Proof-of-concept applications
Non-confidential data processing
Learning Structured RAG patterns

For production use with sensitive data, ensure you use a private LLM deployment.
