Overview
This tutorial walks you through building a complete TypeAgent application in two parts:
Ingesting messages into a conversation database
Querying the indexed content using natural language
By the end, you’ll understand the core TypeAgent workflow and be ready to build more complex applications.
Prerequisites
Before starting, ensure you have:
Python 3.12 or later installed
An OpenAI API key (create one in the OpenAI dashboard)
TypeAgent installed (pip install typeagent)
If you need help with installation, see the Installation Guide.
Part 1: Ingesting Messages
Let’s build a simple program that ingests conversation messages and indexes them for querying.
Create Sample Data
Create a file named testdata.txt with sample conversation content:

```
STEVE We should really make a Python library for Structured RAG.
UMESH Who would be a good person to do the Python library?
GUIDO I volunteer to do the Python library. Give me a few months.
```
Each line follows the format: SPEAKER message text
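In Python, str.split with a maxsplit of 1 separates the speaker token from the rest of the line:

```python
# Split a transcript line on the first run of whitespace only.
line = "STEVE We should really make a Python library for Structured RAG."
speaker, text = line.split(None, 1)
print(speaker)  # STEVE
print(text)     # We should really make a Python library for Structured RAG.
```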
Create Ingestion Script
Create a file named ingest.py:

```python
from dotenv import load_dotenv

from typeagent import create_conversation
from typeagent.transcripts.transcript import (
    TranscriptMessage,
    TranscriptMessageMeta,
)

load_dotenv()  # Load API keys from .env file


def read_messages(filename) -> list[TranscriptMessage]:
    """Parse text file into TranscriptMessage objects."""
    messages: list[TranscriptMessage] = []
    with open(filename, "r") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # Skip blank lines
            # Split line into speaker and text
            speaker, text_chunk = line.split(None, 1)
            message = TranscriptMessage(
                text_chunks=[text_chunk],
                metadata=TranscriptMessageMeta(speaker=speaker),
            )
            messages.append(message)
    return messages


async def main():
    # Create a conversation with SQLite storage
    conversation = await create_conversation("demo.db", TranscriptMessage)

    # Read and index messages
    messages = read_messages("testdata.txt")
    print(f"Indexing {len(messages)} messages...")

    # Add messages with automatic knowledge extraction and indexing
    results = await conversation.add_messages_with_indexing(messages)
    print(f"Indexed {results.messages_added} messages.")
    print(f"Got {results.semrefs_added} semantic refs.")


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
```
Set Up Environment
Create a .env file with your OpenAI credentials:

```
OPENAI_API_KEY=your-api-key-here
OPENAI_MODEL=gpt-4o
```
Never commit your .env file to version control! Add it to .gitignore.
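Under the hood, load_dotenv simply reads KEY=VALUE pairs from the file into os.environ. A minimal stdlib-only stand-in (illustrative only, not the actual python-dotenv implementation) looks like this:

```python
import os


def load_env_file(path: str) -> None:
    """Parse KEY=VALUE lines into os.environ, skipping blanks and comments."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Existing environment variables win, matching load_dotenv's default.
            os.environ.setdefault(key.strip(), value.strip())
```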
Run Ingestion
Execute the ingestion script:

```shell
python ingest.py
```

Expected output:

```
0.027s -- Using OpenAI
Indexing 3 messages...
Indexed 3 messages.
Got 24 semantic refs.
```
The demo.db file now contains your indexed conversation! TypeAgent extracted entities, topics, and relationships automatically.
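Because demo.db is an ordinary SQLite file, you can peek at what was created using only the standard library. The helper below is generic; the actual table names and schema are internal to TypeAgent:

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite database file."""
    with sqlite3.connect(db_path) as con:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
        return [name for (name,) in rows]


# e.g. list_tables("demo.db") shows the tables TypeAgent created
```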
Understanding the Ingestion Code
Let’s break down the key components: creating the conversation, the message structure, and indexing the messages.

Creating a Conversation

```python
# Create conversation with SQLite persistence
conversation = await create_conversation("demo.db", TranscriptMessage)

# For in-memory testing (no persistence):
conversation = await create_conversation(None, TranscriptMessage)
```
Part 2: Querying the Conversation
Now let’s query the indexed content using natural language.
Create Query Script
Create a file named query.py:

```python
from dotenv import load_dotenv

from typeagent import create_conversation
from typeagent.transcripts.transcript import TranscriptMessage

load_dotenv()


async def main():
    # Connect to the existing conversation database
    conversation = await create_conversation("demo.db", TranscriptMessage)

    # Ask a question
    question = "Who volunteered to do the python library?"
    print("Q:", question)

    # Query using natural language
    answer = await conversation.query(question)
    print("A:", answer)


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
```
Run Query
Execute the query script:

```shell
python query.py
```

Expected output:

```
0.019s -- Using OpenAI
Q: Who volunteered to do the python library?
A: Guido volunteered to do the Python library.
```
The answer is generated from your indexed content, not from the LLM’s training data!
Understanding Query Results
The query() method:
Translates your natural language question into structured searches
Queries multiple indexes in parallel (entities, topics, semantic similarity)
Fuses results and ranks them by relevance
Generates an answer grounded in your indexed content
If no relevant content is found, the response starts with "No answer found:"
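The fan-out-and-fuse step can be illustrated with a toy sketch. The index names, scores, and fusion rule below are made up for illustration; TypeAgent's real pipeline is internal:

```python
import asyncio


async def search_index(name: str, question: str) -> list[tuple[str, float]]:
    # Stand-in for a real index lookup; returns (hit, score) pairs.
    fake = {
        "entities": [("Guido", 0.9)],
        "topics": [("Python library", 0.7)],
        "similarity": [("Guido", 0.6), ("Structured RAG", 0.4)],
    }
    await asyncio.sleep(0)  # simulate asynchronous I/O
    return fake[name]


async def fused_search(question: str) -> list[tuple[str, float]]:
    # Query all indexes concurrently.
    results = await asyncio.gather(
        *(search_index(n, question) for n in ("entities", "topics", "similarity"))
    )
    # Fuse: sum scores per hit, then rank by total relevance.
    scores: dict[str, float] = {}
    for hits in results:
        for hit, score in hits:
            scores[hit] = scores.get(hit, 0.0) + score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranked = asyncio.run(fused_search("Who volunteered?"))
# "Guido" ranks first because it scored in two indexes (0.9 + 0.6)
```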
Complete Working Example
Here’s an interactive demo that combines ingestion and querying:
```python
#!/usr/bin/env python3
import asyncio

from dotenv import load_dotenv

from typeagent import create_conversation
from typeagent.transcripts.transcript import TranscriptMessage, TranscriptMessageMeta

load_dotenv()


async def main():
    """Interactive TypeAgent demo."""
    print("Creating conversation...")
    conv = await create_conversation(
        None,  # In-memory for this demo
        TranscriptMessage,
        name="Demo Conversation",
    )

    # Add sample messages about Python
    messages = [
        TranscriptMessage(
            text_chunks=["Welcome to the Python programming tutorial."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["Today we'll learn about async/await in Python."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["Python is a great language for beginners and experts alike."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=["The async keyword is used to define asynchronous functions."],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
        TranscriptMessage(
            text_chunks=[
                "You use await to wait for asynchronous operations to complete."
            ],
            metadata=TranscriptMessageMeta(speaker="Instructor"),
        ),
    ]

    print("Adding messages and building indexes...")
    result = await conv.add_messages_with_indexing(messages)
    print(f"Conversation ready with {await conv.messages.size()} messages.")
    print(
        f"Added {result.messages_added} messages, {result.semrefs_added} semantic refs"
    )
    print()

    # Interactive query loop
    print("You can now ask questions about the conversation.")
    print("Type 'quit' or 'exit' to stop.\n")
    while True:
        try:
            question: str = input("typeagent> ")
            if not question.strip():
                continue
            if question.strip().lower() in ("quit", "exit", "q"):
                break
            # Query the conversation
            answer: str = await conv.query(question)
            print(answer)
            print()
        except EOFError:
            print()
            break
        except KeyboardInterrupt:
            print("\nExiting...")
            break


if __name__ == "__main__":
    asyncio.run(main())
```
Run it:
Try questions like:
“What is this tutorial about?”
“What does the async keyword do?”
“Who is the speaker?”
Advanced: Working with Different Message Types
TypeAgent supports several message types beyond simple transcripts.

ConversationMessage

The base message type, with full flexibility:

```python
from typeagent.knowpro.universal_message import (
    ConversationMessage,
    ConversationMessageMeta,
)

message = ConversationMessage(
    text_chunks=["First part", "Second part"],
    tags=["important", "meeting"],
    timestamp="2026-03-06T15:30:00Z",
    metadata=ConversationMessageMeta(
        speaker="Alice",
        recipients=["Bob", "Charlie"],
    ),
)
```

TranscriptMessage

An alias for ConversationMessage, optimized for transcripts:

```python
from typeagent.transcripts.transcript import (
    TranscriptMessage,
    TranscriptMessageMeta,
)

message = TranscriptMessage(
    text_chunks=["Spoken text here"],
    metadata=TranscriptMessageMeta(speaker="Speaker"),
)
```

Multiple Text Chunks

Split long messages into chunks for better processing:

```python
message = TranscriptMessage(
    text_chunks=[
        "First paragraph of the message.",
        "Second paragraph continues.",
        "Final thoughts in the third chunk.",
    ],
    metadata=TranscriptMessageMeta(speaker="Alice"),
)
```
Database Storage Options
TypeAgent offers flexible storage backends. Pass a file path (any location you like) for persistent SQLite storage, or None for a fast in-memory store:

```python
# SQLite (persistent): data survives across program runs
conv = await create_conversation("my_data.db", TranscriptMessage)

# In-memory (fast): nothing is written to disk
conv = await create_conversation(None, TranscriptMessage)
```
Query Patterns
Here are common query patterns that work well with TypeAgent:
```python
# Find what specific people said or did
await conv.query("What did Guido say about Python?")
await conv.query("Who volunteered for the project?")

# Search by subject or theme
await conv.query("What was discussed about async programming?")
await conv.query("Tell me about the budget conversation")

# Find decisions, actions, or commitments
await conv.query("What decisions were made?")
await conv.query("Who agreed to lead the initiative?")

# Time-based filtering (requires timestamp metadata)
await conv.query("What happened in March?")
await conv.query("Recent discussions about the API")
```
Error Handling
Handle common scenarios gracefully:
```python
async def safe_query(conversation, question: str) -> str | None:
    """Query with error handling."""
    try:
        answer = await conversation.query(question)
        # Check for "no answer" responses
        if answer.startswith("No answer found:"):
            print(f"Could not find relevant information for: {question}")
            return None
        return answer
    except Exception as e:
        print(f"Query failed: {e}")
        return None
```
Best Practices

Batch Indexing: add multiple messages at once for better performance:

```python
# Good: single batched call
results = await conv.add_messages_with_indexing(all_messages)

# Avoid: many single-message calls
for msg in all_messages:
    await conv.add_messages_with_indexing([msg])  # Slower!
```

Reuse Conversations: create the conversation once and reuse it:

```python
# Good: reuse the conversation object
conv = await create_conversation("demo.db", TranscriptMessage)
for question in questions:
    await conv.query(question)

# Avoid: recreating it for every query
for question in questions:
    conv = await create_conversation("demo.db", TranscriptMessage)  # Wasteful!
    await conv.query(question)
```
Troubleshooting
No answer found for obvious questions
Cause: insufficient indexed content or poorly structured messages.
Solution:
Ensure messages have clear speaker metadata
Add more context in text_chunks
Try rephrasing the query
Slow indexing performance
Cause: multiple processes accessing the same SQLite database.
Solution:
Use in-memory storage (None) for parallel testing
Implement file locking if needed
Consider a client-server database for production
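If you do need to coordinate multiple processes around one database file, a crude cross-process lock can be built on atomic file creation. This is a sketch only; in production the third-party filelock package or a client-server database is more robust:

```python
import os
import time


def acquire_lockfile(path: str, timeout: float = 5.0) -> None:
    """Acquire a lock by atomically creating a lockfile, retrying until timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL makes creation atomic: exactly one process succeeds.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {path}")
            time.sleep(0.05)


def release_lockfile(path: str) -> None:
    """Release the lock by removing the lockfile."""
    os.remove(path)
```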
Next Steps
Now that you’ve built your first TypeAgent application, explore more advanced features:
API Reference: dive deep into the complete API documentation
Email Ingestion: learn how to index email conversations
Knowledge Extraction: understand how AI extracts structured knowledge
Advanced Queries: master complex query patterns
You’re now ready to build powerful knowledge processing applications with TypeAgent!