TypeAgent provides flexible configuration options through ConversationSettings to control indexing, knowledge extraction, and storage behavior.
## ConversationSettings
The ConversationSettings class is the primary configuration interface:
```python
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.aitools.model_adapters import create_embedding_model

# Default configuration
settings = ConversationSettings()

# Custom embedding model
embedding_model = create_embedding_model("openai:text-embedding-3-small")
settings = ConversationSettings(model=embedding_model)

# With storage provider
from typeagent.storage.sqlite import SqliteStorageProvider
from typeagent.transcripts.transcript import TranscriptMessage

storage_provider = SqliteStorageProvider(
    db_path="conversation.db",
    message_type=TranscriptMessage,
)
settings = ConversationSettings(
    model=embedding_model,
    storage_provider=storage_provider,
)
```
## Embedding Model Configuration
Configure the embedding model used for semantic search:
```python
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.knowpro.convsettings import ConversationSettings

# OpenAI models (default: text-embedding-ada-002)
model = create_embedding_model("openai:text-embedding-ada-002")
model = create_embedding_model("openai:text-embedding-3-small")
model = create_embedding_model("openai:text-embedding-3-large")

# Use in settings
settings = ConversationSettings(model=model)
```
Once you create a database with a specific embedding model, you must use the same model for all subsequent operations. Mixing embedding models will cause errors.
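One way to honor this rule is to compare the model name stored in the conversation metadata (the `embedding_model` field shown later on this page) against the model you are about to configure. The helper below is an illustrative sketch, not part of TypeAgent's API:

```python
# Sketch: fail fast when reopening a database with a different embedding
# model, instead of silently producing mismatched embeddings. The helper
# name `check_embedding_model` is hypothetical.
def check_embedding_model(stored_name: str, current_name: str) -> None:
    """Raise if the current model differs from the one the database was indexed with."""
    if stored_name != current_name:
        raise ValueError(
            f"Database was indexed with {stored_name!r}, "
            f"but the current settings use {current_name!r}."
        )

# Matching names pass silently; a mismatch raises ValueError.
check_embedding_model("text-embedding-3-small", "text-embedding-3-small")
```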
### Embedding Model Properties
Access embedding model information:
```python
print(f"Model name: {settings.embedding_model.model_name}")
print(f"Embedding dimensions: {settings.embedding_model.dimensions}")
```
## Knowledge Extraction Settings

Control automatic knowledge extraction from messages:
```python
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Enable/disable automatic knowledge extraction
settings.semantic_ref_index_settings.auto_extract_knowledge = True  # Default

# Configure batch size for concurrent extraction
settings.semantic_ref_index_settings.batch_size = 4  # Process 4 messages concurrently
```
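To make the effect of `batch_size` concrete, here is a toy sketch of batched concurrent processing: messages are grouped and each group is awaited concurrently. This illustrates the idea only; TypeAgent's actual extraction pipeline is internal, and `fake_extract` stands in for a model call:

```python
import asyncio

async def fake_extract(message: str) -> str:
    await asyncio.sleep(0)  # stand-in for an embedding/LLM call
    return f"knowledge({message})"

async def extract_in_batches(messages, batch_size=4):
    results = []
    for start in range(0, len(messages), batch_size):
        batch = messages[start:start + batch_size]
        # Everything in a batch is extracted concurrently.
        results.extend(await asyncio.gather(*(fake_extract(m) for m in batch)))
    return results

# Five messages with batch_size=4: one group of four, then one of one.
batched = asyncio.run(extract_in_batches(["m1", "m2", "m3", "m4", "m5"]))
print(batched)
```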
### Automatic Mode

Knowledge is extracted automatically during ingestion:

```python
settings = ConversationSettings()
settings.semantic_ref_index_settings.auto_extract_knowledge = True

# Extraction happens automatically
result = await conversation.add_messages_with_indexing(messages)
print(f"Extracted {result.semrefs_added} semantic references")
```
### Manual Mode

Extract knowledge manually after ingestion:

```python
settings = ConversationSettings()
settings.semantic_ref_index_settings.auto_extract_knowledge = False

# Add messages without extraction
await conversation.messages.extend(messages)

# Extract knowledge later
from typeagent.knowpro import secindex

await secindex.extract_and_index_knowledge(
    conversation,
    settings,
    messages,
)
```
## Message Text Index Settings
Configure the message text index for semantic search:
```python
from typeagent.aitools.vectorbase import TextEmbeddingIndexSettings

# Access message text index settings
msg_settings = settings.message_text_index_settings

# View underlying embedding settings
embedding_settings = msg_settings.embedding_index_settings
print(f"Min score threshold: {embedding_settings.min_score}")
print(f"Max matches: {embedding_settings.max_matches}")
```
### Default Settings
TypeAgent uses these defaults for message text indexing:
```python
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Message text index
# - min_score: 0.7 (70% similarity threshold)
# - max_matches: unlimited

# Related term index
# - min_score: 0.85 (85% similarity threshold)
# - max_matches: 50
```
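What `min_score` and `max_matches` mean in practice: candidates are scored by embedding similarity to the query, candidates below the threshold are dropped, and the result list is optionally capped. The sketch below uses toy two-dimensional vectors and plain cosine similarity to illustrate the filtering; it is not TypeAgent's search implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity of two small dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def filter_matches(query, candidates, min_score=0.7, max_matches=None):
    """Keep candidates at or above min_score, best first, capped at max_matches."""
    scored = sorted(
        ((cosine(query, vec), name) for name, vec in candidates),
        reverse=True,
    )
    kept = [(name, score) for score, name in scored if score >= min_score]
    return kept if max_matches is None else kept[:max_matches]

query = (1.0, 0.0)
candidates = [("close", (0.9, 0.1)), ("far", (0.1, 0.9))]
# Only "close" clears the 0.7 threshold.
print(filter_matches(query, candidates, min_score=0.7))
```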
## Related Term Index Settings

Configure fuzzy matching and synonym expansion:
```python
settings = ConversationSettings()

# Access related terms settings
related_settings = settings.related_term_index_settings

# View embedding settings
embedding_settings = related_settings.embedding_index_settings
print(f"Min score: {embedding_settings.min_score}")  # 0.85
print(f"Max matches: {embedding_settings.max_matches}")  # 50
```
The related terms index enables:
- Fuzzy matching of entity names
- Synonym expansion for verbs and actions
- Alias resolution (e.g., “Dr. Smith” → “John Smith”)
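A toy sketch of alias resolution, the kind of lookup the related terms index enables. The real index matches terms by fuzzy embedding similarity rather than an exact table; this dict-based version only illustrates the input/output shape:

```python
# Hypothetical alias table mapping surface forms to a canonical name.
aliases = {
    "dr. smith": "John Smith",
    "smith": "John Smith",
}

def resolve(term: str) -> str:
    """Return the canonical name for a known alias, else the term unchanged."""
    return aliases.get(term.lower(), term)

print(resolve("Dr. Smith"))  # -> John Smith
print(resolve("Alice"))      # -> Alice (no alias known)
```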
## Thread Detection Settings
Configure conversation thread detection:
```python
settings = ConversationSettings()

# Thread settings use the same embedding model
thread_settings = settings.thread_settings
print(f"Thread detection min score: {thread_settings.min_score}")  # 0.85
```
## Storage Provider Configuration

### Step 1: Choose a Storage Provider
Select between in-memory and SQLite storage:
```python
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.storage.sqlite import SqliteStorageProvider
from typeagent.transcripts.transcript import TranscriptMessage

settings = ConversationSettings()

# Option 1: Let TypeAgent choose (SQLite if dbname provided, else Memory)
provider = await settings.get_storage_provider()

# Option 2: Explicitly set storage provider
provider = SqliteStorageProvider(
    db_path="conversation.db",
    message_type=TranscriptMessage,
)
settings.storage_provider = provider
```
### Step 2: Create the Provider with Settings

Pass settings when creating storage:
```python
from typeagent.storage.utils import create_storage_provider
from typeagent.transcripts.transcript import TranscriptMessage

provider = await create_storage_provider(
    message_text_settings=settings.message_text_index_settings,
    related_terms_settings=settings.related_term_index_settings,
    dbname="conversation.db",
    message_type=TranscriptMessage,
)
```
### Step 3: Access Provider Properties
Inspect storage provider configuration:
```python
# Get storage provider
provider = settings.storage_provider

# Access collections
messages = await provider.get_message_collection()
semantic_refs = await provider.get_semantic_ref_collection()

# Access indexes
semantic_ref_index = await provider.get_semantic_ref_index()
message_text_index = await provider.get_message_text_index()
related_terms_index = await provider.get_related_terms_index()
```
## Conversation Metadata

Configure metadata for tracking and organization:
```python
from typeagent.knowpro.interfaces import ConversationMetadata
from typeagent.transcripts.transcript import TranscriptMessage
from typeagent import create_conversation

# Create with metadata
conversation = await create_conversation(
    dbname="demo.db",
    message_type=TranscriptMessage,
    name="Q1 2024 Team Meetings",
    tags=["engineering", "weekly-sync"],
    extras={
        "team": "Platform Engineering",
        "quarter": "Q1-2024",
    },
)
```
Access conversation metadata:
```python
# Get metadata from storage provider
metadata = await provider.get_conversation_metadata()

print(f"Name: {metadata.name_tag}")
print(f"Tags: {metadata.tags}")
print(f"Created: {metadata.created_at}")
print(f"Updated: {metadata.updated_at}")
print(f"Embedding model: {metadata.embedding_model}")
print(f"Schema version: {metadata.schema_version}")

# Access custom fields
if metadata.extra:
    for key, value in metadata.extra.items():
        print(f"{key}: {value}")
```
Modify conversation metadata:
```python
# Update metadata fields
await provider.set_conversation_metadata(
    name_tag="Updated Name",
    tags=["new-tag", "another-tag"],
    custom_field="custom value",
)

# Update timestamps
from datetime import datetime, timezone

await provider.update_conversation_timestamps(
    updated_at=datetime.now(timezone.utc),
)
```
## Complete Configuration Example
Here’s a complete example with all configuration options:
```python
import asyncio
from datetime import datetime, timezone

from dotenv import load_dotenv

from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.knowpro.interfaces import ConversationMetadata
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.storage.utils import create_storage_provider
from typeagent.transcripts.transcript import TranscriptMessage
from typeagent import create_conversation

load_dotenv()


async def main():
    # 1. Configure embedding model
    embedding_model = create_embedding_model("openai:text-embedding-3-small")

    # 2. Create conversation settings
    settings = ConversationSettings(model=embedding_model)

    # 3. Configure knowledge extraction
    settings.semantic_ref_index_settings.auto_extract_knowledge = True
    settings.semantic_ref_index_settings.batch_size = 4

    # 4. Create metadata
    metadata = ConversationMetadata(
        name_tag="Engineering Standup",
        tags=["daily", "engineering"],
        extra={
            "team": "Backend",
            "quarter": "Q1-2024",
        },
    )

    # 5. Create storage provider
    provider = await create_storage_provider(
        message_text_settings=settings.message_text_index_settings,
        related_terms_settings=settings.related_term_index_settings,
        dbname="standup.db",
        message_type=TranscriptMessage,
        metadata=metadata,
    )

    # 6. Set provider in settings
    settings.storage_provider = provider

    # 7. Create conversation
    conversation = await create_conversation(
        dbname=None,  # Already have a provider
        message_type=TranscriptMessage,
        settings=settings,
        name="Engineering Standup",
        tags=["daily", "engineering"],
    )

    print("Conversation configured successfully")
    print(f"Embedding model: {settings.embedding_model.model_name}")
    print(f"Auto extract: {settings.semantic_ref_index_settings.auto_extract_knowledge}")
    print(f"Batch size: {settings.semantic_ref_index_settings.batch_size}")


if __name__ == "__main__":
    asyncio.run(main())
```
## Environment Variables
Use environment variables to configure API keys and model access.
Create a `.env` file:
```bash
# OpenAI API
OPENAI_API_KEY=sk-...
OPENAI_ORG_ID=org-...  # Optional

# Azure OpenAI (alternative)
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://....openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=...
```
Load in your application:
```python
from dotenv import load_dotenv

load_dotenv()  # Load from .env file

# Now create settings - API keys are automatically picked up
settings = ConversationSettings()
```
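Because a missing key otherwise surfaces as an authentication error on the first embedding call, it can help to check for it up front. The helper below is a sketch, not part of TypeAgent; it only inspects the environment variables listed above:

```python
import os

def require_api_key() -> str:
    """Return the configured API key, or raise a clear error if none is set."""
    key = os.environ.get("OPENAI_API_KEY") or os.environ.get("AZURE_OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "Set OPENAI_API_KEY (or the AZURE_OPENAI_* variables) "
            "before creating ConversationSettings."
        )
    return key
```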
## Configuration Best Practices

### Use consistent embedding models
Always use the same embedding model for a database:
```python
# Store model name with metadata
await provider.set_conversation_metadata(
    embedding_name="text-embedding-3-small"
)
```
### Set appropriate batch sizes
Balance throughput and resource usage:
```python
# For development: smaller batches, faster feedback
settings.semantic_ref_index_settings.batch_size = 2

# For production: larger batches, better throughput
settings.semantic_ref_index_settings.batch_size = 10
```
### Enable knowledge extraction

Always enable automatic knowledge extraction for interactive applications:
```python
# Enable for query capabilities
settings.semantic_ref_index_settings.auto_extract_knowledge = True
```
### Use descriptive metadata

Add context for organization:
```python
conversation = await create_conversation(
    dbname="project-sync.db",
    message_type=TranscriptMessage,
    name="Project Sync - 2024-01-15",
    tags=["project-alpha", "sync-meeting"],
    extras={
        "project": "Alpha",
        "meeting_type": "sync",
        "date": "2024-01-15",
    },
)
```
## Next Steps