TypeAgent provides flexible configuration options through ConversationSettings to control indexing, knowledge extraction, and storage behavior.

ConversationSettings

The ConversationSettings class is the primary configuration interface:
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.aitools.model_adapters import create_embedding_model

# Default configuration
settings = ConversationSettings()

# Custom embedding model
embedding_model = create_embedding_model("openai:text-embedding-3-small")
settings = ConversationSettings(model=embedding_model)

# With storage provider
from typeagent.storage.sqlite import SqliteStorageProvider

storage_provider = SqliteStorageProvider(
    db_path="conversation.db",
    message_type=TranscriptMessage
)
settings = ConversationSettings(
    model=embedding_model,
    storage_provider=storage_provider
)

Embedding Model Configuration

Configure the embedding model used for semantic search:
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.knowpro.convsettings import ConversationSettings

# OpenAI models (default: text-embedding-ada-002)
model = create_embedding_model("openai:text-embedding-ada-002")
model = create_embedding_model("openai:text-embedding-3-small")
model = create_embedding_model("openai:text-embedding-3-large")

# Use in settings
settings = ConversationSettings(model=model)
Once you create a database with a specific embedding model, you must use the same model for all subsequent operations. Mixing embedding models will cause errors.
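A simple guard illustrates the idea: compare the model name recorded with the database against the one you are about to configure before indexing anything. This is an illustrative sketch, not a TypeAgent API; the stored name corresponds to the `embedding_model` field exposed in conversation metadata.

```python
def check_embedding_model(stored_name: str, current_name: str) -> None:
    """Raise if the configured model differs from the one the database was built with."""
    if stored_name != current_name:
        raise ValueError(
            f"Database was indexed with {stored_name!r}, "
            f"but settings use {current_name!r}; re-index or keep the original model."
        )

# Same model: no error
check_embedding_model("text-embedding-3-small", "text-embedding-3-small")
```

Running this check at startup fails fast instead of producing subtle similarity-score errors later.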

Embedding Model Properties

Access embedding model information:
print(f"Model name: {settings.embedding_model.model_name}")
print(f"Embedding dimensions: {settings.embedding_model.dimensions}")

Knowledge Extraction Settings

Control automatic knowledge extraction from messages:
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Enable/disable automatic knowledge extraction
settings.semantic_ref_index_settings.auto_extract_knowledge = True  # Default

# Configure batch size for concurrent extraction
settings.semantic_ref_index_settings.batch_size = 4  # Process 4 messages concurrently
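Conceptually, a batch size of 4 means messages are grouped like this before extraction runs on each group concurrently (an illustrative sketch, not TypeAgent internals):

```python
def batch(items: list, size: int) -> list:
    """Split messages into fixed-size batches for concurrent extraction."""
    return [items[i:i + size] for i in range(0, len(items), size)]

messages = [f"msg-{i}" for i in range(10)]
for group in batch(messages, 4):
    # each group of up to 4 messages would be extracted concurrently
    pass
```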

Extraction Modes

Knowledge is extracted automatically during ingestion:
settings = ConversationSettings()
settings.semantic_ref_index_settings.auto_extract_knowledge = True

# Extraction happens automatically
result = await conversation.add_messages_with_indexing(messages)
print(f"Extracted {result.semrefs_added} semantic references")

Message Text Index Settings

Configure the message text index for semantic search:
from typeagent.aitools.vectorbase import TextEmbeddingIndexSettings

# Access message text index settings
msg_settings = settings.message_text_index_settings

# View underlying embedding settings
embedding_settings = msg_settings.embedding_index_settings
print(f"Min score threshold: {embedding_settings.min_score}")
print(f"Max matches: {embedding_settings.max_matches}")

Default Settings

TypeAgent uses these defaults for message text indexing:
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Message text index
# - min_score: 0.7 (70% similarity threshold)
# - max_matches: unlimited

# Related term index  
# - min_score: 0.85 (85% similarity threshold)
# - max_matches: 50
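To see what these thresholds do in practice, here is an illustrative filter over stand-in similarity scores (not real index output):

```python
# Hypothetical candidate matches with cosine-similarity scores.
candidates = [("exact phrase", 0.95), ("close paraphrase", 0.78), ("loose match", 0.55)]

def filter_matches(scored, min_score, max_matches=None):
    """Keep only matches at or above min_score, best first, capped at max_matches."""
    kept = [m for m in scored if m[1] >= min_score]
    kept.sort(key=lambda m: m[1], reverse=True)
    return kept if max_matches is None else kept[:max_matches]

filter_matches(candidates, 0.7)   # message text index default: keeps the top two
filter_matches(candidates, 0.85)  # related term index default: keeps only the best
```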

Related Terms Index Settings

Configure fuzzy matching and synonym expansion:
settings = ConversationSettings()

# Access related terms settings
related_settings = settings.related_term_index_settings

# View embedding settings
embedding_settings = related_settings.embedding_index_settings
print(f"Min score: {embedding_settings.min_score}")  # 0.85
print(f"Max matches: {embedding_settings.max_matches}")  # 50
The related terms index enables:
  • Fuzzy matching of entity names
  • Synonym expansion for verbs and actions
  • Alias resolution (e.g., “Dr. Smith” → “John Smith”)
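Alias resolution can be pictured as a lookup from surface forms to canonical names. The mapping below is hand-written for illustration; TypeAgent derives equivalences from the related terms index rather than from a static table.

```python
# Hypothetical alias table: surface form (lowercased) -> canonical entity name.
aliases = {"dr. smith": "John Smith", "dr smith": "John Smith"}

def resolve(term: str) -> str:
    """Return the canonical name for a term, or the term itself if unknown."""
    return aliases.get(term.lower(), term)

resolve("Dr. Smith")  # -> "John Smith"
resolve("Alice")      # -> "Alice" (no alias known)
```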

Thread Detection Settings

Configure conversation thread detection:
settings = ConversationSettings()

# Thread settings use the same embedding model
thread_settings = settings.thread_settings
print(f"Thread detection min score: {thread_settings.min_score}")  # 0.85

Storage Provider Configuration

Step 1: Choose a Storage Provider

Select between in-memory and SQLite storage:
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Option 1: Let TypeAgent choose (SQLite if dbname provided, else Memory)
provider = await settings.get_storage_provider()

# Option 2: Explicitly set storage provider
from typeagent.storage.sqlite import SqliteStorageProvider

provider = SqliteStorageProvider(
    db_path="conversation.db",
    message_type=TranscriptMessage
)
settings.storage_provider = provider

Step 2: Configure Provider Settings

Pass settings when creating storage:
from typeagent.storage.utils import create_storage_provider

provider = await create_storage_provider(
    message_text_settings=settings.message_text_index_settings,
    related_terms_settings=settings.related_term_index_settings,
    dbname="conversation.db",
    message_type=TranscriptMessage
)

Step 3: Access Provider Properties

Inspect storage provider configuration:
# Get storage provider
provider = settings.storage_provider

# Access collections
messages = await provider.get_message_collection()
semantic_refs = await provider.get_semantic_ref_collection()

# Access indexes
semantic_ref_index = await provider.get_semantic_ref_index()
message_text_index = await provider.get_message_text_index()
related_terms_index = await provider.get_related_terms_index()

Conversation Metadata

Configure metadata for tracking and organization:
from typeagent.knowpro.interfaces import ConversationMetadata
from typeagent import create_conversation

# Create with metadata
conversation = await create_conversation(
    dbname="demo.db",
    message_type=TranscriptMessage,
    name="Q1 2024 Team Meetings",
    tags=["engineering", "weekly-sync"],
    extras={
        "team": "Platform Engineering",
        "quarter": "Q1-2024"
    }
)

Reading Metadata

Access conversation metadata:
# Get metadata from storage provider
metadata = await provider.get_conversation_metadata()

print(f"Name: {metadata.name_tag}")
print(f"Tags: {metadata.tags}")
print(f"Created: {metadata.created_at}")
print(f"Updated: {metadata.updated_at}")
print(f"Embedding model: {metadata.embedding_model}")
print(f"Schema version: {metadata.schema_version}")

# Access custom fields
if metadata.extra:
    for key, value in metadata.extra.items():
        print(f"{key}: {value}")

Updating Metadata

Modify conversation metadata:
# Update metadata fields
await provider.set_conversation_metadata(
    name_tag="Updated Name",
    tags=["new-tag", "another-tag"],
    custom_field="custom value"
)

# Update timestamps
from datetime import datetime, timezone

await provider.update_conversation_timestamps(
    updated_at=datetime.now(timezone.utc)
)

Complete Configuration Example

Here’s a complete example with all configuration options:
import asyncio
from datetime import datetime, timezone
from dotenv import load_dotenv

from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.knowpro.interfaces import ConversationMetadata
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.storage.utils import create_storage_provider
from typeagent.transcripts.transcript import TranscriptMessage
from typeagent import create_conversation

load_dotenv()

async def main():
    # 1. Configure embedding model
    embedding_model = create_embedding_model("openai:text-embedding-3-small")
    
    # 2. Create conversation settings
    settings = ConversationSettings(model=embedding_model)
    
    # 3. Configure knowledge extraction
    settings.semantic_ref_index_settings.auto_extract_knowledge = True
    settings.semantic_ref_index_settings.batch_size = 4
    
    # 4. Create metadata
    metadata = ConversationMetadata(
        name_tag="Engineering Standup",
        tags=["daily", "engineering"],
        extra={
            "team": "Backend",
            "quarter": "Q1-2024"
        }
    )
    
    # 5. Create storage provider
    provider = await create_storage_provider(
        message_text_settings=settings.message_text_index_settings,
        related_terms_settings=settings.related_term_index_settings,
        dbname="standup.db",
        message_type=TranscriptMessage,
        metadata=metadata
    )
    
    # 6. Set provider in settings
    settings.storage_provider = provider
    
    # 7. Create conversation
    conversation = await create_conversation(
        dbname=None,  # Already have provider
        message_type=TranscriptMessage,
        settings=settings,
        name="Engineering Standup",
        tags=["daily", "engineering"]
    )
    
    print("Conversation configured successfully")
    print(f"Embedding model: {settings.embedding_model.model_name}")
    print(f"Auto extract: {settings.semantic_ref_index_settings.auto_extract_knowledge}")
    print(f"Batch size: {settings.semantic_ref_index_settings.batch_size}")

if __name__ == "__main__":
    asyncio.run(main())

Environment Variables

Use environment variables to configure API keys and model access.
Create a .env file:
# OpenAI API
OPENAI_API_KEY=sk-...
OPENAI_ORG_ID=org-...  # Optional

# Azure OpenAI (alternative)
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://....openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=...
Load in your application:
from dotenv import load_dotenv

load_dotenv()  # Load from .env file

# Now create settings - API keys are automatically picked up
settings = ConversationSettings()

Configuration Best Practices

Use consistent embedding models

Always use the same embedding model for a database:
# Store model name with metadata
await provider.set_conversation_metadata(
    embedding_name="text-embedding-3-small"
)

Set appropriate batch sizes

Balance throughput and resource usage:
# For development: smaller batches, faster feedback
settings.semantic_ref_index_settings.batch_size = 2

# For production: larger batches, better throughput
settings.semantic_ref_index_settings.batch_size = 10

Enable knowledge extraction

Enable it for interactive applications that need query capabilities:
# Enable for query capabilities
settings.semantic_ref_index_settings.auto_extract_knowledge = True

Use meaningful metadata

Add context for organization:
conversation = await create_conversation(
    dbname="project-sync.db",
    message_type=TranscriptMessage,
    name="Project Sync - 2024-01-15",
    tags=["project-alpha", "sync-meeting"],
    extras={
        "project": "Alpha",
        "meeting_type": "sync",
        "date": "2024-01-15"
    }
)
