Overview
ConversationSettings controls all aspects of conversation knowledge extraction and indexing behavior, including embedding models, batch sizes, and index configurations.
Import
```python
from typeagent.knowpro.convsettings import ConversationSettings
```
Class Definition
```python
class ConversationSettings:
    def __init__(
        self,
        model: IEmbeddingModel | None = None,
        storage_provider: IStorageProvider | None = None,
    )
```
Constructor Parameters
model
Type: IEmbeddingModel | None (default: None)
The embedding model to use for all indexes. If None, a default model is created via create_embedding_model(). Important: all indexes share the same embedding model, which ensures consistency and lets them share the embedding cache.
Example:
```python
from typeagent.aitools.model_adapters import create_embedding_model

model = create_embedding_model("openai:text-embedding-3-small")
settings = ConversationSettings(model=model)
```
storage_provider
Type: IStorageProvider | None (default: None)
Optional storage provider. If None, it is created lazily when needed. Note: when using create_conversation(), the storage provider is created and set automatically.
Properties
embedding_model
```python
@property
def embedding_model(self) -> IEmbeddingModel
```
The shared embedding model used by all indexes.
Returns: The embedding model instance.
related_term_index_settings
related_term_index_settings: RelatedTermIndexSettings
Configuration for the related terms index (synonyms and fuzzy matching).
Type: RelatedTermIndexSettings
Default Configuration:
```python
RelatedTermIndexSettings(
    TextEmbeddingIndexSettings(
        embedding_model=embedding_model,
        min_score=0.85,
        max_matches=50,
    )
)
```
Fields:
min_score: Minimum similarity score threshold for related term matches (0.0 to 1.0).
max_matches: Maximum number of related terms to return per query.
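To make the interaction of these two knobs concrete, here is a minimal, self-contained sketch of threshold-then-cap filtering over cosine similarities. It is illustrative only: `top_matches`, `cosine`, and the toy vectors are hypothetical stand-ins, not the library's internals.

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_matches(query, candidates, min_score=0.85, max_matches=50):
    """Keep candidates scoring >= min_score, best first, capped at max_matches."""
    scored = [(cosine(query, vec), term) for term, vec in candidates.items()]
    scored = [(s, t) for s, t in scored if s >= min_score]  # threshold first
    scored.sort(reverse=True)
    return scored[:max_matches]                             # then cap

# Toy 2-d "embeddings" for three terms.
candidates = {
    "car": [1.0, 0.0],
    "vehicle": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
print(top_matches([1.0, 0.0], candidates, min_score=0.85, max_matches=2))
```

Lowering `min_score` admits weaker matches; lowering `max_matches` truncates the ranked list, which is why the two settings are tuned together.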
thread_settings
thread_settings: TextEmbeddingIndexSettings
Configuration for the conversation threads index.
Type: TextEmbeddingIndexSettings
Default Configuration:
```python
TextEmbeddingIndexSettings(
    embedding_model=embedding_model,
    min_score=0.85,
)
```
message_text_index_settings
message_text_index_settings: MessageTextIndexSettings
Configuration for the message text semantic search index.
Type: MessageTextIndexSettings
Default Configuration:
```python
MessageTextIndexSettings(
    TextEmbeddingIndexSettings(
        embedding_model=embedding_model,
        min_score=0.7,
    )
)
```
Note: The message text index uses a lower default threshold (0.7) than the related terms index (0.85) to cast a wider net when searching message content.
semantic_ref_index_settings
semantic_ref_index_settings: SemanticRefIndexSettings
Configuration for semantic reference extraction and indexing.
Type: SemanticRefIndexSettings
Default Configuration:
```python
SemanticRefIndexSettings(
    batch_size=4,
    auto_extract_knowledge=True,
    knowledge_extractor=None,  # uses the default extractor if None
)
```
Fields:
batch_size: Number of message chunks to process concurrently during knowledge extraction. Higher values increase throughput but use more memory; this is the effective maximum concurrency for LLM calls.
auto_extract_knowledge: Whether to automatically extract knowledge (entities, actions, topics) from messages using an LLM.
True: full knowledge extraction enabled (the default in create_conversation())
False: only metadata knowledge is extracted (speakers, recipients)
knowledge_extractor: Custom knowledge extractor. If None, the default KnowledgeExtractor() is used.
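The role of batch_size as an effective concurrency cap can be sketched with a plain asyncio semaphore. This is a hypothetical illustration: `process_in_batches` and `fake_extract` are stand-ins, not library code.

```python
import asyncio

async def process_in_batches(chunks, batch_size, worker):
    """Run `worker` over all chunks with at most `batch_size` calls in flight."""
    sem = asyncio.Semaphore(batch_size)
    active = 0
    peak = 0

    async def bounded(chunk):
        nonlocal active, peak
        async with sem:              # blocks once batch_size workers are running
            active += 1
            peak = max(peak, active)
            try:
                return await worker(chunk)
            finally:
                active -= 1

    results = await asyncio.gather(*(bounded(c) for c in chunks))
    return results, peak

async def fake_extract(chunk):
    # Stand-in for a real LLM knowledge-extraction call.
    await asyncio.sleep(0.01)
    return f"knowledge({chunk})"

results, peak = asyncio.run(
    process_in_batches([f"m{i}" for i in range(10)], batch_size=4, worker=fake_extract)
)
print(peak)  # 4: observed concurrency never exceeds batch_size
```

This is why raising batch_size speeds ingestion but also raises memory use and the rate of concurrent API calls.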
storage_provider
```python
@property
def storage_provider(self) -> IStorageProvider
```
The storage provider managing persistence.
Returns: The storage provider instance.
Raises: RuntimeError if accessed before initialization.
Setter:
```python
@storage_provider.setter
def storage_provider(self, value: IStorageProvider) -> None
```
Methods
get_storage_provider
```python
async def get_storage_provider(self) -> IStorageProvider
```
Get or create the storage provider asynchronously.
Returns: The storage provider. Creates an in-memory provider if none was set.
Behavior:
If _storage_provider is set: returns it
If _storage_provider is None: creates a MemoryStorageProvider with the current settings
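The get-or-create behavior can be sketched in plain Python. `Settings` and `InMemoryProvider` below are simplified hypothetical stand-ins for ConversationSettings and MemoryStorageProvider, showing the lazy-creation and raise-before-init pattern described above.

```python
import asyncio

class InMemoryProvider:
    """Stand-in for a storage provider created from the current settings."""
    def __init__(self, settings):
        self.settings = settings

class Settings:
    def __init__(self, provider=None):
        self._provider = provider

    @property
    def provider(self):
        # Synchronous access before initialization raises.
        if self._provider is None:
            raise RuntimeError("storage provider accessed before initialization")
        return self._provider

    async def get_provider(self):
        # Lazily create an in-memory provider on first use, then reuse it.
        if self._provider is None:
            self._provider = InMemoryProvider(self)
        return self._provider

async def main():
    s = Settings()
    p1 = await s.get_provider()
    p2 = await s.get_provider()
    return p1 is p2

print(asyncio.run(main()))  # True: the same provider instance is reused
```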
Helper Classes
MessageTextIndexSettings
```python
from typeagent.knowpro.convsettings import MessageTextIndexSettings

@dataclass
class MessageTextIndexSettings:
    embedding_index_settings: TextEmbeddingIndexSettings
```
Wrapper for the message text index embedding configuration.
Constructor:
```python
def __init__(self, embedding_index_settings: TextEmbeddingIndexSettings)
```
RelatedTermIndexSettings
```python
from typeagent.knowpro.convsettings import RelatedTermIndexSettings

@dataclass
class RelatedTermIndexSettings:
    embedding_index_settings: TextEmbeddingIndexSettings
```
Wrapper for the related terms index embedding configuration.
Constructor:
```python
def __init__(self, embedding_index_settings: TextEmbeddingIndexSettings)
```
SemanticRefIndexSettings
```python
from typeagent.knowpro.convsettings import SemanticRefIndexSettings

@dataclass
class SemanticRefIndexSettings:
    batch_size: int
    auto_extract_knowledge: bool
    knowledge_extractor: IKnowledgeExtractor | None = None
```
Configuration for knowledge extraction behavior.
TextEmbeddingIndexSettings
```python
from typeagent.aitools.vectorbase import TextEmbeddingIndexSettings

class TextEmbeddingIndexSettings:
    def __init__(
        self,
        embedding_model: IEmbeddingModel | None = None,
        min_score: float = 0.7,
        max_matches: int | None = None,
    )
```
Low-level embedding index configuration.
Parameters:
embedding_model: IEmbeddingModel | None (default: None). The embedding model. If None, a default model is created.
min_score: Minimum cosine similarity threshold (0.0 to 1.0) for matches.
max_matches: Maximum number of results to return. If None, all matches above the threshold are returned.
Usage Examples
Default Settings
```python
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage

# Uses default settings (auto_extract_knowledge=True)
conv = await create_conversation(
    dbname="chat.db",
    message_type=ConversationMessage,
)
```
Custom Embedding Model
```python
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.aitools.model_adapters import create_embedding_model

# Use a different embedding model
model = create_embedding_model("openai:text-embedding-3-large")
settings = ConversationSettings(model=model)

conv = await create_conversation(
    dbname="chat.db",
    message_type=ConversationMessage,
    settings=settings,
)
```
Adjust Batch Size
```python
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()
# Increase concurrency for faster processing (uses more memory)
settings.semantic_ref_index_settings.batch_size = 8

conv = await create_conversation(
    dbname="large_corpus.db",
    message_type=ConversationMessage,
    settings=settings,
)
```
Disable Knowledge Extraction
```python
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()
# Only extract metadata knowledge (no LLM calls)
settings.semantic_ref_index_settings.auto_extract_knowledge = False

conv = await create_conversation(
    dbname="metadata_only.db",
    message_type=ConversationMessage,
    settings=settings,
)
```
Adjust Search Thresholds
```python
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.aitools.model_adapters import create_embedding_model

model = create_embedding_model()
settings = ConversationSettings(model=model)

# Stricter message text search (fewer, more relevant results)
settings.message_text_index_settings.embedding_index_settings.min_score = 0.85
settings.message_text_index_settings.embedding_index_settings.max_matches = 20

# More lenient related terms (cast a wider net)
settings.related_term_index_settings.embedding_index_settings.min_score = 0.75
settings.related_term_index_settings.embedding_index_settings.max_matches = 100

conv = await create_conversation(
    dbname="tuned.db",
    message_type=ConversationMessage,
    settings=settings,
)
```
Custom Knowledge Extractor
```python
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.knowpro.convknowledge import KnowledgeExtractor
from typeagent.aitools.model_adapters import create_chat_model

# Use a different chat model for knowledge extraction
chat_model = create_chat_model("anthropic:claude-sonnet-4-20250514")
extractor = KnowledgeExtractor(model=chat_model)

settings = ConversationSettings()
settings.semantic_ref_index_settings.knowledge_extractor = extractor

conv = await create_conversation(
    dbname="custom_extractor.db",
    message_type=ConversationMessage,
    settings=settings,
)
```
Complete Custom Configuration
```python
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
from typeagent.knowpro.convsettings import (
    ConversationSettings,
    MessageTextIndexSettings,
    RelatedTermIndexSettings,
)
from typeagent.aitools.vectorbase import TextEmbeddingIndexSettings
from typeagent.aitools.model_adapters import (
    create_embedding_model,
    create_chat_model,
)
from typeagent.knowpro.convknowledge import KnowledgeExtractor

# Configure models
embedding_model = create_embedding_model("openai:text-embedding-3-small")
chat_model = create_chat_model("openai:gpt-4o")

# Create settings instance
settings = ConversationSettings(model=embedding_model)

# Configure message text index
settings.message_text_index_settings = MessageTextIndexSettings(
    TextEmbeddingIndexSettings(
        embedding_model=embedding_model,
        min_score=0.75,  # more lenient for message search
        max_matches=30,
    )
)

# Configure related terms index
settings.related_term_index_settings = RelatedTermIndexSettings(
    TextEmbeddingIndexSettings(
        embedding_model=embedding_model,
        min_score=0.85,  # stricter for term relationships
        max_matches=100,
    )
)

# Configure knowledge extraction
extractor = KnowledgeExtractor(model=chat_model)
settings.semantic_ref_index_settings.batch_size = 6
settings.semantic_ref_index_settings.auto_extract_knowledge = True
settings.semantic_ref_index_settings.knowledge_extractor = extractor

# Create conversation with custom settings
conv = await create_conversation(
    dbname="fully_custom.db",
    message_type=ConversationMessage,
    settings=settings,
)

print(f"Using embedding model: {settings.embedding_model.model_name}")
print(f"Message search threshold: {settings.message_text_index_settings.embedding_index_settings.min_score}")
print(f"Knowledge extraction batch size: {settings.semantic_ref_index_settings.batch_size}")
```
Tuning Tips
Batch Size: Increase batch_size (e.g., 8-16) for faster knowledge extraction on powerful machines. Decrease batch_size (e.g., 2-4) to reduce memory usage and API rate limiting.
Search Thresholds: Higher min_score (e.g., 0.85-0.9) for more precise, relevant results. Lower min_score (e.g., 0.6-0.75) for broader recall and more matches.
Embedding Model: text-embedding-3-small is faster, cheaper, and good quality (1536 dims); text-embedding-3-large is best quality but slower and more expensive (3072 dims).
Auto Extraction: Enabled gives full knowledge extraction with an LLM (slower, more comprehensive); Disabled gives metadata only (faster, no LLM costs, basic knowledge).
Best Practices
Share Embedding Model: Always pass the same model instance to ConversationSettings to share the embedding cache across all indexes.
Tune for Use Case:
High precision: Increase min_score thresholds
High recall: Decrease min_score thresholds
Fast ingestion: Increase batch_size, use smaller embedding model
Quality extraction: Use GPT-4 for knowledge extraction
Monitor Costs:
Disable auto_extract_knowledge for cost savings (metadata only)
Use smaller embedding models for large corpora
Batch process messages to reduce API calls
Test Thresholds: Evaluate search quality with your specific data before deploying.
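Why sharing one model instance (the first practice above) matters can be illustrated with a toy caching model. `CachingEmbeddingModel` and `Index` below are hypothetical, not library classes; they show how two indexes backed by the same instance reuse cached embeddings instead of re-embedding the same text.

```python
class CachingEmbeddingModel:
    """Toy embedding model with a per-text cache (not the library's cache)."""
    def __init__(self):
        self.calls = 0
        self._cache = {}

    def embed(self, text):
        if text not in self._cache:
            self.calls += 1  # only cache misses hit the underlying model
            self._cache[text] = [float(len(text))]  # dummy embedding
        return self._cache[text]

class Index:
    """Toy index that embeds every text it stores."""
    def __init__(self, model):
        self.model = model

    def add(self, text):
        return self.model.embed(text)

# Both indexes share one model instance, so they share its cache.
shared = CachingEmbeddingModel()
message_index = Index(shared)
terms_index = Index(shared)

message_index.add("hello world")
terms_index.add("hello world")  # cache hit: no second model call
print(shared.calls)  # 1
```

Had each index constructed its own model, the same text would be embedded (and billed) once per index.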