Overview
Collections are append-only data structures for storing messages and semantic references. They provide both iteration and random access by ordinal number.
Base Protocols
IReadonlyCollection
class IReadonlyCollection[T, TOrdinal](AsyncIterable[T], Protocol)
Base protocol for read-only access to ordered collections.
size
async def size(self) -> int
Get the total number of items in the collection.
Returns: Number of items currently in the collection.
get_item
async def get_item(self, arg: TOrdinal) -> T
Retrieve a single item by its ordinal number (0-based index).
arg - The ordinal/index of the item to retrieve.
Returns: The item at the specified ordinal.
Raises: IndexError if ordinal is out of range.
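A minimal sketch of guarding a lookup against an out-of-range ordinal. `ListBackedCollection` and `get_item_or_none` are hypothetical names introduced only so the snippet runs without a database; they are not part of the library. Treating negative ordinals as out of range is also an assumption of this sketch.

```python
import asyncio


class ListBackedCollection:
    """Hypothetical stand-in that mimics get_item() as documented above."""

    def __init__(self, items):
        self._items = list(items)

    async def size(self) -> int:
        return len(self._items)

    async def get_item(self, arg: int):
        # Assumption: negative ordinals are rejected rather than wrapping around.
        if not 0 <= arg < len(self._items):
            raise IndexError(f"ordinal {arg} is out of range")
        return self._items[arg]


async def get_item_or_none(collection, ordinal: int):
    """Return the item at `ordinal`, or None when the ordinal is out of range."""
    try:
        return await collection.get_item(ordinal)
    except IndexError:
        return None


async def main():
    collection = ListBackedCollection(["a", "b", "c"])
    found = await get_item_or_none(collection, 1)
    missing = await get_item_or_none(collection, 10)
    return found, missing


found, missing = asyncio.run(main())
print(found, missing)
```

Checking `size()` before calling `get_item()` is an alternative, but catching `IndexError` avoids a second round trip when the ordinal is usually valid.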
get_slice
async def get_slice(self, start: int, stop: int) -> list[T]
Retrieve a range of items by ordinal (Python slice semantics).
start - Starting ordinal (inclusive).
stop - Ending ordinal (exclusive).
Returns: List of items in the specified range.
Example:
# Get first 10 items
items = await collection.get_slice(0, 10)
# Get items 100-109
items = await collection.get_slice(100, 110)
# Get all items
size = await collection.size()
items = await collection.get_slice(0, size)
get_multiple
async def get_multiple(self, arg: list[TOrdinal]) -> list[T]
Retrieve multiple items by their ordinals.
arg - List of ordinals to retrieve.
Returns: List of items in the same order as the input ordinals.
Example:
# Get specific messages
messages = await collection.get_multiple([0, 5, 10, 15])
Async Iteration
Collections support async iteration:
async for item in collection:
    print(item)
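To illustrate what supporting async iteration involves, here is a minimal sketch of a list-backed class whose `__aiter__` is an async generator. `InMemoryCollection` and `collect` are hypothetical names used for illustration only; the library's own implementations may be structured differently.

```python
import asyncio
from typing import AsyncIterator


class InMemoryCollection:
    """Hypothetical read-only collection backed by a Python list."""

    def __init__(self, items):
        self._items = list(items)

    async def size(self) -> int:
        return len(self._items)

    async def __aiter__(self) -> AsyncIterator:
        # An async generator: each item is handed to the consumer one at a time.
        for item in self._items:
            yield item


async def collect(collection) -> list:
    """Drain the collection with an async comprehension."""
    return [item async for item in collection]


collection = InMemoryCollection(["first", "second", "third"])
collected = asyncio.run(collect(collection))
print(collected)
```

Because items are yielded in ordinal order, the result matches what `get_slice(0, size)` would return for the same data.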
ICollection
class ICollection[T, TOrdinal](IReadonlyCollection[T, TOrdinal], Protocol)
Extends IReadonlyCollection with append operations. Collections are append-only: items cannot be deleted or modified.
is_persistent
@property
def is_persistent(self) -> bool
Indicates whether the collection persists across process restarts.
True for SQLite storage
False for in-memory storage
append
async def append(self, item: T) -> None
Append a single item to the collection.
Example:
msg = ConversationMessage(
    text_chunks=["Hello world"],
    metadata=ConversationMessageMeta(speaker="Alice")
)
await messages.append(msg)
extend
async def extend(self, items: Iterable[T]) -> None
Append multiple items to the collection.
Default Implementation: Calls append() for each item. SQLite implementations override for batch efficiency.
Example:
messages = [
    ConversationMessage(text_chunks=["Hello"], metadata=meta1),
    ConversationMessage(text_chunks=["World"], metadata=meta2),
    ConversationMessage(text_chunks=["!"], metadata=meta3),
]
await collection.extend(messages)
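The default behaviour described for extend() (one append() call per item) and a batching override can be sketched as follows. All three classes are hypothetical and exist only to make the difference observable; they are not the library's implementation, and the batching step here is a plain list operation standing in for whatever a storage backend would do.

```python
import asyncio
from typing import Iterable


class AppendOnlyBase:
    """Hypothetical base class sketching the documented default extend()."""

    async def append(self, item) -> None:
        raise NotImplementedError

    async def extend(self, items: Iterable) -> None:
        # Default behaviour: one append() call per item.
        for item in items:
            await self.append(item)


class CountingCollection(AppendOnlyBase):
    """Stores items in a list and counts how often append() is called."""

    def __init__(self):
        self.items = []
        self.append_calls = 0

    async def append(self, item) -> None:
        self.append_calls += 1
        self.items.append(item)


class BatchingCollection(CountingCollection):
    """Overrides extend() to add all items in one step."""

    async def extend(self, items: Iterable) -> None:
        self.items.extend(items)


async def main():
    default, batched = CountingCollection(), BatchingCollection()
    await default.extend(["a", "b", "c"])
    await batched.extend(["a", "b", "c"])
    return default, batched


default, batched = asyncio.run(main())
print(default.append_calls, batched.append_calls)
```

Both collections end up with the same contents; only the number of per-item calls differs, which is where a persistent backend would be expected to save work.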
IMessageCollection
class IMessageCollection[TMessage: IMessage](
    ICollection[TMessage, MessageOrdinal],
    Protocol
)
Collection interface for conversation messages. Messages are identified by ordinal numbers (MessageOrdinal = int).
Type Parameters
TMessage - The message type (e.g., ConversationMessage, TranscriptMessage).
Usage
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
conv = await create_conversation(
    dbname="chat.db",
    message_type=ConversationMessage
)
messages: IMessageCollection[ConversationMessage] = conv.messages
# Get collection size
count = await messages.size()
print(f"Total messages: {count}")
# Get first message
if count > 0:
    first_msg = await messages.get_item(0)
    print(f"First message: {first_msg.text_chunks[0]}")
# Get recent messages
recent = await messages.get_slice(max(0, count - 10), count)
for msg in recent:
    print(f"{msg.metadata.speaker}: {msg.text_chunks[0]}")
# Iterate all messages
async for msg in messages:
    print(f"[{msg.timestamp}] {msg.text_chunks[0]}")
Complete Example
from typeagent import create_conversation
from typeagent.knowpro.universal_message import (
    ConversationMessage,
    ConversationMessageMeta
)
# Create conversation
conv = await create_conversation(
    dbname="example.db",
    message_type=ConversationMessage
)
# Access message collection
messages = conv.messages
# Add single message
msg = ConversationMessage(
    text_chunks=["Let's discuss the roadmap."],
    metadata=ConversationMessageMeta(speaker="Alice"),
    tags=["planning"]
)
await messages.append(msg)
# Add multiple messages
new_messages = [
    ConversationMessage(
        text_chunks=["I think we should prioritize performance."],
        metadata=ConversationMessageMeta(speaker="Bob")
    ),
    ConversationMessage(
        text_chunks=["Agreed. Let's profile the hot paths first."],
        metadata=ConversationMessageMeta(speaker="Alice")
    )
]
await messages.extend(new_messages)
# Query by ordinal
msg_0 = await messages.get_item(0)
print(f"Message 0: {msg_0.text_chunks[0]}")
# Get slice
first_three = await messages.get_slice(0, 3)
for i, msg in enumerate(first_three):
    speaker = msg.metadata.speaker or "Unknown"
    print(f"{i}: [{speaker}] {msg.text_chunks[0]}")
# Get specific messages
selected = await messages.get_multiple([0, 2])
print(f"Got {len(selected)} messages")
ISemanticRefCollection
class ISemanticRefCollection(
    ICollection[SemanticRef, SemanticRefOrdinal],
    Protocol
)
Collection interface for semantic references (extracted knowledge). Semantic references are identified by ordinal numbers (SemanticRefOrdinal = int).
What are Semantic References?
Semantic references link text locations to extracted knowledge:
Entities - Named entities (people, places, things)
Actions - Verb phrases with subject/object
Topics - Subject matter categories
Tags - User-defined labels
Each semantic reference contains:
Ordinal - Unique sequential ID
Range - Text location (message and chunk ordinals)
Knowledge - The extracted knowledge object
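The three parts listed above can be pictured with a small data-class sketch. The `Sketch*` classes are hypothetical and are not the library's definitions; the attribute names `semantic_ref_ordinal`, `range.start.message_ordinal`, `knowledge.knowledge_type`, and `name` follow the usage examples in this document, while `chunk_ordinal` is an assumed name for the chunk position.

```python
from dataclasses import dataclass


@dataclass
class SketchLocation:
    message_ordinal: int
    chunk_ordinal: int = 0  # assumed field name for the chunk position


@dataclass
class SketchRange:
    start: SketchLocation


@dataclass
class SketchEntity:
    name: str
    knowledge_type: str = "entity"


@dataclass
class SketchSemanticRef:
    semantic_ref_ordinal: int  # unique sequential ID
    range: SketchRange         # where in the text the knowledge was found
    knowledge: SketchEntity    # the extracted knowledge object


ref = SketchSemanticRef(
    semantic_ref_ordinal=0,
    range=SketchRange(start=SketchLocation(message_ordinal=5)),
    knowledge=SketchEntity(name="Paris"),
)
print(ref.semantic_ref_ordinal, ref.range.start.message_ordinal, ref.knowledge.name)
```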
Usage
from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage
conv = await create_conversation(
    dbname="chat.db",
    message_type=ConversationMessage
)
semrefs = conv.semantic_refs
# Get collection size
count = await semrefs.size()
print(f"Total semantic references: {count}")
# Get specific semantic reference
if count > 0:
    semref = await semrefs.get_item(0)
    print(f"Knowledge type: {semref.knowledge.knowledge_type}")
    print(f"Text range: {semref.range}")
    # Access knowledge details
    if hasattr(semref.knowledge, 'name'):
        print(f"Entity name: {semref.knowledge.name}")
    elif hasattr(semref.knowledge, 'text'):
        print(f"Topic text: {semref.knowledge.text}")
# Iterate all semantic references
async for semref in semrefs:
    knowledge_type = semref.knowledge.knowledge_type
    print(f"[{semref.semantic_ref_ordinal}] {knowledge_type}: {semref.knowledge}")
from typeagent import create_conversation
from typeagent.knowpro.universal_message import (
    ConversationMessage,
    ConversationMessageMeta
)
# Create and populate conversation
conv = await create_conversation(
    dbname="knowledge_demo.db",
    message_type=ConversationMessage
)
messages = [
    ConversationMessage(
        text_chunks=["Alice visited Paris last summer."],
        metadata=ConversationMessageMeta(speaker="Bob")
    ),
    ConversationMessage(
        text_chunks=["She really enjoyed the Louvre museum."],
        metadata=ConversationMessageMeta(speaker="Bob")
    )
]
# Add messages with knowledge extraction
result = await conv.add_messages_with_indexing(messages)
print(f"Extracted {result.semrefs_added} semantic references")
# Explore extracted knowledge
semrefs = conv.semantic_refs
entity_count = 0
action_count = 0
topic_count = 0
async for semref in semrefs:
    ktype = semref.knowledge.knowledge_type
    if ktype == "entity":
        entity_count += 1
        entity = semref.knowledge
        print(f"Entity: {entity.name} (types: {entity.type})")
    elif ktype == "action":
        action_count += 1
        action = semref.knowledge
        print(f"Action: {action.verbs} - {action.subject_entity_name} -> {action.object_entity_name}")
    elif ktype == "topic":
        topic_count += 1
        topic = semref.knowledge
        print(f"Topic: {topic.text}")
print(f"\nSummary: {entity_count} entities, {action_count} actions, {topic_count} topics")
Example: Finding Knowledge by Range
from typeagent.knowpro.interfaces import TextLocation, TextRange
# Get all semantic refs for a specific message
message_ordinal = 5
size = await semrefs.size()
all_semrefs = await semrefs.get_slice(0, size)
# Filter by message
message_knowledge = [
    semref for semref in all_semrefs
    if semref.range.start.message_ordinal == message_ordinal
]
print(f"Message {message_ordinal} has {len(message_knowledge)} knowledge items:")
for semref in message_knowledge:
    print(f" - {semref.knowledge.knowledge_type}: {semref.knowledge}")
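The example above loads every semantic reference before filtering. For large collections, the same filter can be applied while iterating, so only the matches are kept. The following is a sketch under stated assumptions: `FakeSemanticRefs`, `make_ref`, and `refs_for_message` are hypothetical helpers, and the fake objects carry only the attributes that the filter reads.

```python
import asyncio
from types import SimpleNamespace


class FakeSemanticRefs:
    """Hypothetical stand-in that exposes only async iteration."""

    def __init__(self, refs):
        self._refs = list(refs)

    async def __aiter__(self):
        for ref in self._refs:
            yield ref


def make_ref(ordinal: int, message_ordinal: int):
    # Mimics only the attributes used by the filter below.
    return SimpleNamespace(
        semantic_ref_ordinal=ordinal,
        range=SimpleNamespace(start=SimpleNamespace(message_ordinal=message_ordinal)),
    )


async def refs_for_message(semrefs, message_ordinal: int) -> list:
    """Collect the refs whose range starts in the given message."""
    return [
        ref async for ref in semrefs
        if ref.range.start.message_ordinal == message_ordinal
    ]


semrefs = FakeSemanticRefs([make_ref(0, 4), make_ref(1, 5), make_ref(2, 5)])
matches = asyncio.run(refs_for_message(semrefs, 5))
print([ref.semantic_ref_ordinal for ref in matches])
```

This is still a linear scan over the collection; it reduces peak memory, not the number of items visited.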
Implementation Classes
MemoryMessageCollection
from typeagent.storage.memory import MemoryMessageCollection
In-memory implementation of IMessageCollection. Fast, non-persistent.
MemorySemanticRefCollection
from typeagent.storage.memory import MemorySemanticRefCollection
In-memory implementation of ISemanticRefCollection. Fast, non-persistent.
SqliteMessageCollection
from typeagent.storage.sqlite import SqliteMessageCollection
SQLite-backed implementation of IMessageCollection. Persistent, transactional.
SqliteSemanticRefCollection
from typeagent.storage.sqlite import SqliteSemanticRefCollection
SQLite-backed implementation of ISemanticRefCollection. Persistent, transactional.
Best Practice: Access collections through the storage provider or conversation object rather than instantiating directly.
Type Aliases
type MessageOrdinal = int
type SemanticRefOrdinal = int
Ordinal numbers are 0-based sequential integers used to reference items in collections.
Batch Operations: Use extend() instead of multiple append() calls for better performance, especially with SQLite.
Range Queries: Use get_slice() for contiguous ranges rather than get_multiple() with sequential ordinals.
Async Iteration: For large collections, async iteration is more memory-efficient than loading all items.
Size Checks: Cache the result of size() if you need it multiple times in a transaction.
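The range-query and size-check tips can be combined into a paging helper that reads size() once and then walks the collection with get_slice(). This is a minimal sketch: `ListBackedCollection` and `iter_batches` are hypothetical names, and the `size_calls` counter exists only to show that the size is fetched a single time.

```python
import asyncio


class ListBackedCollection:
    """Hypothetical stand-in exposing size() and get_slice()."""

    def __init__(self, items):
        self._items = list(items)
        self.size_calls = 0

    async def size(self) -> int:
        self.size_calls += 1
        return len(self._items)

    async def get_slice(self, start: int, stop: int) -> list:
        return self._items[start:stop]


async def iter_batches(collection, batch_size: int):
    """Yield consecutive slices of at most `batch_size` items."""
    total = await collection.size()  # read once and reuse
    for start in range(0, total, batch_size):
        yield await collection.get_slice(start, min(start + batch_size, total))


async def main():
    collection = ListBackedCollection(range(7))
    batches = [batch async for batch in iter_batches(collection, 3)]
    return batches, collection.size_calls


batches, size_calls = asyncio.run(main())
print(batches, size_calls)
```

Because collections are append-only, a cached size can only become too small, never too large: items appended after the size was read are simply not visited by this pass.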