Collections

Overview

Collections are append-only data structures for storing messages and semantic references. They provide both iteration and random access by ordinal number.

Base Protocols

IReadonlyCollection

class IReadonlyCollection[T, TOrdinal](AsyncIterable[T], Protocol)

Base protocol for read-only access to ordered collections.

size

async def size(self) -> int

Get the total number of items in the collection.

count

int

Number of items currently in the collection.

get_item

async def get_item(self, arg: TOrdinal) -> T

Retrieve a single item by its ordinal number (0-based index).

arg

TOrdinal

required

The ordinal/index of the item to retrieve.

item

T

The item at the specified ordinal.

Raises: IndexError if ordinal is out of range.
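Because get_item raises IndexError for an out-of-range ordinal, callers that cannot check size() first may want a small wrapper. The sketch below is illustrative only: ListCollection is a hypothetical list-backed stand-in (not a typeagent class), and get_or_default is a helper name invented for this example.

```python
import asyncio


class ListCollection:
    """Hypothetical list-backed stand-in for a read-only collection."""

    def __init__(self, items):
        self._items = list(items)

    async def get_item(self, arg: int):
        # Reject negative ordinals explicitly; a plain list would wrap around.
        if not 0 <= arg < len(self._items):
            raise IndexError(f"ordinal {arg} out of range")
        return self._items[arg]


async def get_or_default(collection, ordinal: int, default=None):
    """Return the item at `ordinal`, or `default` if it is out of range."""
    try:
        return await collection.get_item(ordinal)
    except IndexError:
        return default


async def demo():
    collection = ListCollection(["a", "b", "c"])
    return [
        await get_or_default(collection, 1),
        await get_or_default(collection, 3, default="missing"),
    ]


results = asyncio.run(demo())
print(results)
```

Catching IndexError avoids a separate size() call, which may matter if another writer can append between the size check and the read.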

get_slice

async def get_slice(self, start: int, stop: int) -> list[T]

Retrieve a contiguous range of items by ordinal, using Python's half-open slice convention: start is included, stop is excluded.

start

int

required

Starting ordinal (inclusive).

stop

int

required

Ending ordinal (exclusive).

items

list[T]

List of items in the specified range.

Example:

# Get first 10 items
items = await collection.get_slice(0, 10)

# Get items 100-109
items = await collection.get_slice(100, 110)

# Get all items
size = await collection.size()
items = await collection.get_slice(0, size)

get_multiple

async def get_multiple(self, arg: list[TOrdinal]) -> list[T]

Retrieve multiple items by their ordinals.

arg

list[TOrdinal]

required

List of ordinals to retrieve.

items

list[T]

List of items in the same order as the input ordinals.

Example:

# Get specific messages
messages = await collection.get_multiple([0, 5, 10, 15])
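The ordering guarantee means results follow the requested ordinals, not storage order. A minimal sketch of that behaviour, assuming a hypothetical list-backed ListCollection (not a typeagent class):

```python
import asyncio


class ListCollection:
    """Hypothetical list-backed stand-in; not a typeagent class."""

    def __init__(self, items):
        self._items = list(items)

    async def get_multiple(self, arg: list[int]) -> list:
        # Build the result in the order the ordinals were requested.
        return [self._items[ordinal] for ordinal in arg]


collection = ListCollection(["m0", "m1", "m2", "m3"])
picked = asyncio.run(collection.get_multiple([3, 0, 2]))
print(picked)
```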

Async Iteration

Collections support async iteration:

async for item in collection:
    print(item)
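One way a class might satisfy the AsyncIterable part of the protocol is to write __aiter__ as an async generator. InMemoryCollection and collect_upper below are hypothetical names used for illustration, not the library's implementation.

```python
import asyncio
from typing import AsyncIterator


class InMemoryCollection:
    """Hypothetical list-backed collection; not a typeagent class."""

    def __init__(self, items: list[str]) -> None:
        self._items = list(items)

    async def __aiter__(self) -> AsyncIterator[str]:
        # An async generator: each `yield` hands one item to `async for`.
        for item in self._items:
            yield item


async def collect_upper(collection: InMemoryCollection) -> list[str]:
    # Items are consumed one at a time as the iterator produces them.
    return [item.upper() async for item in collection]


shouted = asyncio.run(collect_upper(InMemoryCollection(["hello", "world"])))
print(shouted)
```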

ICollection

class ICollection[T, TOrdinal](IReadonlyCollection[T, TOrdinal], Protocol)

Extends IReadonlyCollection with append operations. Collections are append-only: items cannot be deleted or modified once added.

is_persistent

@property
def is_persistent(self) -> bool

Indicates whether the collection persists across process restarts.

persistent

bool

True for SQLite storage
False for in-memory storage

append

async def append(self, item: T) -> None

Append a single item to the collection.

item

T

required

The item to append.

Example:

msg = ConversationMessage(
    text_chunks=["Hello world"],
    metadata=ConversationMessageMeta(speaker="Alice")
)
await messages.append(msg)

extend

async def extend(self, items: Iterable[T]) -> None

Append multiple items to the collection.

items

Iterable[T]

required

The items to append.

Default Implementation: Calls append() for each item. SQLite implementations override for batch efficiency.

Example:

messages = [
    ConversationMessage(text_chunks=["Hello"], metadata=meta1),
    ConversationMessage(text_chunks=["World"], metadata=meta2),
    ConversationMessage(text_chunks=["!"], metadata=meta3),
]
await collection.extend(messages)
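The relationship between the default extend() and a batch override can be sketched as follows. BaseCollection and BatchCollection are hypothetical classes written for this illustration; the append_calls counter exists only to make the difference observable.

```python
import asyncio
from typing import Iterable


class BaseCollection:
    """Hypothetical collection with the default extend() behaviour."""

    def __init__(self) -> None:
        self.items: list = []
        self.append_calls = 0

    async def append(self, item) -> None:
        self.append_calls += 1
        self.items.append(item)

    async def extend(self, items: Iterable) -> None:
        # Default: one append() call per item.
        for item in items:
            await self.append(item)


class BatchCollection(BaseCollection):
    """Hypothetical override that adds all items in a single step."""

    async def extend(self, items: Iterable) -> None:
        # Bypass append() and store the whole batch at once.
        self.items.extend(items)


async def demo():
    base, batch = BaseCollection(), BatchCollection()
    await base.extend(["a", "b", "c"])
    await batch.extend(["a", "b", "c"])
    return base, batch


base, batch = asyncio.run(demo())
print(base.append_calls, batch.append_calls)
```

Both collections end up with the same contents; only the number of per-item operations differs.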

IMessageCollection

class IMessageCollection[TMessage: IMessage](
    ICollection[TMessage, MessageOrdinal],
    Protocol
)

Collection interface for conversation messages. Messages are identified by ordinal numbers (MessageOrdinal = int).

Type Parameters

TMessage

IMessage

The message type (e.g., ConversationMessage, TranscriptMessage).

Usage

from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage

conv = await create_conversation(
    dbname="chat.db",
    message_type=ConversationMessage
)

messages: IMessageCollection[ConversationMessage] = conv.messages

# Get collection size
count = await messages.size()
print(f"Total messages: {count}")

# Get first message
if count > 0:
    first_msg = await messages.get_item(0)
    print(f"First message: {first_msg.text_chunks[0]}")

# Get recent messages
recent = await messages.get_slice(max(0, count - 10), count)
for msg in recent:
    print(f"{msg.metadata.speaker}: {msg.text_chunks[0]}")

# Iterate all messages
async for msg in messages:
    print(f"[{msg.timestamp}] {msg.text_chunks[0]}")

Complete Example

from typeagent import create_conversation
from typeagent.knowpro.universal_message import (
    ConversationMessage,
    ConversationMessageMeta
)

# Create conversation
conv = await create_conversation(
    dbname="example.db",
    message_type=ConversationMessage
)

# Access message collection
messages = conv.messages

# Add single message
msg = ConversationMessage(
    text_chunks=["Let's discuss the roadmap."],
    metadata=ConversationMessageMeta(speaker="Alice"),
    tags=["planning"]
)
await messages.append(msg)

# Add multiple messages
new_messages = [
    ConversationMessage(
        text_chunks=["I think we should prioritize performance."],
        metadata=ConversationMessageMeta(speaker="Bob")
    ),
    ConversationMessage(
        text_chunks=["Agreed. Let's profile the hot paths first."],
        metadata=ConversationMessageMeta(speaker="Alice")
    )
]
await messages.extend(new_messages)

# Query by ordinal
msg_0 = await messages.get_item(0)
print(f"Message 0: {msg_0.text_chunks[0]}")

# Get slice
first_three = await messages.get_slice(0, 3)
for i, msg in enumerate(first_three):
    speaker = msg.metadata.speaker or "Unknown"
    print(f"{i}: [{speaker}] {msg.text_chunks[0]}")

# Get specific messages
selected = await messages.get_multiple([0, 2])
print(f"Got {len(selected)} messages")

ISemanticRefCollection

class ISemanticRefCollection(
    ICollection[SemanticRef, SemanticRefOrdinal],
    Protocol
)

Collection interface for semantic references (extracted knowledge). Semantic references are identified by ordinal numbers (SemanticRefOrdinal = int).

What are Semantic References?

Semantic references link text locations to extracted knowledge:

Entities - Named entities (people, places, things)
Actions - Verb phrases with subject/object
Topics - Subject matter categories
Tags - User-defined labels

Each semantic reference contains:

Ordinal - Unique sequential ID
Range - Text location (message and chunk ordinals)
Knowledge - The extracted knowledge object

Usage

from typeagent import create_conversation
from typeagent.knowpro.universal_message import ConversationMessage

conv = await create_conversation(
    dbname="chat.db",
    message_type=ConversationMessage
)

semrefs = conv.semantic_refs

# Get collection size
count = await semrefs.size()
print(f"Total semantic references: {count}")

# Get specific semantic reference
if count > 0:
    semref = await semrefs.get_item(0)
    print(f"Knowledge type: {semref.knowledge.knowledge_type}")
    print(f"Text range: {semref.range}")
    
    # Access knowledge details
    if hasattr(semref.knowledge, 'name'):
        print(f"Entity name: {semref.knowledge.name}")
    elif hasattr(semref.knowledge, 'text'):
        print(f"Topic text: {semref.knowledge.text}")

# Iterate all semantic references
async for semref in semrefs:
    knowledge_type = semref.knowledge.knowledge_type
    print(f"[{semref.semantic_ref_ordinal}] {knowledge_type}: {semref.knowledge}")

Example: Exploring Extracted Knowledge

from typeagent import create_conversation
from typeagent.knowpro.universal_message import (
    ConversationMessage,
    ConversationMessageMeta
)

# Create and populate conversation
conv = await create_conversation(
    dbname="knowledge_demo.db",
    message_type=ConversationMessage
)

messages = [
    ConversationMessage(
        text_chunks=["Alice visited Paris last summer."],
        metadata=ConversationMessageMeta(speaker="Bob")
    ),
    ConversationMessage(
        text_chunks=["She really enjoyed the Louvre museum."],
        metadata=ConversationMessageMeta(speaker="Bob")
    )
]

# Add messages with knowledge extraction
result = await conv.add_messages_with_indexing(messages)
print(f"Extracted {result.semrefs_added} semantic references")

# Explore extracted knowledge
semrefs = conv.semantic_refs

entity_count = 0
action_count = 0
topic_count = 0

async for semref in semrefs:
    ktype = semref.knowledge.knowledge_type
    
    if ktype == "entity":
        entity_count += 1
        entity = semref.knowledge
        print(f"Entity: {entity.name} (types: {entity.type})")
    
    elif ktype == "action":
        action_count += 1
        action = semref.knowledge
        print(f"Action: {action.verbs} - {action.subject_entity_name} -> {action.object_entity_name}")
    
    elif ktype == "topic":
        topic_count += 1
        topic = semref.knowledge
        print(f"Topic: {topic.text}")

print(f"\nSummary: {entity_count} entities, {action_count} actions, {topic_count} topics")

Example: Finding Knowledge by Range

# Assumes semrefs = conv.semantic_refs, as in the previous examples

# Get all semantic refs for a specific message
message_ordinal = 5

size = await semrefs.size()
all_semrefs = await semrefs.get_slice(0, size)

# Filter by message
message_knowledge = [
    semref for semref in all_semrefs
    if semref.range.start.message_ordinal == message_ordinal
]

print(f"Message {message_ordinal} has {len(message_knowledge)} knowledge items:")
for semref in message_knowledge:
    print(f"  - {semref.knowledge.knowledge_type}: {semref.knowledge}")
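The filter above scans every semantic reference for each lookup. If several messages need to be inspected, grouping the references once by message ordinal may be cheaper. The sketch below uses hypothetical dataclasses (Location, Range, Ref) that only mimic the range.start.message_ordinal shape used above; they are not the typeagent types.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Location:
    message_ordinal: int


@dataclass
class Range:
    start: Location


@dataclass
class Ref:
    """Hypothetical stand-in for a semantic reference."""
    range: Range
    knowledge: str


def group_by_message(refs):
    """Map each starting message ordinal to the refs that begin there."""
    grouped = defaultdict(list)
    for ref in refs:
        grouped[ref.range.start.message_ordinal].append(ref)
    return grouped


refs = [
    Ref(Range(Location(5)), "entity: Paris"),
    Ref(Range(Location(2)), "topic: travel"),
    Ref(Range(Location(5)), "action: visited"),
]
by_message = group_by_message(refs)
print([ref.knowledge for ref in by_message[5]])
```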

Implementation Classes

MemoryMessageCollection

from typeagent.storage.memory import MemoryMessageCollection

In-memory implementation of IMessageCollection. Fast, non-persistent.

MemorySemanticRefCollection

from typeagent.storage.memory import MemorySemanticRefCollection

In-memory implementation of ISemanticRefCollection. Fast, non-persistent.

SqliteMessageCollection

from typeagent.storage.sqlite import SqliteMessageCollection

SQLite-backed implementation of IMessageCollection. Persistent, transactional.

SqliteSemanticRefCollection

from typeagent.storage.sqlite import SqliteSemanticRefCollection

SQLite-backed implementation of ISemanticRefCollection. Persistent, transactional.

Best Practice: Access collections through the storage provider or conversation object rather than instantiating directly.

Type Aliases

type MessageOrdinal = int
type SemanticRefOrdinal = int

Ordinal numbers are 0-based sequential integers used to reference items in collections.

Performance Tips

Batch Operations

Use extend() instead of multiple append() calls for better performance, especially with SQLite.
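To illustrate why batching tends to help with SQLite, the sketch below compares one transaction per row with a single transaction for the whole batch, using Python's standard sqlite3 module. The messages table, its columns, and the helper names append_each and extend_batch are assumptions made for this example; they do not describe the library's actual schema or code.

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (ordinal INTEGER PRIMARY KEY, text TEXT)")

INSERT_SQL = "INSERT INTO messages (ordinal, text) VALUES (?, ?)"


def append_each(conn, start, texts):
    # One transaction (and one commit) per row.
    for offset, text in enumerate(texts):
        with conn:
            conn.execute(INSERT_SQL, (start + offset, text))


def extend_batch(conn, start, texts):
    # All rows inserted inside a single transaction with one commit.
    rows = [(start + offset, text) for offset, text in enumerate(texts)]
    with conn:
        conn.executemany(INSERT_SQL, rows)


append_each(conn, 0, ["a", "b"])
extend_batch(conn, 2, ["c", "d", "e"])

count = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
ordinals = [row[0] for row in conn.execute("SELECT ordinal FROM messages ORDER BY ordinal")]
print(count, ordinals)
```

On a file-backed database each commit typically involves a disk sync, so fewer commits usually means less time spent waiting on I/O.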

Range Queries

Use get_slice() for contiguous ranges rather than get_multiple() with sequential ordinals.

Async Iteration

For large collections, async iteration is more memory-efficient than loading all items.
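When processing needs batches rather than single items, paging with get_slice() keeps memory bounded in a similar way. The sketch below is one possible approach: iter_pages is a helper name invented for this example, and ListCollection is a hypothetical list-backed stand-in, not a typeagent class.

```python
import asyncio
from typing import AsyncIterator


class ListCollection:
    """Hypothetical list-backed stand-in; not a typeagent class."""

    def __init__(self, items):
        self._items = list(items)

    async def size(self) -> int:
        return len(self._items)

    async def get_slice(self, start: int, stop: int) -> list:
        return self._items[start:stop]


async def iter_pages(collection, page_size: int) -> AsyncIterator[list]:
    """Yield consecutive pages of at most `page_size` items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    total = await collection.size()  # read once and reuse
    for start in range(0, total, page_size):
        yield await collection.get_slice(start, min(start + page_size, total))


async def demo():
    collection = ListCollection(range(7))
    return [page async for page in iter_pages(collection, 3)]


pages = asyncio.run(demo())
print(pages)
```

Only one page is held at a time inside iter_pages; the demo collects all pages into a list purely to show the result.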

Size Checks

Cache the result of size() if you need it multiple times in a transaction.

Related

Storage Providers - Access to collections
ConversationMessage - Message types
SemanticRef - Semantic reference structure
Index Types - Knowledge indexing
