AgentOS is built on a polyglot architecture where components written in Rust, TypeScript, and Python communicate through the iii-engine bus. Every component is a worker that registers functions - no frameworks, no vendor lock-in.

Architecture Overview

┌──────────────────────────────────────────────────────────────┐
│                        iii-engine                            │
│              Worker · Function · Trigger                     │
├──────────┬───────────┬───────────┬───────────┬───────────────┤
│ agent    │ security  │    llm    │  memory   │     wasm      │
│ core     │  rbac     │  router   │  store    │    sandbox    │
│ workflow │  audit    │ 25 LLMs   │  session  │   (wasmtime)  │
│ api      │  taint    │  catalog  │  recall   │    (Rust)     │
│ hand     │  sign     │   (Rust)  │  (Rust)   │               │
│ (Rust)   │  (Rust)   │           │           │               │
├──────────┴───────────┴───────────┴───────────┴───────────────┤
│                   Control Plane (Rust)                       │
│  realm · hierarchy · directive · mission · ledger            │
│  council · pulse · bridge (8 crates, 45 functions)           │
├──────────────────────────────────────────────────────────────┤
│  api · workflows · tools(60+) · skills · channels · hooks    │
│  approval · streaming · mcp · a2a · vault · browser · swarm  │
│  knowledge-graph · session-replay · skillkit · tool-profiles │
│                      (TypeScript)                            │
├──────────────────────────────────────────────────────────────┤
│                    embedding (Python)                        │
├──────────────────────────────────────────────────────────────┤
│     CLI (Rust)              TUI (Rust/ratatui)               │
└──────────────────────────────────────────────────────────────┘

Design Principles

Polyglot by Design

Use the right language for each task - Rust for performance, TypeScript for iteration speed, Python for ML.

No Frameworks

Every capability is a plain function. No magic, no vendor lock-in.

Trigger-Based Communication

Components call each other via trigger(). Language doesn’t matter.

Hot-Swappable

Replace any component without touching others. Just register new functions.

The iii-engine Bus

At the center is the iii-engine - a WebSocket-based message bus that:
  1. Accepts connections from workers (any language)
  2. Stores function registry (all registered functions)
  3. Routes invocations to the correct worker
  4. Manages modules (state, queue, pubsub, cron, HTTP API)
Every component connects to the iii-engine over WebSocket at ws://localhost:49134.
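The registry-and-routing behavior can be illustrated with a toy in-memory model. This is a sketch only: the real engine is a Rust WebSocket bus, and the class and method names here are hypothetical.

```python
# Toy model of the iii-engine's function registry and routing.
# Illustrative only: the real bus speaks WebSocket and runs in Rust.

class Bus:
    def __init__(self):
        self.registry = {}  # function id -> (worker_name, handler)

    def register(self, worker, fn_id, handler):
        """A worker registers a function under a namespaced id."""
        self.registry[fn_id] = (worker, handler)

    def trigger(self, fn_id, payload):
        """Route an invocation to whichever worker registered fn_id."""
        if fn_id not in self.registry:
            raise KeyError(f"no worker registered for {fn_id}")
        _worker, handler = self.registry[fn_id]
        return handler(payload)

bus = Bus()
bus.register("tools", "web::search", lambda p: {"results": [p["query"]]})
result = bus.trigger("web::search", {"query": "agentos"})
```

The point of the model: callers only know function ids like `web::search`, never which worker (or language) serves them.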

iii-engine Modules

The engine provides core services via modules:
# From config.yaml:1-68
port: 49134

modules:
  # HTTP REST API (port 3111)
  - class: modules::api::RestApiModule
    config:
      port: 3111
      host: 0.0.0.0
      concurrency_request_limit: 2048

  # State storage (file-based KV store)
  - class: modules::state::StateModule
    config:
      adapter:
        class: modules::state::adapters::KvStore
        config:
          store_method: file_based
          file_path: ./data/state

  # WebSocket streams (port 3112)
  - class: modules::stream::StreamModule
    config:
      port: 3112

  # Task queue
  - class: modules::queue::QueueModule
    config:
      adapter:
        class: modules::queue::BuiltinQueueAdapter

  # Pub/Sub messaging
  - class: modules::pubsub::PubSubModule
    config:
      adapter:
        class: modules::pubsub::LocalAdapter

  # Cron scheduler
  - class: modules::cron::CronModule
    config:
      adapter:
        class: modules::cron::KvCronAdapter

  # Key-value store
  - class: modules::kv_server::KvServer
    config:
      store_method: file_based
      file_path: ./data/kv
      save_interval_ms: 5000

  # Observability (OpenTelemetry)
  - class: modules::observability::OtelModule
    config:
      enabled: true
      exporter: memory
      metrics_enabled: true
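Conceptually, the state and KV modules above are file-backed key-value stores with periodic persistence (`save_interval_ms: 5000`). A minimal sketch of that idea, hypothetical and not the engine's actual adapter code:

```python
import json
import os
import tempfile
import time

class FileKvStore:
    """Toy file-based KV store mirroring the config above: writes are
    kept in memory and flushed to file_path on an interval."""

    def __init__(self, file_path, save_interval_ms=5000):
        self.file_path = file_path
        self.save_interval = save_interval_ms / 1000
        self.data = {}
        self.last_save = time.monotonic()
        if os.path.exists(file_path):
            with open(file_path) as f:
                self.data = json.load(f)

    def set(self, key, value):
        self.data[key] = value
        # Flush lazily, at most once per save interval.
        if time.monotonic() - self.last_save >= self.save_interval:
            self.flush()

    def get(self, key, default=None):
        return self.data.get(key, default)

    def flush(self):
        with open(self.file_path, "w") as f:
            json.dump(self.data, f)
        self.last_save = time.monotonic()

store = FileKvStore(os.path.join(tempfile.mkdtemp(), "kv.json"))
store.set("session:1", {"user": "alice"})
store.flush()
```

Because workers keep no state of their own, everything durable lives behind modules like this one.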

Architecture Layers

Layer 1: Hot Path (Rust)

Performance-critical operations run in Rust:
| Crate | LOC | Purpose | Key Functions |
|---|---|---|---|
| agent-core | 320 | ReAct agent loop | agent::chat, agent::create, agent::list_tools |
| memory | 840 | Session/episodic memory | memory::store, memory::recall, memory::consolidate |
| llm-router | 320 | 25 LLM providers | llm::route, llm::complete |
| security | 700 | RBAC, audit, taint | security::check_capability, security::scan_injection |
| wasm-sandbox | 180 | WASM execution | wasm::execute |
Why Rust? Low latency, high throughput, memory safety. The hot path handles every agent invocation.

Layer 2: Control Plane (Rust)

Multi-tenant orchestration layer:
| Crate | LOC | Purpose | Endpoints |
|---|---|---|---|
| realm | 280 | Multi-tenant isolation | 7 REST |
| hierarchy | 250 | Agent org structure | 5 REST |
| directive | 280 | Goal alignment | 5 REST |
| mission | 350 | Task lifecycle | 7 REST |
| ledger | 300 | Budget enforcement | 4 REST + 1 PubSub |
| council | 450 | Governance | 6 REST + 1 PubSub |
| pulse | 250 | Scheduled invocation | 4 REST |
| bridge | 300 | External runtimes | 5 REST |
Why Rust? Reliability, strong typing, predictable performance for orchestration.

Layer 3: Application (TypeScript)

Rapid iteration and integrations:
| Worker | Purpose | Key Functions |
|---|---|---|
| api.ts | OpenAI-compatible API | api::chat_completions |
| agent-core.ts | TS agent loop | agent::chat |
| tools.ts | 22 built-in tools | file::read, web::search, shell::exec |
| tools-extended.ts | 38 extended tools | schedule::*, media::*, data::* |
| swarm.ts | Multi-agent swarms | swarm::create, swarm::coordinate |
| knowledge-graph.ts | Entity-relation graph | kg::add, kg::query, kg::visualize |
| session-replay.ts | Session recording | replay::record, replay::get, replay::summary |
| vault.ts | Encrypted secrets | vault::set, vault::get, vault::list |
| browser.ts | Headless browser | browser::navigate, browser::screenshot |
| channels/*.ts | 40 channel adapters | Slack, Discord, Telegram, WhatsApp, etc. |
Why TypeScript? Fast iteration, rich ecosystem, excellent tooling.

Layer 4: ML (Python)

Machine learning workloads:
# From workers/embedding/main.py
from iii_sdk import III  # SDK import path assumed
from sentence_transformers import SentenceTransformer

iii = III("ws://localhost:49134", worker_name="embedding")
model = SentenceTransformer("all-MiniLM-L6-v2")  # load once, not per request

@iii.function(id="embedding::generate", description="Generate text embeddings")
async def generate_embedding(input):
    text = input.get("text", "")
    embedding = model.encode([text], normalize_embeddings=True)[0]
    return {"embedding": embedding.tolist(), "dim": len(embedding)}
Why Python? Best ML ecosystem (transformers, sentence-transformers, numpy).

Communication Flow

All components communicate via trigger(), regardless of language.
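The wire format is not shown on this page; as an illustration, a trigger invocation can be thought of as a small JSON envelope sent over the WebSocket. The field names below are hypothetical, not the engine's documented protocol.

```python
import json

def make_trigger_envelope(fn_id, payload, call_id):
    """Hypothetical shape of a trigger message on the bus;
    the engine's real protocol may differ."""
    return json.dumps({
        "type": "trigger",
        "id": call_id,
        "function": fn_id,   # e.g. "web::search" or "embedding::generate"
        "payload": payload,
    })

msg = make_trigger_envelope("embedding::generate", {"text": "hello"}, "call-1")
decoded = json.loads(msg)
```

Because the envelope is plain JSON over WebSocket, any language with a WebSocket client can register functions and trigger others.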

Real Code Example: Cross-Language Calls

Here’s actual code showing a Rust agent dispatching to TypeScript and Python functions:
// From crates/agent-core/src/main.rs:199-213
// Rust agent calls TypeScript tool
for tc in &calls {
    // tc.id might be "web::search" (TypeScript)
    // or "embedding::generate" (Python)
    match iii.trigger(&tc.id, tc.arguments.clone()).await {
        Ok(result) => {
            tool_results.push(json!({
                "toolCallId": tc.call_id,
                "output": result,
            }));
        }
        Err(e) => {
            tool_results.push(json!({
                "toolCallId": tc.call_id,
                "output": { "error": e.to_string() },
            }));
        }
    }
}
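The same fan-out loop, sketched in Python with a stubbed trigger so it runs standalone; in a real worker the call would go through the iii-sdk instead of the local stub.

```python
# Sketch of the Rust loop above in Python. `trigger` is stubbed so the
# example is self-contained; in a worker it would cross the bus.

def trigger(fn_id, args):
    if fn_id == "web::search":
        return {"results": ["..."]}
    raise RuntimeError(f"unknown function {fn_id}")

def run_tool_calls(calls):
    tool_results = []
    for tc in calls:
        try:
            output = trigger(tc["id"], tc["args"])
        except Exception as e:
            # Errors become structured results, not crashes,
            # mirroring the Err(e) arm in the Rust code.
            output = {"error": str(e)}
        tool_results.append({"toolCallId": tc["call_id"], "output": output})
    return tool_results

results = run_tool_calls([
    {"id": "web::search", "args": {"query": "x"}, "call_id": "c1"},
    {"id": "missing::fn", "args": {}, "call_id": "c2"},
])
```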

Complete Request Flow

Here’s what happens when a user sends a chat message:
1. HTTP Request

Client sends POST /v1/chat/completions to port 3111:
{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}
2. API Worker (TypeScript)

api.ts receives the HTTP trigger, validates the request, and calls agent::chat:
const response = await trigger("agent::chat", {
  agentId: "default",
  message: "Hello"
});
3. Agent Core (Rust)

agent-core orchestrates the request:
// Security scan
let scan = iii.trigger("security::scan_injection", json!({"text": msg})).await?;

// Recall memories
let memories = iii.trigger("memory::recall", json!({"agentId": id})).await?;

// Route to model
let model = iii.trigger("llm::route", json!({"message": msg})).await?;

// Call LLM
let response = iii.trigger("llm::complete", json!({model, messages})).await?;
4. Tool Execution (TypeScript/Python)

If the LLM requests tools, the agent invokes them:
for tool_call in &response.tool_calls {
    let result = iii.trigger(&tool_call.id, tool_call.args.clone()).await?;
}
Tools might be TypeScript (web::search) or Python (embedding::generate).
5. Memory Storage (Rust)

The conversation turn is stored in memory:
iii.trigger_void("memory::store", json!({
    "agentId": id,
    "role": "assistant",
    "content": response.content
}));
6. HTTP Response

api.ts returns an OpenAI-compatible response:
{
  "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}],
  "usage": {"prompt_tokens": 10, "completion_tokens": 5}
}

Scalability

Horizontal Scaling

Multiple Workers

Run multiple instances of the same worker for load balancing

Language Independence

Scale TypeScript workers separately from Rust workers

Queue-Based

Use queue triggers for task distribution across workers

Stateless Workers

All state lives in iii-engine modules; workers themselves are ephemeral
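When several instances register the same function id, invocations can be balanced among them. A toy round-robin dispatcher illustrates the idea; the engine's actual balancing strategy is not documented here.

```python
import itertools

class RoundRobinBus:
    """Toy dispatcher: multiple worker instances register the same
    function id, and invocations rotate across them."""

    def __init__(self):
        self.handlers = {}  # fn_id -> list of handlers
        self.cursors = {}   # fn_id -> cycling iterator over handlers

    def register(self, fn_id, handler):
        self.handlers.setdefault(fn_id, []).append(handler)
        # Rebuild the cycle so new instances join the rotation.
        self.cursors[fn_id] = itertools.cycle(self.handlers[fn_id])

    def trigger(self, fn_id, payload):
        return next(self.cursors[fn_id])(payload)

bus = RoundRobinBus()
bus.register("embedding::generate", lambda p: "instance-a")
bus.register("embedding::generate", lambda p: "instance-b")
hits = [bus.trigger("embedding::generate", {}) for _ in range(4)]
```

Because workers hold no state, adding or removing instances changes only the rotation, never correctness.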

Starting Workers

# CLI manages all workers
agentos start

Project Structure

agentos/
├── config.yaml              # iii-engine configuration
├── Cargo.toml               # Rust workspace
├── package.json             # Node.js dependencies

├── crates/                  # Rust workers (18 crates)
│   ├── agent-core/          # ReAct agent loop (Rust)
│   ├── security/            # RBAC, audit, taint (Rust)
│   ├── memory/              # Session memory (Rust)
│   ├── llm-router/          # 25 LLM providers (Rust)
│   ├── wasm-sandbox/        # WASM execution (Rust)
│   ├── realm/               # Multi-tenant isolation
│   ├── mission/             # Task lifecycle
│   └── ...                  # 11 more Rust crates

├── src/                     # TypeScript workers (39 files)
│   ├── api.ts               # OpenAI API (TypeScript)
│   ├── agent-core.ts        # Agent loop (TypeScript)
│   ├── tools.ts             # 22 built-in tools
│   ├── tools-extended.ts    # 38 extended tools
│   ├── swarm.ts             # Multi-agent swarms
│   ├── knowledge-graph.ts   # Entity-relation graph
│   ├── session-replay.ts    # Session recording
│   ├── channels/            # 40 channel adapters
│   └── ...                  # 30+ more TypeScript workers

├── workers/                 # Python workers
│   └── embedding/
│       └── main.py          # Text embeddings (Python)

├── agents/                  # 45 agent templates
├── hands/                   # 7 autonomous hands
└── integrations/            # 25 MCP integrations

Technology Stack

Languages & Frameworks

| Layer | Language | Runtime | Key Libraries |
|---|---|---|---|
| Hot Path | Rust | Native | tokio, serde, iii-sdk |
| Control Plane | Rust | Native | tokio, serde, iii-sdk |
| Application | TypeScript | Node.js 20+ | iii-sdk, tsx |
| ML | Python | 3.11+ | iii-sdk, sentence-transformers |

iii-engine

| Component | Technology |
|---|---|
| Message Bus | WebSocket (port 49134) |
| HTTP API | REST (port 3111) |
| Streams | WebSocket (port 3112) |
| State | File-based KV store |
| Queue | Built-in adapter |
| PubSub | Local adapter |
| Cron | KV-based scheduler |

Testing

AgentOS has 2,506 tests across all languages:
# TypeScript tests (1,439 tests)
npx vitest --run

# Rust tests (906 tests)
cargo test --workspace

# Python tests (161 tests)
python3 -m pytest

Test Coverage by Layer

| Layer | Tests | Files |
|---|---|---|
| TypeScript | 1,439 | 48 |
| Rust | 906 | 10 crates |
| Python | 161 | 3 |

Observability

All workers report metrics to the observability module:
# From config.yaml:54-68
modules:
  - class: modules::observability::OtelModule
    config:
      enabled: true
      exporter: memory
      metrics_enabled: true
      logs_enabled: true
      alerts:
        - name: high-error-rate
          metric: iii.invocations.error
          threshold: 10
          operator: ">"
          window_seconds: 60
          action:
            type: log

Key Metrics

  • iii.invocations.total - Total function invocations
  • iii.invocations.error - Failed invocations
  • function_call_duration_ms - Latency histogram
  • tokens_used_total - LLM token usage
  • active_sessions - Concurrent chat sessions
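The high-error-rate alert above fires when iii.invocations.error exceeds 10 within a 60-second window. A minimal sliding-window check, illustrative only and not the OtelModule's code:

```python
from collections import deque

class ThresholdAlert:
    """Counts metric events in a sliding window and reports breaches,
    mirroring the high-error-rate alert config (threshold 10, 60s)."""

    def __init__(self, threshold=10, window_seconds=60):
        self.threshold = threshold
        self.window = window_seconds
        self.events = deque()  # timestamps of error events

    def record(self, timestamp):
        self.events.append(timestamp)

    def firing(self, now):
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) > self.threshold

alert = ThresholdAlert()
for t in range(11):  # 11 errors within 11 seconds
    alert.record(t)
```

With `action: type: log`, a breach is written to the logs rather than paging anyone.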

Benefits of This Architecture

Language Flexibility

Use Rust for hot path, TypeScript for APIs, Python for ML - all seamlessly integrated

No Vendor Lock-in

Every function is plain code. Switch components without rewriting the system.

Hot-Swappable

Replace any worker at runtime. Register new functions without downtime.

Testable

Every function can be tested independently. 2,506 tests prove it works.

Scalable

Scale individual workers based on load. Add more workers dynamically.

Observable

OpenTelemetry metrics, alerts, and logs for full system visibility.

Next Steps

Workers

Learn how to create workers in Rust, TypeScript, and Python

Functions

Deep dive into function registration and invocation

Triggers

Explore HTTP, queue, cron, and pubsub triggers
