Self-improvement allows agents to analyze feedback patterns and propose updates to their own system instructions, creating a continuous improvement loop.
Self-improvement is currently in Phase 3 development. The architecture is designed but not yet fully implemented.

How It Works

Self-improvement builds on the learning system’s feedback and decision logging:
1. Collect Feedback

Track user feedback signals:
  • Thumbs up/down on responses
  • Regeneration requests
  • Explicit corrections
  • Behavioral signals (task completion, user satisfaction)
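These signals can be captured in a simple record. The sketch below is illustrative only: `FeedbackRecord` and its fields are hypothetical, not the shipped Agno schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FeedbackRecord:
    """Hypothetical container for one feedback signal (not an Agno API)."""
    run_id: str
    rating: Optional[str] = None      # "positive" / "negative" from thumbs up/down
    reason: Optional[str] = None      # free-text explanation
    correction: Optional[str] = None  # explicit user correction, if any
    regenerated: bool = False         # user asked for a regeneration
    task_completed: bool = True       # behavioral success signal

# Example: a thumbs-down that also triggered a regeneration
fb = FeedbackRecord(run_id="run_1", rating="negative", regenerated=True)
```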
2. Identify Patterns

Analyze feedback to find improvement opportunities:
  • Recurring failures or errors
  • Consistent user corrections
  • Successful interaction patterns
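As a rough sketch of this analysis, recurring reasons behind negative ratings can be counted and thresholded. The log format and the threshold of three are assumptions for illustration, not the planned implementation.

```python
from collections import Counter

# Hypothetical feedback log: (rating, reason) pairs collected over many runs
feedback_log = [
    ("negative", "too verbose"),
    ("negative", "too verbose"),
    ("positive", None),
    ("negative", "missed the requested format"),
    ("negative", "too verbose"),
]

# Count recurring reasons behind negative ratings
recurring = Counter(
    reason for rating, reason in feedback_log
    if rating == "negative" and reason
)

# Reasons seen at least 3 times become improvement candidates
candidates = [reason for reason, n in recurring.items() if n >= 3]
print(candidates)  # ['too verbose']
```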
3. Propose Changes

Generate instruction updates based on patterns:
  • Add clarifications for common mistakes
  • Update response style based on user preferences
  • Refine tool usage guidelines
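A proposal can be sketched as a small structured object awaiting review. Here `propose_update` is a hypothetical stand-in for the LLM-driven generation planned for Phase 3.

```python
# Hypothetical mapping from a recurring feedback pattern to a proposed
# instruction update (the real Phase 3 design would generate this with an LLM)
def propose_update(pattern: str) -> dict:
    """Turn a recurring complaint into a draft instruction change."""
    return {
        "pattern": pattern,
        "proposed_instruction": f"Avoid responses that users flag as: {pattern}.",
        "status": "pending_review",  # HITL: nothing applies without approval
    }

proposal = propose_update("too verbose")
print(proposal["status"])  # pending_review
```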
4. Human Review

All changes require human approval (HITL mode):
from agno.learn import SelfImprovementConfig, LearningMode

config = SelfImprovementConfig(
    mode=LearningMode.HITL,  # Human approval required
)

Configuration (Planned)

When available, self-improvement will be configured as part of LearningMachine:
from agno.agent import Agent
from agno.learn import (
    LearningMachine,
    SelfImprovementConfig,
    FeedbackConfig,
    LearningMode,
)
from agno.models.openai import OpenAIResponses
from agno.db.sqlite import SqliteDb

agent = Agent(
    name="Self-Improving Agent",
    model=OpenAIResponses(id="gpt-5.2"),
    db=SqliteDb(db_file="tmp/agent.db"),
    learning=LearningMachine(
        # Collect feedback signals
        feedback=FeedbackConfig(
            mode=LearningMode.ALWAYS,
        ),
        # Propose improvements with human approval
        self_improvement=SelfImprovementConfig(
            mode=LearningMode.HITL,
        ),
    ),
)

Feedback Types

The system will track multiple feedback signals:

Explicit Feedback

# User provides direct feedback
response = agent.run("Write a product description")

# Thumbs down - indicates poor response
agent.record_feedback(
    run_id=response.run_id,
    rating="negative",
    reason="Too verbose, needs to be more concise",
)

Implicit Feedback

# Behavioral signals
feedback_signals = {
    "regenerated": True,        # User asked for regeneration
    "edited_output": True,      # User edited the response
    "task_completed": False,    # Task was not successful
}
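One way to use these signals is to collapse them into an implicit rating. The scoring function below is a sketch; the weights and thresholds are assumptions, not the planned behavior.

```python
# Hypothetical scoring: collapse behavioral signals into one implicit rating
def implicit_rating(signals: dict) -> str:
    """Weigh behavioral signals; weights and cutoff are illustrative only."""
    score = 0
    if signals.get("regenerated"):
        score -= 1  # regeneration suggests the first answer missed
    if signals.get("edited_output"):
        score -= 1  # heavy edits suggest the output needed rework
    if not signals.get("task_completed", True):
        score -= 2  # failure to complete is the strongest negative signal
    return "negative" if score <= -2 else "neutral"

signals = {"regenerated": True, "edited_output": True, "task_completed": False}
print(implicit_rating(signals))  # negative
```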

Correction Feedback

# User provides a correction
agent.record_feedback(
    run_id=response.run_id,
    rating="negative",
    correction="Product descriptions should focus on benefits, not features",
)

Improvement Workflow

# Track feedback over time
for interaction in user_interactions:
    response = agent.run(interaction.message)
    
    # Record feedback
    agent.record_feedback(
        run_id=response.run_id,
        rating=interaction.user_rating,
        reason=interaction.feedback_reason,
    )
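The loop above can be simulated end to end without an agent: accumulate ratings and draft a proposal once negative feedback crosses a threshold. The threshold and trigger logic are illustrative assumptions, not the shipped behavior.

```python
# Illustrative end-to-end loop: record ratings, trigger a review once
# negative feedback crosses a threshold (the threshold is an assumption)
NEGATIVE_THRESHOLD = 3
ratings = ["positive", "negative", "negative", "positive", "negative"]

pending_proposals = []
negatives = 0
for rating in ratings:
    if rating == "negative":
        negatives += 1
    if negatives == NEGATIVE_THRESHOLD:
        # Enough signal accumulated: draft a change for human review
        pending_proposals.append({"status": "pending_review"})

print(len(pending_proposals))  # 1
```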

Safety Considerations

Self-improvement requires careful safeguards:
  1. Human-in-the-Loop: All changes require human approval
  2. Version Control: Track all instruction versions
  3. Rollback Capability: Ability to revert changes
  4. Scope Limits: Prevent agents from removing safety guidelines
  5. Audit Trail: Complete log of all proposed and applied changes
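Safeguards 2, 3, and 5 can be sketched together as a versioned instruction store. `InstructionStore` and its methods are hypothetical, not an Agno API.

```python
# Minimal sketch of version control, rollback, and an audit trail
class InstructionStore:
    def __init__(self, initial: str):
        self.versions = [initial]   # full history of instruction versions
        self.audit_log = []         # who changed what, and why

    def apply(self, new_instructions: str, approved_by: str, reason: str):
        """Apply a human-approved change and record it in the audit log."""
        self.versions.append(new_instructions)
        self.audit_log.append({"by": approved_by, "reason": reason})

    def rollback(self) -> str:
        """Revert to the previous version; the audit trail is preserved."""
        if len(self.versions) > 1:
            self.versions.pop()
            self.audit_log.append({"by": "system", "reason": "rollback"})
        return self.versions[-1]

store = InstructionStore("Be concise.")
store.apply("Be concise. Lead with benefits.",
            approved_by="reviewer", reason="feedback pattern")
store.rollback()
print(store.versions[-1])  # Be concise.
```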

Architecture Design

The self-improvement configuration follows the same patterns as other learning stores:
from dataclasses import dataclass
from typing import Any, Optional, Type, Union

from agno.db.base import AsyncBaseDb, BaseDb
from agno.learn.config import LearningMode
from agno.models.base import Model

@dataclass
class SelfImprovementConfig:
    """Configuration for Self-Improvement learning type.
    
    Self-Improvement proposes updates to agent instructions based
    on feedback patterns and successful interactions.
    
    Scope: AGENT (fixed) - Stored and retrieved by agent_id.
    
    Note: Deferred to Phase 3.
    """
    
    # Required fields
    db: Optional[Union[BaseDb, AsyncBaseDb]] = None
    model: Optional[Model] = None
    
    # Mode and extraction
    mode: LearningMode = LearningMode.HITL  # Always requires human approval
    schema: Optional[Type[Any]] = None
    
    # Prompt customization
    instructions: Optional[str] = None

Benefits

Continuous Improvement

Agents improve over time based on real usage patterns

User-Driven

Changes reflect actual user needs and preferences

Safe Evolution

Human oversight ensures changes are appropriate

Audit Trail

Complete history of all changes and their reasoning

Learning Overview

Understand the full learning system

Decision Logging

Track agent decisions for analysis

Evaluations

Measure agent performance over time

Human Approval

Human-in-the-loop approval workflows
