Self-improvement allows agents to analyze feedback patterns and propose updates to their own system instructions, creating a continuous improvement loop.
Self-improvement is currently in Phase 3 development. The architecture is designed but not yet fully implemented.

How It Works

Self-improvement builds on the learning system’s feedback and decision logging:
1. Collect Feedback

Track user feedback signals:
  • Thumbs up/down on responses
  • Regeneration requests
  • Explicit corrections
  • Behavioral signals (task completion, user satisfaction)
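These signals can be captured in a simple record. The sketch below is illustrative only: `FeedbackRecord` and its fields are hypothetical, not the shipped Agno schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FeedbackRecord:
    """Hypothetical container for one feedback signal (not an Agno API)."""
    run_id: str
    rating: Optional[str] = None      # "positive" / "negative" from thumbs up/down
    reason: Optional[str] = None      # free-text explanation
    correction: Optional[str] = None  # explicit user correction, if any
    regenerated: bool = False         # user asked for a regeneration
    task_completed: bool = True       # behavioral success signal

# Example: a thumbs-down that also triggered a regeneration
fb = FeedbackRecord(run_id="run_1", rating="negative", regenerated=True)
```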
2. Identify Patterns

Analyze feedback to find improvement opportunities:
  • Recurring failures or errors
  • Consistent user corrections
  • Successful interaction patterns
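As a rough sketch of this analysis, recurring reasons behind negative ratings can be counted and thresholded. The log format and the threshold of three are assumptions for illustration, not the planned implementation.

```python
from collections import Counter

# Hypothetical feedback log: (rating, reason) pairs collected over many runs
feedback_log = [
    ("negative", "too verbose"),
    ("negative", "too verbose"),
    ("positive", None),
    ("negative", "missed the requested format"),
    ("negative", "too verbose"),
]

# Count recurring reasons behind negative ratings
recurring = Counter(
    reason for rating, reason in feedback_log
    if rating == "negative" and reason
)

# Reasons seen at least 3 times become improvement candidates
candidates = [reason for reason, n in recurring.items() if n >= 3]
print(candidates)  # ['too verbose']
```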
3. Propose Changes

Generate instruction updates based on patterns:
  • Add clarifications for common mistakes
  • Update response style based on user preferences
  • Refine tool usage guidelines
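A proposal can be sketched as a small structured object awaiting review. Here `propose_update` is a hypothetical stand-in for the LLM-driven generation planned for Phase 3.

```python
# Hypothetical mapping from a recurring feedback pattern to a proposed
# instruction update (the real Phase 3 design would generate this with an LLM)
def propose_update(pattern: str) -> dict:
    """Turn a recurring complaint into a draft instruction change."""
    return {
        "pattern": pattern,
        "proposed_instruction": f"Avoid responses that users flag as: {pattern}.",
        "status": "pending_review",  # HITL: nothing applies without approval
    }

proposal = propose_update("too verbose")
print(proposal["status"])  # pending_review
```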
4. Human Review

All changes require human approval (HITL mode):
from agno.learn import SelfImprovementConfig, LearningMode

config = SelfImprovementConfig(
    mode=LearningMode.HITL,  # Human approval required
)

Configuration (Planned)

When available, self-improvement will be configured as part of LearningMachine:
from agno.agent import Agent
from agno.learn import (
    LearningMachine,
    SelfImprovementConfig,
    FeedbackConfig,
    LearningMode,
)
from agno.models.openai import OpenAIResponses
from agno.db.sqlite import SqliteDb

agent = Agent(
    name="Self-Improving Agent",
    model=OpenAIResponses(id="gpt-5.2"),
    db=SqliteDb(db_file="tmp/agent.db"),
    learning=LearningMachine(
        # Collect feedback signals
        feedback=FeedbackConfig(
            mode=LearningMode.ALWAYS,
        ),
        # Propose improvements with human approval
        self_improvement=SelfImprovementConfig(
            mode=LearningMode.HITL,
        ),
    ),
)

Feedback Types

The system will track multiple feedback signals:

Explicit Feedback

# User provides direct feedback
response = agent.run("Write a product description")

# Thumbs down - indicates poor response
agent.record_feedback(
    run_id=response.run_id,
    rating="negative",
    reason="Too verbose, needs to be more concise",
)

Implicit Feedback

# Behavioral signals
feedback_signals = {
    "regenerated": True,        # User asked for regeneration
    "edited_output": True,      # User edited the response
    "task_completed": False,    # Task was not successful
}
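One way to use these signals is to collapse them into an implicit rating. The scoring function below is a sketch; the weights and thresholds are assumptions, not the planned behavior.

```python
# Hypothetical scoring: collapse behavioral signals into one implicit rating
def implicit_rating(signals: dict) -> str:
    """Weigh behavioral signals; weights and cutoff are illustrative only."""
    score = 0
    if signals.get("regenerated"):
        score -= 1  # regeneration suggests the first answer missed
    if signals.get("edited_output"):
        score -= 1  # heavy edits suggest the output needed rework
    if not signals.get("task_completed", True):
        score -= 2  # failure to complete is the strongest negative signal
    return "negative" if score <= -2 else "neutral"

signals = {"regenerated": True, "edited_output": True, "task_completed": False}
print(implicit_rating(signals))  # negative
```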

Correction Feedback

# User provides a correction
agent.record_feedback(
    run_id=response.run_id,
    rating="negative",
    correction="Product descriptions should focus on benefits, not features",
)

Improvement Workflow

# Track feedback over time
for interaction in user_interactions:
    response = agent.run(interaction.message)
    
    # Record feedback
    agent.record_feedback(
        run_id=response.run_id,
        rating=interaction.user_rating,
        reason=interaction.feedback_reason,
    )
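The loop above can be simulated end to end without an agent: accumulate ratings and draft a proposal once negative feedback crosses a threshold. The threshold and trigger logic are illustrative assumptions, not the shipped behavior.

```python
# Illustrative end-to-end loop: record ratings, trigger a review once
# negative feedback crosses a threshold (the threshold is an assumption)
NEGATIVE_THRESHOLD = 3
ratings = ["positive", "negative", "negative", "positive", "negative"]

pending_proposals = []
negatives = 0
for rating in ratings:
    if rating == "negative":
        negatives += 1
    if negatives == NEGATIVE_THRESHOLD:
        # Enough signal accumulated: draft a change for human review
        pending_proposals.append({"status": "pending_review"})

print(len(pending_proposals))  # 1
```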

Safety Considerations

Self-improvement requires careful safeguards:
  1. Human-in-the-Loop: All changes require human approval
  2. Version Control: Track all instruction versions
  3. Rollback Capability: Ability to revert changes
  4. Scope Limits: Prevent agents from removing safety guidelines
  5. Audit Trail: Complete log of all proposed and applied changes
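Safeguards 2, 3, and 5 can be sketched together as a versioned instruction store. `InstructionStore` and its methods are hypothetical, not an Agno API.

```python
# Minimal sketch of version control, rollback, and an audit trail
class InstructionStore:
    def __init__(self, initial: str):
        self.versions = [initial]   # full history of instruction versions
        self.audit_log = []         # who changed what, and why

    def apply(self, new_instructions: str, approved_by: str, reason: str):
        """Apply a human-approved change and record it in the audit log."""
        self.versions.append(new_instructions)
        self.audit_log.append({"by": approved_by, "reason": reason})

    def rollback(self) -> str:
        """Revert to the previous version; the audit trail is preserved."""
        if len(self.versions) > 1:
            self.versions.pop()
            self.audit_log.append({"by": "system", "reason": "rollback"})
        return self.versions[-1]

store = InstructionStore("Be concise.")
store.apply("Be concise. Lead with benefits.",
            approved_by="reviewer", reason="feedback pattern")
store.rollback()
print(store.versions[-1])  # Be concise.
```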

Architecture Design

The self-improvement configuration follows the same patterns as other learning stores:
from dataclasses import dataclass
from typing import Any, Optional, Type, Union

from agno.db.base import AsyncBaseDb, BaseDb
from agno.learn.config import LearningMode
from agno.models.base import Model

@dataclass
class SelfImprovementConfig:
    """Configuration for Self-Improvement learning type.
    
    Self-Improvement proposes updates to agent instructions based
    on feedback patterns and successful interactions.
    
    Scope: AGENT (fixed) - Stored and retrieved by agent_id.
    
    Note: Deferred to Phase 3.
    """
    
    # Required fields
    db: Optional[Union[BaseDb, AsyncBaseDb]] = None
    model: Optional[Model] = None
    
    # Mode and extraction
    mode: LearningMode = LearningMode.HITL  # Always requires human approval
    schema: Optional[Type[Any]] = None
    
    # Prompt customization
    instructions: Optional[str] = None

Benefits

Continuous Improvement

Agents improve over time based on real usage patterns

User-Driven

Changes reflect actual user needs and preferences

Safe Evolution

Human oversight ensures changes are appropriate

Audit Trail

Complete history of all changes and their reasoning

Learning Overview

Understand the full learning system

Decision Logging

Track agent decisions for analysis

Evaluations

Measure agent performance over time

Human Approval

Human-in-the-loop approval workflows
