
Overview

The Forge module is the main orchestration layer of Dream Foundry. It manages the complete lifecycle of running multiple candidate implementations, scoring them across multiple criteria, and selecting the winner.

Core Function

run_forge()

Runs the complete forge loop with all candidates.
objective
str
required
The goal for candidates to implement. This is the task description that all candidate agents will attempt to solve. Example: "Generate weekly AI events for Discord"
use_daytona
bool
default:"False"
If True, candidates run in Daytona sandboxes for isolation; otherwise they run locally via subprocess. Note: Requires the DAYTONA_API_KEY environment variable to be set. Falls back to local execution if Daytona is not configured.
log_callback
Callable[[str, str], None]
Optional callback function for streaming logs during execution. Signature: callback(candidate_id: str, message: str) -> None. This allows real-time monitoring of candidate execution progress.
Returns: ForgeResult - Complete result object with all scoring data and winner information
from forge import run_forge

# Run forge with default settings (local execution)
result = run_forge(
    objective="Generate weekly AI events for Discord"
)

print(f"Winner: {result.winner_id}")
print(f"Score: {result.winner_score}")
print(f"Total candidates: {len(result.candidates)}")
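The log_callback parameter can be used to group streamed messages by candidate. The following is a minimal sketch that relies only on the documented signature callback(candidate_id, message); the LogCollector class is a hypothetical helper and not part of forge, and the callback is invoked by hand here so the example runs without forge installed.

```python
from collections import defaultdict
from typing import DefaultDict, List


class LogCollector:
    """Hypothetical helper: groups streamed log messages by candidate ID."""

    def __init__(self) -> None:
        self.messages: DefaultDict[str, List[str]] = defaultdict(list)

    def __call__(self, candidate_id: str, message: str) -> None:
        # Matches the documented signature: callback(candidate_id, message) -> None
        self.messages[candidate_id].append(message)
        print(f"[{candidate_id}] {message}")


collector = LogCollector()

# With forge available, the collector would be passed as:
#   result = run_forge(objective="...", log_callback=collector)
# Here the callback is called directly to show its behavior.
collector("alpha", "starting")
collector("alpha", "done")
collector("beta", "starting")
```

After a run, collector.messages would hold one list of messages per candidate ID, which is convenient for showing per-candidate logs side by side.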

Data Classes

ForgeResult

Complete result of a forge run containing all execution data, scores, and winner information.
objective
str
The objective that was executed
timestamp
str
ISO timestamp when the forge run completed (e.g., "2026-01-28T14:30:00.123456")
execution_mode
str
Execution mode used: "local" or "daytona"
candidates
list
List of all candidate information including their artifacts. Each candidate contains:
  • id: Candidate identifier
  • name: Human-readable name
  • description: Description of approach
  • script: Path to implementation script
  • artifact: Generated output content (if any)
scores
list
List of score dictionaries for each candidate containing:
  • candidate_id: Identifier
  • success: Boolean indicating if execution succeeded
  • quality_score: Quality score (0-100)
  • speed_score: Speed score (0-100)
  • total_score: Weighted total score
  • quality_details: Detailed quality validation results
winner_id
str
ID of the winning candidate (highest total score)
winner_score
float
Total score of the winner
winner_artifact
str | None
The artifact content produced by the winner, or None if no artifact was generated
artifacts_dir
str
default:"artifacts"
Directory path where all artifacts are stored
from forge import run_forge

result = run_forge(objective="Generate weekly AI events for Discord")

# Access winner information
print(f"Winner: {result.winner_id}")
print(f"Score: {result.winner_score}")

# Access winner's artifact
if result.winner_artifact:
    print("\nWinner's Output:")
    print(result.winner_artifact)

# Access all scores
for score in result.scores:
    print(f"{score['candidate_id']}: {score['total_score']}")
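For readers who want a typed view of the fields listed above, the shape of the result can be approximated with a dataclass. This is a sketch reconstructed from the field list, not the actual class definition: the name ForgeResultSketch, the rank_scores helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ForgeResultSketch:
    """Hypothetical mirror of the documented ForgeResult fields."""
    objective: str
    timestamp: str                    # ISO timestamp of run completion
    execution_mode: str               # "local" or "daytona"
    candidates: list
    scores: list
    winner_id: str
    winner_score: float
    winner_artifact: Optional[str]
    artifacts_dir: str = "artifacts"  # documented default


def rank_scores(result: ForgeResultSketch) -> List[Tuple[str, float]]:
    """Hypothetical helper: (candidate_id, total_score) pairs, best first."""
    pairs = [(s["candidate_id"], s["total_score"]) for s in result.scores]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Made-up values, for illustration only.
example = ForgeResultSketch(
    objective="Generate weekly AI events for Discord",
    timestamp="2026-01-28T14:30:00.123456",
    execution_mode="local",
    candidates=[],
    scores=[
        {"candidate_id": "alpha", "total_score": 70.0},
        {"candidate_id": "gamma", "total_score": 80.0},
    ],
    winner_id="gamma",
    winner_score=80.0,
    winner_artifact=None,
)
print(rank_scores(example))
```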

Constants

CANDIDATES

List of all available candidate implementations. Each candidate takes a different approach with its own tradeoffs.
CANDIDATES = [
    {
        "id": "alpha",
        "name": "Agent Alpha",
        "description": "The Speed Demon - Fast scraper, may miss JS content",
        "script": "candidates/agent_alpha.py",
    },
    {
        "id": "beta",
        "name": "Agent Beta",
        "description": "The Perfectionist - Thorough but slow",
        "script": "candidates/agent_beta.py",
    },
    {
        "id": "gamma",
        "name": "Agent Gamma",
        "description": "The Insider - API-based, reliable",
        "script": "candidates/agent_gamma.py",
    },
    {
        "id": "delta",
        "name": "Agent Delta",
        "description": "The Crasher - Unstable, crashes frequently",
        "script": "candidates/agent_delta.py",
    },
    {
        "id": "epsilon",
        "name": "Agent Epsilon",
        "description": "The Hallucinator - Fast but unreliable data",
        "script": "candidates/agent_epsilon.py",
    },
]
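Because each entry carries a unique id, a candidate can be looked up by that key. The sketch below copies two entries from the list above so that it runs on its own; get_candidate is a hypothetical helper and is not known to exist in forge.

```python
from typing import Optional

# Two entries copied from the CANDIDATES list above, to keep the sketch self-contained.
CANDIDATES = [
    {
        "id": "alpha",
        "name": "Agent Alpha",
        "description": "The Speed Demon - Fast scraper, may miss JS content",
        "script": "candidates/agent_alpha.py",
    },
    {
        "id": "gamma",
        "name": "Agent Gamma",
        "description": "The Insider - API-based, reliable",
        "script": "candidates/agent_gamma.py",
    },
]


def get_candidate(candidate_id: str) -> Optional[dict]:
    """Hypothetical helper: return the candidate with the given id, or None."""
    return next((c for c in CANDIDATES if c["id"] == candidate_id), None)


print(get_candidate("gamma")["script"])
```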

Helper Functions

run_candidate_local()

Runs a single candidate locally via subprocess. This is an internal function used by run_forge().
candidate
dict
required
Candidate dictionary from CANDIDATES list
objective
str
required
The objective to execute
artifacts_dir
Path
required
Directory to store output artifacts
log_callback
Callable[[str, str], None]
Optional logging callback
Returns: Tuple of (produced_output, error_occurred, runtime_seconds, artifact_content, error_message)
This function has a 60-second timeout per candidate. If execution exceeds this, a timeout error is returned.
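The timeout behavior and the five-element return tuple can be illustrated with the standard library. This is a sketch under stated assumptions rather than the actual implementation: run_with_timeout is a hypothetical name, treating stdout as the artifact content is an assumption, and the 500-character cap on stderr is taken from the Error Handling section.

```python
import subprocess
import sys
import time
from typing import List, Optional, Tuple

TIMEOUT_SECONDS = 60   # documented per-candidate limit
MAX_ERROR_CHARS = 500  # documented cap on captured stderr


def run_with_timeout(
    cmd: List[str], timeout: float = TIMEOUT_SECONDS
) -> Tuple[bool, bool, float, Optional[str], Optional[str]]:
    """Run cmd and return (produced_output, error_occurred, runtime_seconds,
    artifact_content, error_message)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        runtime = time.monotonic() - start
        return False, True, runtime, None, f"Timed out after {timeout} seconds"
    runtime = time.monotonic() - start
    if proc.returncode != 0:
        # Non-zero exit: report a truncated stderr as the error message.
        return False, True, runtime, None, proc.stderr[:MAX_ERROR_CHARS]
    output = proc.stdout.strip()  # assumption: stdout stands in for the artifact
    return bool(output), False, runtime, output or None, None


# A quick command succeeds; a sleeping one hits a deliberately short timeout.
ok = run_with_timeout([sys.executable, "-c", "print('hello')"])
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5)
print(ok[0], ok[3])
print(slow[1], slow[4])
```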

Scoring Weights

The forge uses weighted scoring across three criteria:
  • Success (20%): Did the candidate produce output without errors?
  • Quality (60%): Does the output meet requirements?
    • 10+ events (2 per day Mon-Fri)
    • Daytona AI Hackathon must be included
    • Valid URLs (lu.ma, meetup, eventbrite)
    • AI-related content
    • Correct event types (hackathon, meetup, conference)
  • Speed (20%): Execution time (faster = better, max 30 seconds)
Total Score Formula:
total_score = (success_score × 0.2) + (quality_score × 0.6) + (speed_score × 0.2)
If a candidate fails (doesn’t produce output), the total score is 0 regardless of other metrics.
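The formula and the failure rule can be written out as a small function. This sketch assumes that a successful run contributes a success_score of 100 and that all component scores are on a 0-100 scale; the source does not state how success_score is derived, so treat that value as an assumption.

```python
WEIGHTS = {"success": 0.2, "quality": 0.6, "speed": 0.2}


def total_score(success: bool, quality_score: float, speed_score: float) -> float:
    """Weighted total on a 0-100 scale; a failed candidate scores 0."""
    if not success:
        return 0.0
    success_score = 100.0  # assumption: success contributes a full 100
    return round(
        success_score * WEIGHTS["success"]
        + quality_score * WEIGHTS["quality"]
        + speed_score * WEIGHTS["speed"],
        2,
    )


print(total_score(True, 80.0, 50.0))     # 20 + 48 + 10 = 78.0
print(total_score(False, 100.0, 100.0))  # 0.0, failure overrides other metrics
```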

Artifacts

The forge produces several artifacts in the artifacts/ directory:
artifacts/
├── alpha/
│   └── discord_post.md    # Agent Alpha's output
├── beta/
│   └── discord_post.md    # Agent Beta's output
├── gamma/
│   └── discord_post.md    # Agent Gamma's output
├── delta/
│   └── discord_post.md    # Agent Delta's output
├── epsilon/
│   └── discord_post.md    # Agent Epsilon's output
├── scores.json            # Complete scoring data
└── winner.txt             # Winner's candidate ID

scores.json Structure

{
  "objective": "Generate weekly AI events for Discord",
  "timestamp": "2026-01-28T14:30:00.123456",
  "execution_mode": "local",
  "candidates": [
    {
      "candidate_id": "alpha",
      "success": true,
      "quality_score": 85.0,
      "speed_score": 95.0,
      "total_score": 88.0,
      "quality_details": {
        "passed": true,
        "total_events": 14,
        "valid_events": 12,
        "has_daytona_event": true,
        "details": "All 12 events passed strict validation!"
      }
    }
  ],
  "winner": "gamma",
  "winner_score": 92.5
}
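A scores.json file with this structure can be read back to produce a ranking. The following sketch uses only the keys shown above ("candidates", "candidate_id", "total_score"); load_ranking is a hypothetical helper, and the demo writes made-up scores to a temporary directory rather than reading a real run.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Tuple


def load_ranking(artifacts_dir: Path) -> List[Tuple[str, float]]:
    """Read scores.json and return (candidate_id, total_score) pairs, best first."""
    text = (Path(artifacts_dir) / "scores.json").read_text(encoding="utf-8")
    data = json.loads(text)
    pairs = [(c["candidate_id"], c["total_score"]) for c in data["candidates"]]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Demo against a throwaway directory with made-up scores.
with tempfile.TemporaryDirectory() as tmp:
    sample = {
        "candidates": [
            {"candidate_id": "alpha", "total_score": 70.0},
            {"candidate_id": "gamma", "total_score": 80.0},
        ]
    }
    (Path(tmp) / "scores.json").write_text(json.dumps(sample), encoding="utf-8")
    ranking = load_ranking(Path(tmp))

print(ranking)
```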

Error Handling

The forge captures and reports the following failure modes:
  • Subprocess errors: Captured via stderr, limited to 500 characters
  • Timeouts: 60-second limit per candidate
  • Sentry integration: Errors automatically sent to Sentry (if configured)
  • Missing output: Detected and scored as failure
from forge import run_forge

result = run_forge(objective="Generate weekly AI events for Discord")

# Check for failures
for score in result.scores:
    if not score['success']:
        print(f"{score['candidate_id']} failed:")
        if score.get('error_message'):
            print(f"  Error: {score['error_message']}")

Integration with Sentry

Optional Sentry integration for error monitoring:
  1. Set SENTRY_DSN environment variable
  2. Install sentry-sdk: pip install sentry-sdk
  3. Errors are automatically captured during execution
import os
os.environ['SENTRY_DSN'] = 'https://<public-key>@<sentry-host>/<project-id>'  # placeholder; use your project's DSN

from forge import run_forge

# Errors will be sent to Sentry
result = run_forge(objective="Generate weekly AI events for Discord")
Sentry is completely optional. The forge works without it; when configured, Sentry provides enhanced error tracking.
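An optional integration of this kind is commonly wired up with a guarded initialization. The sketch below shows that general pattern and is not taken from the forge source: init_sentry_if_configured is a hypothetical name, and the only library call used is sentry_sdk.init(dsn=...).

```python
import os


def init_sentry_if_configured() -> bool:
    """Initialize Sentry only when SENTRY_DSN is set and sentry-sdk is importable.

    Returns True if Sentry was initialized, False otherwise.
    """
    dsn = os.environ.get("SENTRY_DSN")
    if not dsn:
        return False  # not configured: run without error reporting
    try:
        import sentry_sdk  # optional dependency
    except ImportError:
        return False  # SDK not installed: run without error reporting
    sentry_sdk.init(dsn=dsn)
    return True
```

With this pattern, a missing DSN or a missing package both degrade to a run without error reporting instead of raising.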

CLI Usage

The forge can also be run from the command line:
# Basic usage
python forge.py "Generate weekly AI events for Discord"

# With flag syntax
python forge.py --objective "Weekly SF AI events"

# Run in Daytona sandboxes
python forge.py --daytona "Generate weekly AI events for Discord"

# Short flag
python forge.py -d -o "Weekly SF AI events"
If no objective is provided, the default objective is used:
python forge.py
# Uses: "Generate weekly 'AI Events in San Francisco' post for Discord"
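The invocations above suggest a parser that accepts the objective either positionally or via -o/--objective, plus a -d/--daytona switch. The following argparse sketch reproduces that behavior under those assumptions; parse_cli and the internal argument names are hypothetical, and the real forge.py may parse its arguments differently.

```python
import argparse
from typing import List, Optional, Tuple

DEFAULT_OBJECTIVE = "Generate weekly 'AI Events in San Francisco' post for Discord"


def parse_cli(argv: Optional[List[str]] = None) -> Tuple[str, bool]:
    """Return (objective, use_daytona) for the invocations shown above."""
    parser = argparse.ArgumentParser(prog="forge.py")
    parser.add_argument("objective_arg", nargs="?",
                        help="objective given as a positional argument")
    parser.add_argument("-o", "--objective", dest="objective_flag",
                        help="objective given as a flag")
    parser.add_argument("-d", "--daytona", action="store_true",
                        help="run candidates in Daytona sandboxes")
    args = parser.parse_args(argv)
    # Assumption: the flag takes precedence over the positional form.
    objective = args.objective_flag or args.objective_arg or DEFAULT_OBJECTIVE
    return objective, args.daytona


print(parse_cli(["-d", "-o", "Weekly SF AI events"]))
print(parse_cli([]))
```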
