The forge loop is the core of Dream Foundry. It runs multiple AI agent implementations against the same objective, scores them, and selects the best approach.
Prerequisites
Before running the forge, ensure you have:
- Python 3.11+
- Required dependencies installed
- Optional: Daytona API key for sandbox execution
- Optional: Sentry DSN for error monitoring
The forge works both locally and in Daytona sandboxes. Local execution is faster for development, while Daytona provides true isolation.
Command Line Interface
The forge is available through forge.py in the project root.
Basic Usage
Run with the default objective:

```
python forge.py
```

Uses the default objective: “Generate weekly ‘AI Events in San Francisco’ post for Discord”.

Run with a custom objective:

```
python forge.py "Create a weekly newsletter about AI events"
```

Or use the --objective flag:

```
python forge.py --objective "Weekly SF AI events"
```

Run in Daytona sandboxes:

```
python forge.py --daytona
```

Requires DAYTONA_API_KEY in your environment.
Command Options
| Option | Short | Description |
|---|---|---|
| `--objective` | `-o` | Specify the objective for agents to implement |
| `--daytona` | `-d` | Run candidates in Daytona sandboxes |
| `--help` | `-h` | Show help message |
Example Output
When you run the forge, you’ll see output like this:
```
============================================================
DREAM FOUNDRY - Forge Loop
============================================================
Objective: Generate weekly AI events for Discord
Candidates: 5
Execution: LOCAL

PHASE: Running Candidates
----------------------------------------
[Agent Alpha]
  [Run] Starting Agent Alpha...
  [Done] Output produced
  [Done] 4.2s
[Agent Beta]
  [Run] Starting Agent Beta...
  [Done] Output produced
  [Done] 12.8s
[Agent Gamma]
  [Run] Starting Agent Gamma...
  [Done] Output produced
  [Done] 3.1s

============================================================
PHASE: Scoring
----------------------------------------
Candidate   Success   Quality   Speed   TOTAL
----------------------------------------------
alpha       PASS      45.0      72.0    62.4
beta        PASS      88.5      57.3    79.7
gamma       PASS      95.2      89.7    92.8

============================================================
WINNER: gamma (score: 92.8)
============================================================
```
Streamlit UI
For a visual, step-by-step experience, use the Streamlit interface.
Starting the UI
The UI will open at http://localhost:8501
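Assuming the Streamlit app lives in app.py (the file referenced in the Live Execution Logs snippet below), the standard Streamlit launch command is:

```shell
streamlit run app.py
```
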
UI Phases
The Streamlit interface walks through all 5 phases of the Dream Foundry:
1. Dreamcatcher: Input your objective and constraints
2. Dream Factory: View generated candidate approaches
3. Arena: Watch agents compete in real-time with live logs
4. Podium: See CodeRabbit polish the top 3 candidates
5. Awakening: Experience the winner with ElevenLabs narration
Live Execution Logs
The Arena phase shows real-time logs from each candidate:
```python
# From app.py:304-308
def log_callback(candidate_id: str, message: str):
    if candidate_id in logs:
        logs[candidate_id].append(message)
        log_containers[candidate_id].code("\n".join(logs[candidate_id][-8:]))
```
This callback streams execution progress directly to the UI.
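The same pattern can be sketched without Streamlit; here `FakeContainer` is an illustrative stand-in for the Streamlit container whose `.code()` method app.py calls:

```python
# Framework-free sketch of the Arena logging pattern.
# FakeContainer stands in for a Streamlit container (illustrative only).
class FakeContainer:
    def __init__(self):
        self.text = ""

    def code(self, text: str):
        self.text = text

logs = {"alpha": []}
log_containers = {"alpha": FakeContainer()}

def log_callback(candidate_id: str, message: str):
    if candidate_id in logs:
        logs[candidate_id].append(message)
        # Re-render only the last 8 lines so the display stays compact.
        log_containers[candidate_id].code("\n".join(logs[candidate_id][-8:]))

for i in range(10):
    log_callback("alpha", f"step {i}")
```

After ten messages, the container shows only the last eight lines, while the full history stays in `logs`.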
Understanding Execution Modes
Local Execution
In local mode (default), candidates run as subprocesses on your machine:
```python
# From forge.py:154-160
result = subprocess.run(
    [sys.executable, script_path, objective, str(output_file)],
    capture_output=True,
    text=True,
    timeout=60,
    cwd=os.path.dirname(os.path.abspath(__file__)),
)
```
Advantages:

- Fast startup (no sandbox creation)
- Easy debugging
- No API dependencies

Disadvantages:

- No isolation between candidates
- Shares system resources
- Can’t test different environments
Daytona Execution
With --daytona, each candidate runs in an isolated sandbox:
```python
# From forge.py:241-246
if execution_mode == "daytona":
    daytona_results = run_all_candidates_in_daytona(
        CANDIDATES, objective, artifacts_dir, log_callback
    )
```
Advantages:

- Complete isolation
- Can test different Python versions
- Parallel execution
- Safer for untrusted code

Disadvantages:

- Slower startup (sandbox creation)
- Requires API key
- Network dependency
Daytona mode automatically falls back to local execution if DAYTONA_API_KEY is not configured.
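The fallback amounts to a one-line mode check. A sketch, using an illustrative stand-in for `is_daytona_configured` (the real helper lives in `src.daytona_runner`, as shown in Troubleshooting):

```python
import os

def is_daytona_configured() -> bool:
    # Illustrative stand-in for src.daytona_runner.is_daytona_configured.
    return bool(os.environ.get("DAYTONA_API_KEY"))

# If --daytona was requested but no key is configured, fall back to local.
requested_mode = "daytona"
execution_mode = requested_mode if is_daytona_configured() else "local"
```
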
Scoring System
The forge scores candidates on three criteria:
Success (20%)
Did the candidate produce output without errors?
```python
# From scoring.py:376-377
success = produced_output and not error_occurred
```
Quality (60%)
Does the output meet requirements?
```python
# From scoring.py:381-383
if artifact_content:
    quality = validate_quality(artifact_content)
    quality_score = quality.score
```
Quality validation checks:
- Valid events: 10+ events (40 points)
- Hackathon included: Must have a hackathon event (25 points)
- Event count bonus: 14+ events (20 points)
- No invalid events: All events pass validation (15 points)
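These four checks sum to 100 points. A sketch of how they could combine (the real `validate_quality` lives in scoring.py; this reconstruction is illustrative):

```python
def quality_points(valid_events: int, invalid_events: int, has_hackathon: bool) -> int:
    # Illustrative reconstruction of the point breakdown above.
    score = 0
    if valid_events >= 10:
        score += 40   # valid events threshold
    if has_hackathon:
        score += 25   # hackathon included
    if valid_events >= 14:
        score += 20   # event count bonus
    if invalid_events == 0:
        score += 15   # no invalid events
    return score
```

For example, 14 valid events including a hackathon and no invalid entries would earn the full 100 points.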
Speed (20%)
Faster execution = higher score:
```python
# From scoring.py:332-338
def calculate_speed_score(runtime_seconds: float, max_time: float = 30.0) -> float:
    if runtime_seconds <= 0:
        return 100.0
    if runtime_seconds >= max_time:
        return 0.0
    return round(100 * (1 - runtime_seconds / max_time), 1)
```
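Plugging in the 3.1-second gamma run from the example output reproduces its speed score:

```python
# Speed score for a 3.1s run against the default 30s window.
runtime_seconds, max_time = 3.1, 30.0
speed = round(100 * (1 - runtime_seconds / max_time), 1)
print(speed)  # 89.7
```
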
Final Score
```python
# From scoring.py:341-365
def calculate_total_score(
    success: bool,
    quality_score: float,
    speed_score: float,
    weights: dict = None,
) -> float:
    if weights is None:
        weights = {'success': 0.2, 'quality': 0.6, 'speed': 0.2}
    if not success:
        return 0.0
    success_score = 100.0
    total = (
        success_score * weights['success'] +
        quality_score * weights['quality'] +
        speed_score * weights['speed']
    )
    return round(total, 1)
```
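As a worked example with the default weights, a successful candidate (flat 100 for success) with quality 80.0 and speed 50.0 scores 20 + 48 + 10:

```python
# Worked example with the default weights; success contributes a flat 100.
weights = {'success': 0.2, 'quality': 0.6, 'speed': 0.2}
total = 100.0 * weights['success'] + 80.0 * weights['quality'] + 50.0 * weights['speed']
print(round(total, 1))  # 78.0
```
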
Artifacts
After a forge run, check the artifacts/ directory:
```
artifacts/
├── alpha/
│   └── discord_post.md
├── beta/
│   └── discord_post.md
├── gamma/
│   └── discord_post.md
├── scores.json
└── winner.txt
```
scores.json
Complete scoring breakdown:
```json
{
  "objective": "Generate weekly AI events for Discord",
  "timestamp": "2026-01-24T10:30:00",
  "execution_mode": "local",
  "candidates": [
    {
      "candidate_id": "gamma",
      "success": true,
      "quality_score": 95.2,
      "speed_score": 89.7,
      "total_score": 92.8,
      "quality_details": {
        "passed": true,
        "total_events": 10,
        "valid_events": 10,
        "has_daytona_event": true
      }
    }
  ],
  "winner": "gamma",
  "winner_score": 92.8
}
```
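The artifacts can also be read back programmatically; this sketch assumes the directory layout shown above (the `load_winner` helper is illustrative, not part of the project):

```python
import json
from pathlib import Path

def load_winner(artifacts_dir: str = "artifacts") -> tuple[str, str]:
    """Return (winner_id, winner_artifact_text) from a completed forge run."""
    base = Path(artifacts_dir)
    scores = json.loads((base / "scores.json").read_text())
    winner = scores["winner"]
    artifact = (base / winner / "discord_post.md").read_text()
    return winner, artifact
```
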
winner.txt
A plain-text file containing only the winner’s ID (for example, `gamma` in the run above).
Programmatic Usage
You can also use the forge programmatically:
```python
from forge import run_forge, ForgeResult

def my_log_callback(candidate_id: str, message: str):
    print(f"[{candidate_id}] {message}")

result: ForgeResult = run_forge(
    objective="Create a weekly AI newsletter",
    use_daytona=True,
    log_callback=my_log_callback,
)

print(f"Winner: {result.winner_id}")
print(f"Score: {result.winner_score}")
print(f"Artifact:\n{result.winner_artifact}")
```
Troubleshooting
No output produced
Check that the candidate script:
- Accepts two arguments: objective and output_file
- Writes to the output file path
- Exits with code 0 on success
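A minimal candidate that satisfies all three requirements might look like this (the file name and output content are illustrative):

```python
# candidate_minimal.py -- illustrative candidate honoring the forge contract.
import sys

def main(argv: list[str]) -> int:
    # 1. Accept two arguments: objective and output_file.
    objective, output_file = argv[1], argv[2]
    # 2. Write the result to the output file path.
    with open(output_file, "w") as f:
        f.write(f"# {objective}\n\n- Example event\n")
    # 3. Exit with code 0 on success.
    return 0

if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(main(sys.argv))
```
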
Timeout errors
Candidates have a 60-second timeout. If they exceed this:
```python
# From forge.py:184-187
except subprocess.TimeoutExpired:
    error_occurred = True
    error_message = "Timeout after 60 seconds"
    log("[Error] Timeout!")
```
Optimize slow candidates or increase the timeout in forge.py:158.
Daytona connection failed
Ensure your API key is valid:
```
export DAYTONA_API_KEY="your_key_here"
```
Test with:
```python
from src.daytona_runner import is_daytona_configured

print(is_daytona_configured())  # Should print True
```
Next Steps
- Creating Agents: Learn how to create your own agent implementations
- Sandbox Execution: Deep dive into Daytona sandbox integration