Overview
The Template Parser is the first step in the email generation pipeline. It uses Claude Haiku 4.5 to analyze the email template and recipient information, extracting the structured data needed for subsequent steps.
Purpose:
- Extract search terms for web scraping
- Classify template type (RESEARCH/BOOK/GENERAL)
- Identify placeholders in the template
Model: Claude Haiku 4.5
Temperature: 0.1 (low for consistent structured output)
Input Schema
The Template Parser requires these fields from PipelineData:
The email template with placeholders (e.g., {{name}}, {{research}})
Constraints:
- Min length: 20 characters
- Max length: 5,000 characters
Name of the email recipient (e.g., “Dr. Jane Smith”)
Constraints:
- Must not be empty
- Trimmed of whitespace
Recipient’s research interest or topic area (e.g., “machine learning in healthcare”)
Constraints:
- Must not be empty
- Trimmed of whitespace
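The constraints above can be sketched as a small validation routine. The field names (`email_template`, `recipient_name`, `research_interest`) are assumptions for illustration, not the actual PipelineData schema.

```python
from dataclasses import dataclass


@dataclass
class ParserInput:
    email_template: str    # template with {{placeholders}}
    recipient_name: str
    research_interest: str


def validate_input(data: ParserInput) -> list[str]:
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    if not (20 <= len(data.email_template) <= 5_000):
        errors.append("template must be 20-5,000 characters")
    if not data.recipient_name.strip():
        errors.append("recipient_name must not be empty")
    if not data.research_interest.strip():
        errors.append("research_interest must not be empty")
    return errors
```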
Output Schema
The Template Parser updates PipelineData with:
Array of search terms optimized for finding information about the recipient
Classified template type based on content analysis
Values:
- RESEARCH - Template mentions research papers or publications
- BOOK - Template mentions books or authored works
- GENERAL - General template without specific content requirements
Detailed analysis metadata including:
- placeholders - List of placeholders found (e.g., ["name", "research"])
- local_placeholders - Regex-extracted placeholders for debugging
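The two pieces of analysis metadata can be illustrated with a small sketch: a regex that pulls `{{placeholder}}` tokens out of the template, and a keyword heuristic that mirrors the three template-type values. Note this is only an illustration — the actual classification is performed by the LLM, and the function names here are hypothetical.

```python
import re


def extract_placeholders(template: str) -> list[str]:
    # Find {{placeholder}} tokens, e.g. ["name", "research"]
    return re.findall(r"\{\{\s*(\w+)\s*\}\}", template)


def classify_template(template: str) -> str:
    # Keyword heuristic mirroring the RESEARCH/BOOK/GENERAL values;
    # the real pipeline delegates this judgment to the LLM.
    text = template.lower()
    if any(word in text for word in ("research", "paper", "publication")):
        return "RESEARCH"
    if any(word in text for word in ("book", "authored")):
        return "BOOK"
    return "GENERAL"
```

Comparing the regex output against the LLM-detected placeholders is what makes `local_placeholders` useful for debugging: a mismatch signals that the model missed or hallucinated a placeholder.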
Implementation Details
Pydantic-AI Agent
The step uses a structured output agent that automatically validates responses:
pipeline/steps/template_parser/main.py:37-45
Execution Flow
- Validate Input - Check required fields and template length constraints
- Extract Local Placeholders - Regex-based extraction for comparison
- Create User Prompt - Combine template, recipient name, and interest
- Call LLM Agent - Structured output with automatic validation
- Update Pipeline Data - Store search terms, template type, and analysis
- Return Success - With metadata about model used and result counts
pipeline/steps/template_parser/main.py:67-135
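The six steps above can be sketched as a single function with the LLM call stubbed out. The names here (`run_template_parser`, the dict keys, the `llm_agent` callable) are hypothetical stand-ins for the actual implementation in main.py.

```python
import re


def run_template_parser(data: dict, llm_agent) -> dict:
    # 1. Validate input: template length constraints
    template = data["email_template"]
    if not (20 <= len(template) <= 5_000):
        raise ValueError("template length out of bounds")

    # 2. Extract local placeholders via regex (for comparison/debugging)
    local_placeholders = re.findall(r"\{\{(\w+)\}\}", template)

    # 3. Create the user prompt from template, name, and interest
    prompt = (
        f"Template:\n{template}\n"
        f"Recipient: {data['recipient_name']}\n"
        f"Interest: {data['research_interest']}"
    )

    # 4. Call the LLM agent (structured, validated output)
    result = llm_agent(prompt)

    # 5. Update pipeline data with search terms, type, and analysis
    data["search_terms"] = result["search_terms"]
    data["template_type"] = result["template_type"]
    data["template_analysis"] = {
        "placeholders": result["placeholders"],
        "local_placeholders": local_placeholders,
    }

    # 6. Return success with result-count metadata
    return {"success": True, "search_term_count": len(result["search_terms"])}
```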
Structured Output Model
The agent returns data validated against this Pydantic model:
pipeline/steps/template_parser/models.py
Error Handling
Fatal Errors (Pipeline Stops)
Retry Strategy
The Pydantic-AI agent automatically retries on:
- API connection errors
- Validation failures (invalid JSON, missing fields)
- Timeout errors
Retry settings:
- Max retries: 3
- Timeout: 60 seconds per attempt
- Exponential backoff between retries
pipeline/steps/template_parser/main.py:43-44
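Pydantic-AI handles retries internally, so the following is only a generic illustration of the policy described above (3 attempts, exponential backoff), not the library's implementation.

```python
import time


def call_with_retries(fn, max_retries: int = 3, base_delay: float = 1.0):
    """Retry fn on connection, validation, and timeout errors."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return fn()
        except (ConnectionError, ValueError, TimeoutError) as exc:
            last_error = exc
            if attempt < max_retries - 1:
                # Exponential backoff: 1s, 2s, 4s, ...
                time.sleep(base_delay * (2 ** attempt))
    raise last_error
```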
Logging & Observability
The step emits structured logs to Logfire:
pipeline/steps/template_parser/main.py:73-102
Tracked Metrics
- Template length (characters)
- Placeholder count (local vs. LLM-detected)
- Search term count
- Template type classification
- Model used
- Execution duration
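A sketch of how the tracked metrics above might be assembled into a single structured payload. The function and key names are hypothetical; the actual Logfire span attributes live in main.py.

```python
import time


def build_metrics(template: str, local_placeholders: list[str],
                  llm_placeholders: list[str], search_terms: list[str],
                  template_type: str, model: str, started_at: float) -> dict:
    # One flat dict of the metrics listed above, suitable for a log span
    return {
        "template_length": len(template),
        "local_placeholder_count": len(local_placeholders),
        "llm_placeholder_count": len(llm_placeholders),
        "search_term_count": len(search_terms),
        "template_type": template_type,
        "model": model,
        "duration_seconds": time.monotonic() - started_at,
    }
```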
Configuration
The step is configurable via environment variables:
config/settings.py
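A minimal sketch of environment-variable configuration. The variable names (`TEMPLATE_PARSER_MODEL`, etc.) are assumptions for illustration; the real settings are defined in config/settings.py.

```python
import os
from dataclasses import dataclass


@dataclass
class TemplateParserSettings:
    model: str
    temperature: float
    max_retries: int
    timeout_seconds: int

    @classmethod
    def from_env(cls) -> "TemplateParserSettings":
        # Defaults mirror the values documented above (hypothetical env names)
        return cls(
            model=os.environ.get("TEMPLATE_PARSER_MODEL", "claude-haiku-4-5"),
            temperature=float(os.environ.get("TEMPLATE_PARSER_TEMPERATURE", "0.1")),
            max_retries=int(os.environ.get("TEMPLATE_PARSER_MAX_RETRIES", "3")),
            timeout_seconds=int(os.environ.get("TEMPLATE_PARSER_TIMEOUT", "60")),
        )
```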
Next Steps
After the Template Parser completes:
- Web Scraper uses search_terms to find information about the recipient
- ArXiv Helper conditionally fetches papers if template_type == RESEARCH
- Email Composer uses template_analysis to fill placeholders
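The downstream routing can be sketched as a small function; the step identifiers here are hypothetical stand-ins for the actual pipeline step names.

```python
def next_steps(data: dict) -> list[str]:
    steps = ["web_scraper"]           # always runs, consumes search_terms
    if data.get("template_type") == "RESEARCH":
        steps.append("arxiv_helper")  # conditional on the classified type
    steps.append("email_composer")    # consumes template_analysis
    return steps
```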
Next: Web Scraper
Learn how search terms are used to fetch and summarize web content
