
Overview

Scribe automatically classifies email templates into three types to determine the appropriate enrichment strategy during pipeline execution.
Classification happens in Step 1 (TemplateParser) using Claude Haiku 4.5 with structured output.

TemplateType Enum

pipeline/models/core.py
class TemplateType(str, Enum):
    """Template type: RESEARCH (papers), BOOK (books), GENERAL (other)."""
    RESEARCH = "research"
    BOOK = "book"
    GENERAL = "general"
Inheriting from both str and Enum enables type-safe comparisons and automatic JSON serialization:
if pipeline_data.template_type == TemplateType.RESEARCH:
    # Fetch ArXiv papers
    ...
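For example, `str` inheritance means enum members compare equal to plain strings and pass through `json.dumps` without a custom encoder (a small stdlib illustration):

```python
import json
from enum import Enum

class TemplateType(str, Enum):
    RESEARCH = "research"
    BOOK = "book"
    GENERAL = "general"

# Members compare equal to their string values
assert TemplateType.RESEARCH == "research"

# json.dumps handles str-based enums directly, no .value or custom encoder
payload = json.dumps({"template_type": TemplateType.BOOK})
# → '{"template_type": "book"}'
```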

Template Type Descriptions

RESEARCH

Templates that reference academic papers, publications, or research work.

Examples:
  • “I read your paper on …”
  • “Your work on … inspired me”
  • “I’m interested in your research on …”

Enrichment:
  • ArXiv papers fetched in Step 3 (ArxivEnricher)
  • Top 5 most relevant papers selected
  • Papers used for email personalization

BOOK

Templates that mention books, textbooks, or published works (non-academic).

Examples:
  • “I loved your book …”
  • “Your textbook on … helped me”
  • “I’m a fan of your writing, especially …”

Enrichment:
  • ArXiv enrichment skipped
  • Web scraping focuses on published books
  • Goodreads/Amazon links prioritized

GENERAL

Templates that don’t mention papers or books; used for generic outreach or networking.

Examples:
  • “I’d love to chat about …”
  • “Your work at … caught my attention”
  • “I’m reaching out about …”

Enrichment:
  • ArXiv enrichment skipped
  • Web scraping focuses on professional profiles
  • LinkedIn, university pages, personal websites prioritized
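The per-type behavior above can be summarized as a dispatch table. The following sketch is illustrative only; the `ENRICHMENT_STRATEGY` mapping and `needs_arxiv` helper are assumptions, not part of Scribe's actual codebase:

```python
from enum import Enum

class TemplateType(str, Enum):
    RESEARCH = "research"
    BOOK = "book"
    GENERAL = "general"

# Hypothetical mapping of template type to enrichment behavior
ENRICHMENT_STRATEGY = {
    TemplateType.RESEARCH: {"arxiv": True,  "scrape_focus": "publications"},
    TemplateType.BOOK:     {"arxiv": False, "scrape_focus": "published books"},
    TemplateType.GENERAL:  {"arxiv": False, "scrape_focus": "professional profiles"},
}

def needs_arxiv(template_type: TemplateType) -> bool:
    """Return True only for types that trigger Step 3 (ArxivEnricher)."""
    return ENRICHMENT_STRATEGY[template_type]["arxiv"]
```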

Classification Logic

The TemplateParser step uses an LLM to analyze the template and classify it:
pipeline/steps/template_parser/main.py
class TemplateParserStep(BasePipelineStep):
    """Analyze template and extract search parameters."""

    async def _execute_step(self, data: PipelineData) -> StepResult:
        agent = create_agent(
            model="anthropic:claude-haiku-4-5",
            output_type=TemplateAnalysis,  # Pydantic model with template_type field
            temperature=0.1,  # Low temperature for consistent classification
            max_tokens=2000
        )

        prompt = f"""
        Analyze this email template and recipient information:

        Template: {data.email_template}
        Recipient: {data.recipient_name}
        Interest: {data.recipient_interest}

        Classify the template type:
        - RESEARCH: Mentions papers, publications, research, studies, or academic work
        - BOOK: Mentions books, textbooks, or published non-academic works
        - GENERAL: Doesn't mention papers or books (networking, opportunities, etc.)

        Extract:
        1. Template type (RESEARCH, BOOK, or GENERAL)
        2. Search terms for finding information about the recipient
        3. Placeholders used ({{{{name}}}}, {{{{research}}}}, etc.)
        """

        result = await agent.run(prompt)

        # Update PipelineData with classification
        data.template_type = result.data.template_type
        data.search_terms = result.data.search_terms
        data.template_analysis = result.data.model_dump()

        return StepResult(success=True, step_name=self.step_name)
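The full `TemplateAnalysis` model is not shown here. A plausible Pydantic sketch, inferred from the fields referenced in this page (`confidence` and `reasoning` appear in the classification examples; exact field names and defaults are assumptions):

```python
from enum import Enum
from pydantic import BaseModel, Field

class TemplateType(str, Enum):
    RESEARCH = "research"
    BOOK = "book"
    GENERAL = "general"

class TemplateAnalysis(BaseModel):
    """Structured classification output (illustrative sketch, not the real model)."""
    template_type: TemplateType
    confidence: float = Field(ge=0.0, le=1.0)
    reasoning: str
    search_terms: list[str]
    placeholders: list[str] = []
```

Passing this as `output_type` constrains the LLM to return valid JSON matching the schema, so downstream steps can rely on `template_type` always being one of the three enum values.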

Classification Examples

RESEARCH Template

Input:
Template: "Hey {{name}}, I read your paper on {{research}} and found it fascinating!"
Recipient: Dr. Jane Smith
Interest: machine learning
Classification:
{
    "template_type": "RESEARCH",
    "confidence": 0.95,
    "reasoning": "Template explicitly mentions 'paper' and '{{research}}' placeholder",
    "search_terms": [
        "Dr. Jane Smith machine learning papers",
        "Jane Smith publications",
        "Jane Smith research"
    ]
}
Pipeline Behavior:
  • ✅ Step 3 (ArxivEnricher) executes
  • Fetches top 5 ArXiv papers by Dr. Jane Smith on machine learning
  • Email mentions specific paper titles

BOOK Template

Input:
Template: "I loved your book {{book_title}}! It changed my perspective on {{topic}}."
Recipient: Malcolm Gladwell
Interest: psychology
Classification:
{
    "template_type": "BOOK",
    "confidence": 0.92,
    "reasoning": "Template mentions 'book' and {{book_title}} placeholder",
    "search_terms": [
        "Malcolm Gladwell books",
        "Malcolm Gladwell psychology",
        "Malcolm Gladwell publications"
    ]
}
Pipeline Behavior:
  • ❌ Step 3 (ArxivEnricher) skipped
  • Web scraper prioritizes book titles, reviews, Goodreads
  • Email mentions specific book titles from web scraping

GENERAL Template

Input:
Template: "Hi {{name}}, I'd love to connect about opportunities in {{field}}."
Recipient: Elon Musk
Interest: space technology
Classification:
{
    "template_type": "GENERAL",
    "confidence": 0.88,
    "reasoning": "Template doesn't mention papers or books, focuses on networking",
    "search_terms": [
        "Elon Musk space technology",
        "Elon Musk SpaceX",
        "Elon Musk career"
    ]
}
Pipeline Behavior:
  • ❌ Step 3 (ArxivEnricher) skipped
  • Web scraper prioritizes LinkedIn, company pages, recent news
  • Email mentions recent achievements or roles

Impact on Pipeline Execution

Step 3: ArxivEnricher Conditional Logic

pipeline/steps/arxiv_helper/main.py
class ArxivEnricherStep(BasePipelineStep):
    """Conditionally fetch ArXiv papers based on template type."""

    async def _execute_step(self, data: PipelineData) -> StepResult:
        # Only execute for RESEARCH templates
        if data.template_type != TemplateType.RESEARCH:
            logfire.info(
                f"Skipping ArXiv enrichment for template type: {data.template_type}",
                task_id=data.task_id
            )
            return StepResult(
                success=True,
                step_name=self.step_name,
                metadata={"papers_fetched": 0, "reason": "non-research template"}
            )

        # Fetch papers for RESEARCH templates
        papers = await self._fetch_arxiv_papers(
            author=data.recipient_name,
            interest=data.recipient_interest
        )

        data.arxiv_papers = papers[:5]  # Top 5 papers
        data.enrichment_metadata = {
            "papers_fetched": len(papers),
            "papers_used": len(data.arxiv_papers)
        }

        return StepResult(success=True, step_name=self.step_name)
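The source does not show how "most relevant" is determined in `_fetch_arxiv_papers`. One simple possibility is keyword-overlap scoring against the recipient's interest; the `rank_papers` helper below is a hypothetical sketch, not Scribe's actual ranking:

```python
def rank_papers(papers: list[dict], interest: str, top_n: int = 5) -> list[dict]:
    """Rank papers by how many interest keywords appear in title + abstract."""
    keywords = set(interest.lower().split())

    def score(paper: dict) -> int:
        text = f"{paper['title']} {paper['abstract']}".lower()
        return sum(1 for kw in keywords if kw in text)

    return sorted(papers, key=score, reverse=True)[:top_n]

papers = [
    {"title": "Deep Learning in Healthcare", "abstract": "neural networks for diagnosis"},
    {"title": "Bird Migration Patterns", "abstract": "seasonal movement of birds"},
]
top = rank_papers(papers, "deep learning")
# top[0] is the deep-learning paper
```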

Step 4: Email Composer Context Building

pipeline/steps/email_composer/main.py
class EmailComposerStep(BasePipelineStep):
    """Generate final email with type-aware context."""

    async def _execute_step(self, data: PipelineData) -> StepResult:
        # Build context based on template type
        context = {
            "template": data.email_template,
            "recipient_name": data.recipient_name,
            "recipient_interest": data.recipient_interest,
            "scraped_content": data.scraped_content,
            "template_type": data.template_type,
        }

        # Add papers only for RESEARCH type
        if data.template_type == TemplateType.RESEARCH:
            context["arxiv_papers"] = [
                {
                    "title": p["title"],
                    "abstract": p["abstract"][:500],  # Truncate
                    "year": p["year"],
                }
                for p in data.arxiv_papers[:3]  # Top 3 papers
            ]
            context["instruction"] = "Reference specific paper titles in the email."
        else:
            context["instruction"] = "Use information from web scraping for personalization."

        # Generate email with Sonnet
        email = await self._generate_email(context)

        # Validate with type-aware checks
        validation = self._validate_email(
            email=email,
            template_type=data.template_type,
            has_papers=len(data.arxiv_papers) > 0
        )

        # Write to database
        email_id = await self._write_to_database(data, email)

        return StepResult(success=True, step_name=self.step_name)
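`_validate_email` is not shown in the source. A hypothetical type-aware check might require RESEARCH emails to actually cite one of the fetched paper titles (the `validate_email` function below is an illustrative sketch):

```python
from enum import Enum

class TemplateType(str, Enum):
    RESEARCH = "research"
    BOOK = "book"
    GENERAL = "general"

def validate_email(email: str, template_type: TemplateType,
                   paper_titles: list[str]) -> bool:
    """For RESEARCH templates with fetched papers, require at least one
    paper title to appear in the generated email."""
    if template_type == TemplateType.RESEARCH and paper_titles:
        return any(title.lower() in email.lower() for title in paper_titles)
    # BOOK/GENERAL rely on web-scraped context, so no paper check applies
    return True
```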

JobStatus Enum

While template types classify the content, job status tracks the execution state of the pipeline:
pipeline/models/core.py
class JobStatus(Enum):
    """Pipeline job status for Celery task state management."""
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
Don’t confuse JobStatus with QueueStatus: JobStatus is for Celery task state (used in pipeline execution), while QueueStatus is for database queue items (used in API responses).
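One practical consequence: JobStatus inherits only from Enum (not str), so unlike TemplateType it does not JSON-serialize automatically and callers need `.value`:

```python
import json
from enum import Enum

class TemplateType(str, Enum):
    RESEARCH = "research"

class JobStatus(Enum):
    PENDING = "pending"

# str-based enum serializes directly
assert json.dumps({"type": TemplateType.RESEARCH}) == '{"type": "research"}'

# plain Enum raises TypeError unless .value is used
try:
    json.dumps({"status": JobStatus.PENDING})
    raise AssertionError("expected TypeError")
except TypeError:
    pass

assert json.dumps({"status": JobStatus.PENDING.value}) == '{"status": "pending"}'
```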

Performance Impact

Execution Time by Type

| Template Type | Avg Time | ArXiv Call? | Cost per Email |
|---------------|----------|-------------|----------------|
| RESEARCH      | 10.4s    | ✅ Yes      | $0.027         |
| BOOK          | 9.6s     | ❌ No       | $0.024         |
| GENERAL       | 9.6s     | ❌ No       | $0.024         |
Key Insight: RESEARCH templates add ~0.8s for ArXiv API call but provide significantly better personalization for academic outreach.

Best Practices

Be Explicit

Use clear keywords like “paper”, “research”, “book” in templates for accurate classification.

Test Classification

Review generated emails to verify the classifier chose the correct type for your template.

Match Recipient

Use RESEARCH for academics, BOOK for authors, GENERAL for industry professionals.

Monitor Accuracy

Track misclassifications in Logfire and adjust prompts if needed.

Debugging Template Classification

View classification decisions in the pipeline metadata:
# After email generation
email = db.query(Email).filter(Email.id == email_id).first()

print(email.metadata)
# Output:
# {
#   "template_type": "RESEARCH",
#   "classification_confidence": 0.95,
#   "papers_used": ["Neural Networks for CV", "Deep Learning in Healthcare"],
#   "arxiv_papers_fetched": 5,
#   "step_timings": {"template_parser": 1.2, "arxiv_helper": 0.8, ...}
# }

Pipeline Architecture

See how template types affect pipeline execution flow

Queue System

Learn how JobStatus differs from template classification
