Scribe Backend

Welcome to Scribe Backend

Scribe is a production-ready cold email platform built with FastAPI. It runs a four-step AI pipeline that generates personalized academic outreach emails, combining web scraping, academic paper research, and Claude to produce contextually relevant emails from customizable templates.

Key Features

Multi-Step AI Pipeline

Four-stage agentic pipeline: template parsing, web scraping, academic enrichment, and email composition

Database-Backed Queue

Submit up to 100 email recipients at once with sequential processing and real-time status updates

Smart Web Scraping

Playwright-powered headless browser with JavaScript support for dynamic content extraction

ArXiv Integration

Automatic academic paper discovery and citation for research-focused outreach

Type-Safe Architecture

Pydantic models throughout with structured LLM outputs via pydantic-ai

Comprehensive Observability

Logfire integration with LLM call tracking for cost, tokens, and latency
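The queue behavior described above (a batch of up to 100 recipients, processed one at a time, with a status that changes as each item moves along) can be sketched in memory. This is a minimal illustration, not Scribe's actual implementation: the status names, the `QueueItem` fields, and the `generate` callback are all hypothetical, and the real system persists items in PostgreSQL rather than in a Python list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional

MAX_BATCH_SIZE = 100  # the batch limit stated in the feature list


class QueueStatus(str, Enum):
    # Hypothetical status names, chosen for illustration only.
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class QueueItem:
    recipient: str
    status: QueueStatus = QueueStatus.PENDING
    error: Optional[str] = None


def enqueue_batch(recipients: List[str]) -> List[QueueItem]:
    """Validate the batch size and create one pending item per recipient."""
    if not recipients:
        raise ValueError("batch is empty")
    if len(recipients) > MAX_BATCH_SIZE:
        raise ValueError(f"batch exceeds {MAX_BATCH_SIZE} recipients")
    return [QueueItem(recipient=r) for r in recipients]


def process_sequentially(
    items: List[QueueItem], generate: Callable[[str], None]
) -> None:
    """Process items one at a time, recording a status for each.

    A failure on one item is recorded and does not stop the rest of the batch.
    """
    for item in items:
        item.status = QueueStatus.PROCESSING
        try:
            generate(item.recipient)
            item.status = QueueStatus.COMPLETED
        except Exception as exc:  # record the failure and keep going
            item.status = QueueStatus.FAILED
            item.error = str(exc)
```

Recording the failure on the item instead of raising keeps one bad recipient from blocking the rest of the batch, which is one reasonable design for sequential processing; the source does not say how Scribe handles per-item errors.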

Tech Stack

Scribe Backend is built with modern Python technologies:
  • Backend: FastAPI 0.109+, Python 3.13, Uvicorn
  • Database: PostgreSQL (Supabase), SQLAlchemy 2.0, Alembic
  • Authentication: Supabase Auth with JWT validation
  • AI/ML: Anthropic Claude (Haiku 4.5, Sonnet 4.5), pydantic-ai
  • Task Queue: Celery 5.3+, Redis 5.0+
  • Web Scraping: Playwright 1.56+, BeautifulSoup4, httpx
  • Observability: Logfire 4.14+ with auto-instrumentation

Quick Navigation

Quickstart

Get Scribe running in 5 minutes with step-by-step setup

Architecture

Understand the system design and deployment architecture

API Reference

Explore the complete REST API documentation

Pipeline Deep Dive

Learn about the 4-step email generation pipeline

Architecture Overview

┌─────────────────────┐
│  Template Parser    │  Claude Haiku extracts search terms & classifies type
│  (~1.2s)            │  Output: search_terms, template_type
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   Web Scraper       │  Google Search + Playwright scraping + summarization
│   (~5.3s)           │  Output: scraped_content, urls, metadata
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  ArXiv Helper       │  Fetch academic papers (if RESEARCH type)
│  (~0.8s)            │  Output: arxiv_papers[]
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Email Composer     │  Claude Sonnet generates final email
│  (~3.1s)            │  Output: final_email, writes to database
└─────────────────────┘

Total: ~10.4s
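The flow in the diagram can be sketched as four functions that each read and extend a shared data object. This is only an illustration of the sequencing: the step bodies are placeholder stubs rather than calls to Claude, Playwright, or ArXiv, and the `PipelineData` class, the function names, and the keyword-based classification rule are assumptions. Only the output field names (`search_terms`, `template_type`, `scraped_content`, `arxiv_papers`, `final_email`) come from the diagram.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineData:
    # Inputs (hypothetical field names)
    template: str
    recipient_name: str
    # Outputs named in the diagram, filled in step by step
    search_terms: List[str] = field(default_factory=list)
    template_type: str = ""
    scraped_content: str = ""
    arxiv_papers: List[Dict[str, str]] = field(default_factory=list)
    final_email: str = ""


def parse_template(data: PipelineData) -> None:
    # Stub: the real step asks Claude Haiku to extract terms and classify.
    data.search_terms = [data.recipient_name]
    is_research = "paper" in data.template.lower()
    data.template_type = "RESEARCH" if is_research else "GENERAL"


def scrape_web(data: PipelineData) -> None:
    # Stub: the real step searches, scrapes with Playwright, and summarizes.
    data.scraped_content = f"summary for {', '.join(data.search_terms)}"


def fetch_arxiv(data: PipelineData) -> None:
    # The diagram states this step only runs for the RESEARCH type.
    if data.template_type == "RESEARCH":
        data.arxiv_papers = [{"title": "placeholder paper"}]


def compose_email(data: PipelineData) -> None:
    # Stub: the real step asks Claude Sonnet to write the email.
    data.final_email = f"Dear {data.recipient_name}, ..."


STEPS: List[Callable[[PipelineData], None]] = [
    parse_template,
    scrape_web,
    fetch_arxiv,
    compose_email,
]


def run_pipeline(data: PipelineData) -> PipelineData:
    """Run the four steps in order; each step mutates the shared data."""
    for step in STEPS:
        step(data)
    return data
```

Passing one mutable object through an ordered list of steps keeps each stage independently testable and makes the conditional ArXiv step a no-op instead of a branch in the runner.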

Template Types

Scribe supports three template types for different outreach scenarios:
  • Research: Includes research papers the professor has written
  • Book: Includes books the professor has published
  • General: General professional information about the professor
The pipeline automatically classifies your template and adjusts the enrichment strategy accordingly.
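One way to represent the three template types and the resulting enrichment decision is an enum plus a small helper. This is a sketch under stated assumptions: the diagram only confirms the `RESEARCH` spelling, so the `BOOK` and `GENERAL` values, the `parse_template_type` helper, and its fallback to `GENERAL` for unrecognized input are all hypothetical.

```python
from enum import Enum


class TemplateType(str, Enum):
    RESEARCH = "RESEARCH"  # spelling taken from the architecture diagram
    BOOK = "BOOK"          # assumed spelling
    GENERAL = "GENERAL"    # assumed spelling


def parse_template_type(raw: str) -> TemplateType:
    """Normalize a classifier's text output into a TemplateType.

    Falling back to GENERAL for unknown values is an assumption made
    for this sketch, not documented Scribe behavior.
    """
    try:
        return TemplateType(raw.strip().upper())
    except ValueError:
        return TemplateType.GENERAL


def should_fetch_arxiv(template_type: TemplateType) -> bool:
    """ArXiv enrichment applies only to research templates."""
    return template_type is TemplateType.RESEARCH
```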

Production Deployment

The production backend is self-hosted on a Raspberry Pi with traffic routed through a Cloudflare Tunnel at https://scribeapi.manitmishra.com.

Health Check

GET /health - Check database connectivity and service status

API Documentation

GET /docs - Interactive Swagger UI for API exploration
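A health check against the endpoint above can be sketched with the Python standard library. The base URL and the `/health` path come from this page; the function names, the timeout value, and the handling of the response body are assumptions, since the response schema is not documented here.

```python
import json
import urllib.request
from typing import Any, Tuple
from urllib.parse import urljoin

BASE_URL = "https://scribeapi.manitmishra.com"  # production URL from this page


def health_url(base_url: str = BASE_URL) -> str:
    """Build the health endpoint URL, tolerating a trailing slash."""
    return urljoin(base_url.rstrip("/") + "/", "health")


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> Tuple[int, Any]:
    """Send GET /health and return (status_code, body).

    The body is parsed as JSON when possible and returned as raw text
    otherwise, because the response format is not specified here.
    Non-2xx responses raise urllib.error.HTTPError.
    """
    request = urllib.request.Request(health_url(base_url), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        text = response.read().decode("utf-8")
        try:
            payload: Any = json.loads(text)
        except json.JSONDecodeError:
            payload = text
        return response.status, payload
```

Calling `check_health()` with no arguments targets the production deployment; pass the address of a local server as `base_url` to check a development instance.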

Next Steps

1. Get Started

Follow the Quickstart guide to set up your local development environment

2. Understand Core Concepts

3. Explore the API

Check out the API Reference for detailed endpoint documentation

4. Dive into the Pipeline

Understand the 4-step pipeline that powers email generation
