Overview
Scribe follows a modular, domain-driven architecture with clear separation between API routes, business logic, and data models.
Directory Tree
Core Components
Entry Point
main.py
Purpose: FastAPI application initialization and configuration
Key Responsibilities:
- Create FastAPI app instance
- Configure CORS middleware
- Register API routers
- Initialize Logfire observability
- Mount static files (if any)
- Health check endpoint
celery_config.py
Purpose: Celery task queue configuration
Key Configuration:
- Redis broker URL
- Result backend
- Task serialization (JSON)
- Task routing (email_default queue)
- Timezone settings
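The settings listed above map onto standard Celery configuration keys. The sketch below collects them in a plain dictionary so it runs without Celery installed; the Redis URL, the task path, and the UTC timezone are assumptions, not values taken from Scribe.

```python
import os

# Assumed default; a real deployment would supply REDIS_URL via the environment.
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

CELERY_CONFIG = {
    # Redis serves as both the broker and the result backend in this sketch.
    "broker_url": REDIS_URL,
    "result_backend": REDIS_URL,
    # JSON serialization for task messages and results.
    "task_serializer": "json",
    "result_serializer": "json",
    "accept_content": ["json"],
    # Route the email task to the email_default queue.
    # The dotted task path is hypothetical.
    "task_routes": {
        "tasks.email_tasks.generate_email_task": {"queue": "email_default"},
    },
    "task_default_queue": "email_default",
    # Timezone values are assumptions.
    "timezone": "UTC",
    "enable_utc": True,
}
```

With Celery available, a dictionary like this can be applied to an app instance through `app.conf.update(CELERY_CONFIG)`.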
API Layer
api/routes/
Purpose: FastAPI route handlers (controllers)
Structure:
- user.py: User initialization and profile management
- email.py: Email generation and retrieval
- queue.py: Batch submission, status polling, cancellation
- template.py: AI template generation from resumes
api/dependencies.py
Purpose: Reusable FastAPI dependencies for authentication and database access
Key Functions:
- get_supabase_user(): Validates the JWT and returns the Supabase user
- get_current_user(): Fetches the user from the local database
- get_db(): Database session dependency
Data Layer
models/
Purpose: SQLAlchemy ORM models (database schema)
Key Models:
- User: User accounts (synced with Supabase auth.users)
- Email: Generated emails with JSONB metadata
- Template: User-created email templates
- QueueItem: Batch processing queue with status tracking
schemas/
Purpose: Pydantic schemas for request/response validation
Key Features:
- Automatic validation
- Type coercion
- OpenAPI documentation
- Serialization/deserialization
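The features above can be seen in a small example. This sketch assumes Pydantic v2; the `EmailRequest` schema and its fields are hypothetical and are not taken from Scribe's schemas/ package.

```python
from pydantic import BaseModel, Field, ValidationError


class EmailRequest(BaseModel):
    """Hypothetical request schema; field names are illustrative only."""

    recipient_name: str = Field(min_length=1)
    template_id: int
    priority: int = 0


# Type coercion: the numeric string "42" is converted to the int 42.
request = EmailRequest(recipient_name="Ada", template_id="42")

# Serialization to a plain dict.
payload = request.model_dump()

# JSON Schema, the same information FastAPI uses for OpenAPI documentation.
schema = EmailRequest.model_json_schema()

# Automatic validation: both fields below are invalid, so two errors are reported.
error_count = 0
try:
    EmailRequest(recipient_name="", template_id="not-a-number")
except ValidationError as exc:
    error_count = len(exc.errors())
```

When a schema like this is used as a FastAPI request body, an invalid payload is rejected before the route handler runs.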
database/
Purpose: Database configuration and utilities
Files:
- base.py: SQLAlchemy engine and Base class
- session.py: Session factory and context managers
- dependencies.py: FastAPI database dependencies
- utils.py: Health checks, connection testing
Pipeline Architecture
pipeline/core/
Purpose: Base classes and infrastructure for pipeline steps
Key Components:
- runner.py: BasePipelineStep abstract class, PipelineRunner orchestrator
- exceptions.py: Custom exceptions (ValidationError, StepExecutionError, etc.)
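One way to picture how these components fit together is the sketch below. Only the class and exception names come from the description above; the method names, the dict-based state, and the synchronous execution are assumptions, and the real classes may well be async or carry more behavior.

```python
from abc import ABC, abstractmethod
from typing import Any


class ValidationError(Exception):
    """Raised when a step's input is missing or malformed."""


class StepExecutionError(Exception):
    """Raised when a step fails while running."""


class BasePipelineStep(ABC):
    name: str = "base"

    def validate(self, data: dict[str, Any]) -> None:
        """Optional hook; subclasses raise ValidationError on bad input."""

    @abstractmethod
    def execute(self, data: dict[str, Any]) -> None:
        """Do the step's work by mutating the shared state in place."""

    def run(self, data: dict[str, Any]) -> None:
        self.validate(data)
        try:
            self.execute(data)
        except Exception as exc:
            # Wrap unexpected failures so callers handle one exception type.
            raise StepExecutionError(f"{self.name} failed: {exc}") from exc


class PipelineRunner:
    """Runs steps in order, stopping at the first failure."""

    def __init__(self, steps: list[BasePipelineStep]) -> None:
        self.steps = steps

    def run(self, data: dict[str, Any]) -> dict[str, Any]:
        for step in self.steps:
            step.run(data)
        return data


class UppercaseStep(BasePipelineStep):
    """Toy step used only to exercise the runner."""

    name = "uppercase"

    def validate(self, data: dict[str, Any]) -> None:
        if "text" not in data:
            raise ValidationError("missing 'text'")

    def execute(self, data: dict[str, Any]) -> None:
        data["text"] = data["text"].upper()


result = PipelineRunner([UppercaseStep()]).run({"text": "hello"})
```

Separating `validate` from `execute` lets input problems surface as `ValidationError` before any work starts, while failures during the work itself are reported as `StepExecutionError`.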
pipeline/models/
Purpose: Data models for pipeline execution
Key Classes:
- PipelineData: Dataclass holding all pipeline state (in-memory)
- StepResult: Result object from each step (success, metadata, error)
- TemplateType: Enum (RESEARCH, BOOK, GENERAL)
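A minimal sketch of what these three classes could look like follows. The enum members and the StepResult fields (success, metadata, error) come from the list above; the enum values and every PipelineData field name are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class TemplateType(str, Enum):
    # Member names are from the docs; the string values are assumed.
    RESEARCH = "research"
    BOOK = "book"
    GENERAL = "general"


@dataclass
class StepResult:
    success: bool
    metadata: dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None


@dataclass
class PipelineData:
    # All field names here are hypothetical placeholders for pipeline state.
    email_template: str
    recipient_name: str
    template_type: Optional[TemplateType] = None
    search_terms: list[str] = field(default_factory=list)
    scraped_content: str = ""
    final_email: str = ""


# State lives in memory and is filled in as steps run.
data = PipelineData(email_template="Hi {{name}}", recipient_name="Ada")
data.template_type = TemplateType.RESEARCH
data.search_terms.append("analytical engine")

result = StepResult(success=True, metadata={"terms_found": len(data.search_terms)})
```

Using `field(default_factory=list)` gives each PipelineData instance its own list, which matters when many pipeline runs share one worker process.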
pipeline/steps/
Purpose: 4-step email generation pipeline
Steps:
- template_parser: Analyze template, extract search terms, classify type
- web_scraper: Google Search + Playwright scraping + summarization
- arxiv_helper: Fetch academic papers (if RESEARCH type)
- email_composer: Generate final email and write to database
Each step directory contains:
- main.py: Step implementation (inherits BasePipelineStep)
- tests/: Step-specific tests
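The step order and the condition on arxiv_helper can be summarized in a few lines. This is only a sketch of the ordering described above: the `plan_steps` helper is hypothetical, and the real pipeline may instead always invoke arxiv_helper and let it return early for non-research templates.

```python
from enum import Enum


class TemplateType(Enum):
    RESEARCH = "research"
    BOOK = "book"
    GENERAL = "general"


def plan_steps(template_type: TemplateType) -> list[str]:
    """Return the step names that do real work for a classified template."""
    steps = ["template_parser", "web_scraper"]
    # Academic papers are only fetched for RESEARCH templates.
    if template_type is TemplateType.RESEARCH:
        steps.append("arxiv_helper")
    steps.append("email_composer")
    return steps


research_plan = plan_steps(TemplateType.RESEARCH)
general_plan = plan_steps(TemplateType.GENERAL)
```

Because template_parser is the step that classifies the template, a decision like this can only be made after the first step has run.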
Task Queue
tasks/email_tasks.py
Purpose: Celery task definitions
Key Tasks:
- generate_email_task: Orchestrates the 4-step pipeline
- Updates queue_items status in database
- Handles errors and retries
Configuration
config/settings.py
Purpose: Centralized configuration using Pydantic Settings
Features:
- Type-safe environment variables
- Validation at startup
- Automatic .env file loading
- Computed properties (e.g., database_url)
Testing
Test Organization
Structure:
- Unit tests live alongside source code in tests/ subdirectories
- Integration tests in top-level tests/integration/
- Global fixtures in conftest.py
See Testing Guide for more details.
Code Organization Principles
1. Separation of Concerns
2. Dependency Flow
3. Backend-First Architecture
- Frontend: Supabase SDK for auth ONLY (OAuth, JWT)
- Backend: All database operations via SQLAlchemy
- Authentication: JWT validated on every request, user_id extracted
- Database: No direct access from frontend
4. Type Safety Throughout
- Pydantic models for API validation
- SQLAlchemy models for database operations
- Type hints on all function signatures
- Structured LLM outputs via pydantic-ai
File Naming Conventions
| Type | Pattern | Example |
|---|---|---|
| Models | {entity}.py | models/email.py |
| Schemas | {domain}.py | schemas/pipeline.py |
| Routes | {resource}.py | api/routes/user.py |
| Tests | test_{module}.py | test_template_parser.py |
| Migrations | {revision}_{description}.py | 002_add_queue_items.py |
| Fixtures | conftest.py | pipeline/conftest.py |
Import Patterns
Absolute Imports (Recommended)
Relative Imports (Avoid)
Next Steps
Development Setup
Set up your local development environment
Testing Guide
Learn how to write and run tests
Pipeline Deep Dive
Understand the 4-step email generation pipeline
API Reference
Explore the REST API endpoints
