Overview
Processes historical Gmail messages in chunks to extract past transactions. Designed for initial account setup or backfilling transaction history. Uses AI-powered extraction and self-invokes to process large volumes of emails without hitting timeout limits.
Endpoint
Authentication
Requires valid Supabase authentication. The JWT must be provided in the Authorization header.
Request Modes
This function operates in two modes:
- New Seed Mode - Starts a new seeding process
- Resume Mode - Continues an existing seeding process
Headers
Authorization: Bearer token with the Supabase JWT. Format: Bearer {token}
Content-Type: Must be application/json
Request Body
New Seed Mode
connectionId: ID of the OAuth token connection (from the user_oauth_tokens table)
Resume Mode
seedId: ID of the existing seed to continue processing
Resume flag: Must be true to enable resume mode
Example Requests
Start New Seed
Resume Existing Seed
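The request examples did not survive in this copy. As a stand-in, here is a minimal sketch of how the two request shapes might be built on the client. The POST method, the `resume` field name, and the placeholder values are assumptions for illustration; `connectionId`, `seedId`, and the two headers come from the sections above.

```typescript
// Hypothetical request builder for the two modes.
// ASSUMPTIONS: the HTTP method (POST) and the flag name `resume`
// are not confirmed by this page.
type SeedRequestBody =
  | { connectionId: string }            // New Seed Mode
  | { seedId: string; resume: true };   // Resume Mode (`resume` is an assumed name)

interface SeedRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildSeedRequest(jwt: string, body: SeedRequestBody): SeedRequest {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${jwt}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

// Placeholders only; substitute real values from your own project.
const startNewSeed = buildSeedRequest("<supabase-jwt>", { connectionId: "<connection-id>" });
const resumeSeed = buildSeedRequest("<supabase-jwt>", { seedId: "<seed-id>", resume: true });
```

The resulting objects can be passed as the second argument to `fetch` together with the function URL from the Endpoint section.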
Response
New Seed Success (202 Accepted)
Returned when a new seed is created and processing begins. Response fields:
- Unique ID of the created seed
- Current status: “processing”, “completed”, or “failed”
- Whether all emails have been processed
- Number of emails processed so far
- Number of transactions found so far
- Total number of emails to process
Resume Success (200 OK)
Returned when resuming an existing seed.
Completion Response
Returned when all emails have been processed.
No Messages Found
Error Responses
400 Bad Request
Missing Connection ID
403 Forbidden
Unauthorized Seed Access
404 Not Found
Connection Not Found
405 Method Not Allowed
409 Conflict
Seed Already In Progress
Gmail Reconnection Required
500 Internal Server Error
Seed Creation Failed
Chunk Processing Failed
General Error
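One way a client might act on the status codes listed above is to map each to a coarse action. This is only a sketch: the action names are invented for illustration, and the only response field it relies on is `reconnectRequired`, which is mentioned under Error Recovery.

```typescript
// Illustrative mapping from documented status codes to a client action.
// The action names are hypothetical and not part of the API.
type SeedErrorAction =
  | "fix-request"            // 400, 405: the request itself is wrong
  | "stop"                   // 403, 404: seed or connection is not usable
  | "wait-for-running-seed"  // 409 without the reconnect flag
  | "reconnect-gmail"        // 409 with reconnectRequired: true
  | "retry-later";           // 5xx

interface SeedErrorBody {
  reconnectRequired?: boolean;
}

function classifySeedError(status: number, body: SeedErrorBody = {}): SeedErrorAction {
  if (status === 409) {
    return body.reconnectRequired === true ? "reconnect-gmail" : "wait-for-running-seed";
  }
  if (status >= 500) return "retry-later";
  if (status === 400 || status === 405) return "fix-request";
  return "stop";
}
```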
Processing Flow
New Seed Flow
1. Validation
- Verifies connectionId belongs to the authenticated user
- Checks for existing in-progress seeds
- Ensures OAuth token is active
2. Message ID Collection
Fetches all message IDs from the Gmail API for the past 3 months.
3. Seed Creation
Creates a record in the seeds table.
4. Chunk Processing
Processes the first chunk of 30 messages.
5. Auto-Invocation
If more chunks remain, the function automatically invokes itself in resume mode (fire-and-forget).
Resume Flow
1. Seed Verification
- Verifies seedId belongs to the authenticated user
- Checks seed status (must not be completed or failed)
2. Chunk Processing
Processes the next chunk of 30 messages, starting from last_processed_index.
3. Database Update
Updates the seed record with the new progress.
4. Completion Check
If all messages have been processed:
- Sets status to “completed”
- Creates system notification
- Returns done: true
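The chunk step and completion check above can be sketched as a pure function over the stored message ID list. This assumes `last_processed_index` is the index of the next unprocessed message (an exclusive cursor); the page does not say whether it is inclusive, so treat that as a guess. The function and type names are hypothetical.

```typescript
const CHUNK_SIZE = 30; // chunk size stated in this page

interface ChunkResult {
  chunk: string[];    // message IDs to process in this invocation
  nextIndex: number;  // value to write back as last_processed_index
  done: boolean;      // true when no messages remain after this chunk
}

// ASSUMPTION: lastProcessedIndex points at the first unprocessed message.
function nextChunk(
  messageIds: readonly string[],
  lastProcessedIndex: number,
  chunkSize: number = CHUNK_SIZE,
): ChunkResult {
  const chunk = messageIds.slice(lastProcessedIndex, lastProcessedIndex + chunkSize);
  const nextIndex = lastProcessedIndex + chunk.length;
  return { chunk, nextIndex, done: nextIndex >= messageIds.length };
}
```

With 70 message IDs, three invocations would see cursors 0, 30, and 60, and only the last would report `done: true`.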
Message Processing
Parallel Processing
Processes messages in parallel batches of 10 (CONCURRENCY).
Per-Message Flow
- Fetch Message - Get full message details from Gmail API
- Duplicate Check - Skip if already processed (in transactions or discarded_emails)
- Label Filtering - Skip if not in INBOX or in SPAM/TRASH
- Content Extraction - Extract subject, sender, body, attachments
- AI Analysis - Extract transaction data with extractTransactionFromEmail()
- Storage - Store transaction or mark as discarded
- Langfuse Flush - Flush AI observability events
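The batching described under Parallel Processing could look roughly like the sketch below: each chunk is split into groups of CONCURRENCY messages, every group runs in parallel, and the next group starts only after the previous one settles. The helper name and the `Outcome` type are hypothetical, and the per-message flow itself is passed in as `worker`. A failure in one message is captured rather than aborting the batch, which is a design assumption rather than something this page confirms.

```typescript
const CONCURRENCY = 10; // batch width stated in this page

type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Runs `worker` over `items`, at most `concurrency` at a time, batch by batch.
async function processInBatches<T, R>(
  items: readonly T[],
  worker: (item: T) => Promise<R>,
  concurrency: number = CONCURRENCY,
): Promise<Outcome<R>[]> {
  const results: Outcome<R>[] = [];
  for (let start = 0; start < items.length; start += concurrency) {
    const batch = items.slice(start, start + concurrency);
    const settled = await Promise.all(
      batch.map(async (item): Promise<Outcome<R>> => {
        try {
          return { ok: true, value: await worker(item) };
        } catch (error) {
          // Record the failure so the remaining messages still get processed.
          return { ok: false, error };
        }
      }),
    );
    results.push(...settled);
  }
  return results;
}
```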
Transaction Storage
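If the AI detects a transaction, the extracted data is stored as a transaction.
Discarded Storage
If no transaction is detected, the email is marked as discarded.
Fallback Processing
If the AI call fails, keyword-based detection is used instead (same as gmail-webhook).
Configuration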
Constants
- Number of months of email history to process: 3
- Number of messages processed per chunk: 30
- Number of messages processed in parallel within each chunk (CONCURRENCY): 10
Date Query Format
The Gmail query uses the format: after:YYYY/MM/DD
Example for 3 months ago:
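The original example is missing from this copy. As an illustration, a query string in that format could be computed as follows. The function and constant names are hypothetical, and the sketch uses local-time date getters; whether the real function uses local time or UTC is not stated here.

```typescript
const MONTHS_BACK = 3; // history window stated in this page; the constant name is assumed

function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Builds a Gmail search query of the form after:YYYY/MM/DD.
function gmailAfterQuery(now: Date, monthsBack: number = MONTHS_BACK): string {
  const d = new Date(now.getTime());
  // Note: setMonth rolls over when the target month is shorter
  // (for example May 31 minus 3 months lands in early March).
  d.setMonth(d.getMonth() - monthsBack);
  return `after:${d.getFullYear()}/${pad2(d.getMonth() + 1)}/${pad2(d.getDate())}`;
}

// For a run on 15 June 2024 (local time):
gmailAfterQuery(new Date(2024, 5, 15)); // "after:2024/03/15"
```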
Notifications
System notifications are created at key points:
- Seed Completed (with transactions)
- Seed Completed (no new transactions)
- Seed Failed
Implementation Details
Environment Variables Required
- SUPABASE_URL - Supabase project URL
- SUPABASE_SERVICE_ROLE_KEY - Service role key for database access
Token Refresh
Ensures fresh access tokens before each Gmail API call.
Error Recovery
Handles token reconnection gracefully:
- Catches GmailReconnectRequiredError
- Updates seed status to “failed”
- Returns 409 Conflict with reconnectRequired: true
- Creates system notification for user
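The recovery steps above might be wrapped around the chunk-processing call along these lines. The error class here is a local stand-in for the one named in the list, and `SeedStore`, `HandlerResult`, and `withReconnectHandling` are hypothetical names used only to keep the sketch self-contained; the real database and notification calls are not shown in this page.

```typescript
// Local stand-in for the GmailReconnectRequiredError named above.
class GmailReconnectRequiredError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "GmailReconnectRequiredError";
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, GmailReconnectRequiredError.prototype);
  }
}

// Hypothetical persistence interface; the real calls are not documented here.
interface SeedStore {
  markFailed(seedId: string, message: string): Promise<void>;
  notify(userId: string, message: string): Promise<void>;
}

interface HandlerResult {
  status: number;
  body: Record<string, unknown>;
}

async function withReconnectHandling(
  seedId: string,
  userId: string,
  store: SeedStore,
  work: () => Promise<HandlerResult>,
): Promise<HandlerResult> {
  try {
    return await work();
  } catch (err) {
    if (err instanceof GmailReconnectRequiredError) {
      await store.markFailed(seedId, err.message);
      await store.notify(userId, "Gmail needs to be reconnected");
      return { status: 409, body: { reconnectRequired: true } };
    }
    throw err; // other failures follow the general 500 path
  }
}
```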
Self-Invocation Pattern
Uses a fire-and-forget HTTP request to avoid timeout limits.
Database Transactions
Uses upsert operations to handle duplicate processing gracefully.
User Context
Retrieves user metadata for AI personalization.
Best Practices
Frontend Integration
- Poll for Progress - Periodically query seed status while status === 'processing'
- Show Progress - Display processed / total to the user
- Handle Errors - Check for reconnectRequired and prompt re-authentication
- Prevent Duplicates - Check for existing in-progress seeds before starting new ones
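A polling loop for the first two points might look like the sketch below. How the seed row is actually read (for example, a query against the seeds table) is left to the caller through `fetchProgress`; that callback, the `SeedProgress` shape, and the 3-second default interval are assumptions for illustration.

```typescript
// Assumed shape of the progress data; field names follow the bullets above.
interface SeedProgress {
  status: "processing" | "completed" | "failed";
  processed: number;
  total: number;
}

type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the seed leaves the "processing" state and reports
// progress as a fraction between 0 and 1 on every poll.
async function pollSeed(
  fetchProgress: () => Promise<SeedProgress>,
  onProgress: (fraction: number) => void,
  intervalMs: number = 3000,
  sleep: Sleep = defaultSleep,
): Promise<SeedProgress> {
  for (;;) {
    const progress = await fetchProgress();
    onProgress(progress.total > 0 ? progress.processed / progress.total : 0);
    if (progress.status !== "processing") return progress;
    await sleep(intervalMs);
  }
}
```

Injecting `sleep` keeps the loop testable without real timers; in a UI, `onProgress` would typically update a progress bar.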
Performance Optimization
- Chunking prevents timeout on large mailboxes
- Parallel processing speeds up AI analysis
- Auto-invocation distributes load across multiple function instances
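The auto-invocation in the last bullet is described earlier as a fire-and-forget HTTP request. A rough sketch of that pattern follows. The `FetchLike` type, the `resume` field name, and the choice to forward the caller's Authorization header are assumptions; this page does not show which credentials or body the real function sends to itself.

```typescript
// Minimal fetch-shaped type so the sketch does not depend on a runtime.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<unknown>;

// Kicks off the next chunk without waiting for it to finish.
// ASSUMPTIONS: `resume` as the flag name, and reuse of the incoming auth header.
function invokeNextChunk(
  fetchFn: FetchLike,
  functionUrl: string,
  authHeader: string,
  seedId: string,
): void {
  fetchFn(functionUrl, {
    method: "POST",
    headers: { Authorization: authHeader, "Content-Type": "application/json" },
    body: JSON.stringify({ seedId, resume: true }),
  }).catch((err: unknown) => {
    // Nobody awaits this promise, so failures have to be logged here.
    console.error("self-invocation failed", err);
  });
}
```

Because the promise is not awaited, the current invocation can return its response immediately while the next chunk runs in a separate function instance. Some serverless runtimes stop un-awaited work once the response is sent, so it is worth checking how the target platform treats background requests.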
Error Handling
- Monitor the status field in the seeds table
- Check error_message for failure details
- Implement retry logic for 500 errors
- Handle token expiration with reconnection flow
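For the retry point above, a small wrapper with exponential backoff is one option. The attempt count, base delay, and helper name below are arbitrary illustrative choices, not values taken from this function.

```typescript
type SleepFn = (ms: number) => Promise<void>;

const realSleep: SleepFn = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `attempt` while it returns a 5xx status, doubling the delay each time.
// Non-5xx responses (including 4xx) are returned immediately.
async function retryOn5xx<T extends { status: number }>(
  attempt: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 500,
  sleep: SleepFn = realSleep,
): Promise<T> {
  let last = await attempt();
  for (let n = 1; n < maxAttempts && last.status >= 500; n++) {
    await sleep(baseDelayMs * Math.pow(2, n - 1));
    last = await attempt();
  }
  return last;
}
```

When retrying a request that starts a new seed, a 409 on the second attempt may simply mean the first attempt created the seed before failing, so it is worth checking for an in-progress seed before treating that as an error.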
Related Functions
- gmail-webhook - Real-time processing of new emails
- process-document - Similar AI processing for uploaded files
- auth-callback - Initial OAuth connection setup