Overview
This document records the major architectural decisions made during JARVIS development, along with the reasoning behind each choice.

ADR-001: Python FastAPI Backend
Status: Accepted
Context: Need to choose a backend framework for the API and orchestration layer.
Decision: Use Python with FastAPI instead of Node.js/Express or other alternatives.
Rationale:
- Browser Use SDK is Python - Core dependency requires Python
- ML/AI ecosystem - Better integration with face detection (mediapipe), computer vision (OpenCV), and AI libraries
- Async support - FastAPI has excellent async/await support for concurrent operations
- Type safety - Pydantic provides runtime type validation
- Developer experience - Auto-generated OpenAPI docs, interactive testing
Alternatives Considered:
- Node.js/Express - Would require wrapping Python Browser Use via subprocess
- Go - Limited ML/AI library support
Consequences:
- ✅ Native integration with Browser Use agents
- ✅ Rich ML/AI ecosystem
- ✅ Excellent async performance
- ❌ Two-language stack (Python + TypeScript)
ADR-002: Convex for Real-time Data
Status: Accepted
Context: Need real-time updates to stream intelligence data to the frontend as agents complete research.
Decision: Use Convex for real-time database instead of WebSockets + MongoDB Change Streams.
Rationale:
- Zero boilerplate - Real-time subscriptions out of the box
- Automatic reconnection - Handles network issues gracefully
- Delta updates - Only changed data is sent to clients
- TypeScript integration - Generated types for database schema
- Serverless - No infrastructure to manage
- Free tier - Sufficient for hackathon and initial development
Alternatives Considered:
- WebSockets + MongoDB - More complex, requires connection management
- Firebase Realtime Database - Vendor lock-in, less flexible schema
- Supabase - Good option, but less optimized for real-time
Consequences:
- ✅ Instant real-time updates with zero setup
- ✅ Great developer experience
- ❌ Vendor-specific (requires Convex account)
- ❌ Less control over data layer
ADR-003: Gemini 2.0 Flash for Synthesis
Status: Accepted
Context: Need to choose an LLM for report synthesis from multiple data sources.
Decision: Use Gemini 2.0 Flash instead of GPT-4 for report synthesis.
Rationale:
- Cost - 25x cheaper than GPT-4 ($0.03 per 1K tokens)
- Speed - 2x faster than GPT-4 (~0.8s vs ~2s)
- Quality - Sufficient for structured synthesis tasks
- Long context - 1M token context window for large documents
Alternatives Considered:
- GPT-4 - Better quality, but much more expensive
- Claude 3.5 Sonnet - Good middle ground, but slower than Gemini
- GPT-3.5 - Cheaper but lower quality
Consequences:
- ✅ Significantly lower costs
- ✅ Faster response times
- ✅ Can process more data per request
- ❌ Slightly lower quality than GPT-4 (acceptable trade-off)
ADR-004: Two-Tier Research Architecture
Status: Accepted
Context: Need to balance speed (fast initial results) with depth (comprehensive research).
Decision: Implement two-tier research: fast API enrichment (Tier 1) followed by deep browser research (Tier 2).
Rationale:
- Perceived performance - Users see results within 1 second
- Context for agents - Tier 1 data improves Tier 2 accuracy
- Fallback - If Tier 2 fails, Tier 1 results are still useful
- Streaming UX - Data appears incrementally, not all at once
Alternatives Considered:
- Single tier (browser only) - Too slow, no early feedback
- API-only - Not comprehensive enough
Consequences:
- ✅ Fast time-to-first-result
- ✅ Better user experience
- ✅ More resilient (partial results always available)
- ❌ More complex orchestration logic
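The two-tier flow can be sketched as plain asyncio, with Tier 1 returned (and streamed to the UI) immediately and Tier 2 seeded with its output; the function names and simulated delays here are illustrative, not the real orchestration code:

```python
import asyncio

async def tier1_enrich(name: str) -> dict:
    # Tier 1: fast API lookups (hypothetical); returns within ~1 second
    await asyncio.sleep(0.01)
    return {"name": name, "source": "tier1"}

async def tier2_deep_research(name: str, context: dict) -> dict:
    # Tier 2: slow browser agents, seeded with Tier 1 context for accuracy
    await asyncio.sleep(0.05)
    return {**context, "source": "tier2"}

async def research(name: str) -> dict:
    fast = await tier1_enrich(name)  # stream this to the frontend immediately
    try:
        return await tier2_deep_research(name, fast)
    except Exception:
        return fast  # fallback: Tier 1 results are still useful on their own
```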
ADR-005: MediaPipe for Face Detection
Status: Accepted
Context: Need to detect faces in images from camera capture.
Decision: Use MediaPipe Face Detection instead of dlib, MTCNN, or commercial APIs.
Rationale:
- Speed - 5-10ms per frame (10x faster than alternatives)
- Accuracy - 95%+ detection rate
- Easy installation - Single pip install, no compilation
- Free - No API costs
- Cross-platform - Works on Linux, macOS, Windows
Alternatives Considered:
- dlib - 50-100ms, complex C++ dependencies
- MTCNN - 100ms, lower accuracy
- Azure Face API - Costs, requires network call
Consequences:
- ✅ Fast detection enables real-time processing
- ✅ Simple deployment
- ✅ No API costs
- ❌ Slightly lower accuracy than commercial APIs (acceptable)
ADR-006: Parallel Agent Execution
Status: Accepted
Context: Multiple agents need to research a person across different platforms.
Decision: Run all agents in parallel using `asyncio.gather()` instead of sequential execution.
Rationale:
- Performance - 3-5x speedup (20s vs 60-100s for 3 agents)
- Resilience - `return_exceptions=True` prevents one failure from blocking others
- Better UX - Results stream in as they complete
Alternatives Considered:
- Sequential - Too slow, blocks on failures
- Queue-based - More complex, no performance benefit
Consequences:
- ✅ Dramatically faster research
- ✅ Continues on individual agent failures
- ❌ Higher concurrent load on external APIs
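A minimal sketch of the `asyncio.gather()` pattern described above, with `return_exceptions=True` so a single failed agent surfaces as an exception object in the results rather than cancelling its siblings (the agent function is a stand-in, not the real agent code):

```python
import asyncio

async def run_agent(platform: str) -> str:
    # stand-in for a real browser agent run
    if platform == "broken":
        raise RuntimeError(f"{platform} agent failed")
    await asyncio.sleep(0.01)
    return f"{platform}: done"

async def run_all(platforms: list[str]) -> list:
    # return_exceptions=True: one failure doesn't block the others;
    # failed agents appear as exception instances in the result list
    return await asyncio.gather(
        *(run_agent(p) for p in platforms), return_exceptions=True
    )
```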
ADR-007: Streaming Results via Convex
Status: Accepted
Context: Agents complete at different times. Users should see results as they arrive.
Decision: Stream individual agent results to Convex immediately; don't wait for all agents.
Rationale:
- Better UX - Data appears incrementally
- Demo impact - Visually impressive to see data streaming in
- Resilience - Partial results preserved even if pipeline fails
Alternatives Considered:
- Batch updates - Slower, less impressive demo
- Polling - Inefficient, higher latency
Consequences:
- ✅ Excellent user experience
- ✅ More resilient to failures
- ✅ Great for live demos
- ❌ More Convex mutations (within free tier)
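The streaming pattern can be sketched with `asyncio.as_completed`, which yields each agent's result the moment it finishes; `push` here stands in for the Convex mutation that writes one result (names and delays are illustrative):

```python
import asyncio

async def agent(name: str, delay: float) -> str:
    # stand-in for an agent that finishes after `delay` seconds
    await asyncio.sleep(delay)
    return name

async def stream_results(push) -> None:
    tasks = [
        asyncio.create_task(agent("fast", 0.01)),
        asyncio.create_task(agent("slow", 0.05)),
    ]
    # as_completed yields results in completion order, not launch order
    for fut in asyncio.as_completed(tasks):
        push(await fut)  # write to Convex immediately; don't wait for the rest
```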
ADR-008: Laminar for Observability
Status: Accepted
Context: Need to trace LLM calls and debug agent behavior.
Decision: Use Laminar for LLM observability instead of generic APM tools.
Rationale:
- LLM-specific - Captures prompts, responses, tokens, costs
- Agent tracing - Tracks multi-step agent workflows
- Accuracy verification - Detect hallucinations
- Simple integration - Single decorator (`@observe`)
- Hackathon credits - $150 in free credits
Alternatives Considered:
- DataDog/New Relic - Generic, doesn't capture LLM specifics
- LangSmith - Good, but Laminar has better Python SDK
- Custom logging - Too much work, no visualization
Consequences:
- ✅ Deep visibility into LLM behavior
- ✅ Easy debugging of agent failures
- ✅ Cost tracking
- ❌ Another service dependency
ADR-009: MongoDB for Persistent Storage
Status: Accepted
Context: Need to store raw images, capture metadata, and archival person records.
Decision: Use MongoDB Atlas (free tier) for persistent storage alongside Convex.
Rationale:
- Document-oriented - Natural fit for person records (varying schemas)
- GridFS - Built-in support for storing large files (images)
- Free tier - 512MB storage, sufficient for development
- Convex complement - Convex for real-time, MongoDB for archival
Alternatives Considered:
- PostgreSQL - Relational model less suitable for varying schemas
- S3 + PostgreSQL - More complex, requires managing two services
- Only Convex - No good solution for large binary storage
Consequences:
- ✅ Flexible schema for person records
- ✅ Built-in binary storage
- ✅ Free tier sufficient
- ❌ Two database systems to manage
ADR-010: Next.js for Frontend
Status: Accepted
Context: Need to build an interactive corkboard UI with real-time updates.
Decision: Use Next.js 14 with App Router instead of Create React App or other frameworks.
Rationale:
- React Server Components - Better performance
- Built-in routing - No need for React Router
- Vercel deployment - One-click deploy, free hosting
- TypeScript - First-class TypeScript support
- Convex integration - Excellent React hooks
Alternatives Considered:
- Create React App - Deprecated, no SSR
- Remix - Good, but less mature ecosystem
- Svelte/SvelteKit - Smaller ecosystem
Consequences:
- ✅ Fast development with great DX
- ✅ Excellent performance
- ✅ Easy deployment
- ❌ Learning curve for App Router
ADR-011: Timeouts for All Agents
Status: Accepted
Context: Browser agents can hang or take arbitrarily long.
Decision: Set aggressive timeouts (3 minutes max) on all agent operations.
Rationale:
- Bounded latency - Guarantee a response within 3 minutes
- Better UX - Users don’t wait forever
- Resource efficiency - Don’t waste compute on stuck operations
- Partial results - Streaming means we have data even on timeout
Alternatives Considered:
- No timeout - Risk of hung operations
- Per-agent timeout - More complex, less predictable
Consequences:
- ✅ Predictable performance
- ✅ Better resource utilization
- ❌ May miss some data if agent is slow
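The timeout wrapper can be sketched with `asyncio.wait_for`; the 180-second ceiling comes from this ADR, while the helper and its fallback-to-None behavior are illustrative (in practice the caller would fall back to whatever partial results already streamed in):

```python
import asyncio

AGENT_TIMEOUT_S = 180  # 3-minute ceiling from this ADR

async def hung_agent() -> str:
    # simulates an agent that never finishes
    await asyncio.sleep(10_000)
    return "never reached"

async def run_with_timeout(coro, timeout: float = AGENT_TIMEOUT_S):
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # the stuck task is cancelled; partial streamed results remain usable
        return None
```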
ADR-012: Loguru for Logging
Status: Accepted
Context: Need structured logging for debugging and monitoring.
Decision: Use Loguru instead of standard library logging.
Rationale:
- Better DX - Simpler API than stdlib logging
- Structured logging - Native support for `.bind()`
- Colored output - Easier to read during development
- Async-safe - Works well with FastAPI
- Exception capture - `logger.exception()` includes the full traceback
Alternatives Considered:
- Standard library logging - More boilerplate, less intuitive
- structlog - Good, but more complex setup
Consequences:
- ✅ Excellent developer experience
- ✅ Rich context in logs
- ✅ Easy debugging
- ❌ Non-standard (but popular) library
ADR-013: Ruff for Code Quality
Status: Accepted
Context: Need linting and formatting for Python code.
Decision: Use Ruff for both linting and formatting instead of Black + flake8 + isort.
Rationale:
- Speed - 10-100x faster than alternatives (written in Rust)
- All-in-one - Replaces multiple tools
- Compatible - Black-compatible formatting
- Modern - Supports latest Python features
Alternatives Considered:
- Black + flake8 + isort - Slower, multiple tools
- Pylint - Slower, more opinionated
Consequences:
- ✅ Fast linting/formatting
- ✅ Single tool to learn
- ✅ Great performance in CI
- ❌ Relatively new (but rapidly maturing)
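A minimal `pyproject.toml` sketch of the all-in-one setup; the specific rule selection and line length here are illustrative, not the project's actual configuration:

```toml
[tool.ruff]
line-length = 100
target-version = "py311"

[tool.ruff.lint]
# E/F roughly cover flake8's defaults, I covers isort;
# Black-compatible formatting comes from `ruff format`, no extra config needed
select = ["E", "F", "I"]
```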
ADR-014: Browser Use Cloud Sessions
Status: Accepted
Context: Social platforms require authentication. Managing auth in headless browsers is complex.
Decision: Use Browser Use Cloud with a persistent profile for authenticated sessions.
Rationale:
- Persistent auth - Login once, reuse across agents
- No credential management - Browser handles cookies/tokens
- Anti-detection - Real browser fingerprint
- Hackathon credits - $100 free credits
Alternatives Considered:
- Local Selenium - Complex auth management
- API keys - Not available for all platforms
- Manual login per run - Too slow
Consequences:
- ✅ Reliable authenticated sessions
- ✅ No credential storage in code
- ✅ Better anti-bot evasion
- ❌ Requires Browser Use Cloud account
- ❌ Sessions shared across all agents
Summary
These architectural decisions shaped JARVIS into a system that is:
- Fast - Two-tier architecture, parallel agents, streaming results
- Reliable - Timeouts, graceful degradation, comprehensive logging
- Observable - Laminar tracing, structured logging
- Developer-friendly - Type safety, great tooling, clear patterns
Future ADRs
Potential future decisions to document:
- Caching strategy (Redis vs in-memory)
- Multi-region deployment
- Rate limiting implementation
- Background job processing (Celery vs native async)
Contributing
When making significant architectural changes:
- Create a new ADR in this document
- Follow the format:
- Status (Proposed/Accepted/Deprecated)
- Context
- Decision
- Rationale
- Alternatives Considered
- Consequences
- Include in PR description
- Discuss in PR review