
Overview

This guide will take you from zero to capturing and identifying your first person with JARVIS in under 5 minutes. You’ll see the real-time corkboard interface populate with intelligence as agents research in parallel.
This quickstart assumes you’ve completed the Installation guide. If not, start there first.

What You’ll Build

By the end of this guide, you’ll have:
  • ✅ Backend server running with face detection
  • ✅ Frontend corkboard interface
  • ✅ Real-time Convex connection
  • ✅ Your first person capture and dossier

Start the Backend

1

Navigate to backend directory

cd backend
2

Verify environment is configured

Ensure your .env file exists with at minimum:
# Required
CONVEX_URL=https://your-project.convex.cloud
OPENAI_API_KEY=sk-...
# OR
GEMINI_API_KEY=...
Without CONVEX_URL and at least one AI provider key, identification and synthesis won’t work.
3

Start the FastAPI server

uv run uvicorn main:app --reload --port 8000
You should see:
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [xxxxx] using StatReload
INFO:     Started server process [xxxxx]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
4

Verify backend is healthy

In a new terminal window:
curl http://localhost:8000/api/health
Expected response:
{
  "status": "healthy",
  "version": "0.1.0",
  "environment": "development",
  "services": {
    "database": "convex",
    "face_detector": "ready",
    "face_embedder": "ready",
    "face_searcher": "configured",
    "synthesis": "claude-3.5-sonnet"
  }
}
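If you want a script to gate on backend readiness rather than eyeballing the JSON, a small stdlib-only check works. This is a sketch: which services must read "ready" is an assumption based on the sample response above.

```python
import json
import urllib.request

# Assumption: "ready" is the healthy state for these two services,
# matching the sample /api/health response above.
CORE_SERVICES = ("face_detector", "face_embedder")

def is_healthy(health: dict) -> bool:
    """Return True when the backend reports healthy and core face services are ready."""
    if health.get("status") != "healthy":
        return False
    services = health.get("services", {})
    return all(services.get(name) == "ready" for name in CORE_SERVICES)

def check_backend(url: str = "http://localhost:8000/api/health") -> bool:
    """Fetch /api/health and evaluate it; False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return is_healthy(json.load(resp))
    except OSError:
        return False
```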

Start the Frontend

1

Open a new terminal

Keep the backend running and open a new terminal window.
2

Navigate to frontend directory

cd frontend
3

Start the Next.js development server

npm run dev
You should see:
▲ Next.js 16.1.6
- Local:        http://localhost:3000
- Environments: .env

✓ Starting...
✓ Ready in 1.2s
4

Open the interface

Navigate to http://localhost:3000 in your browser. You should see the JARVIS corkboard interface - a cork texture background ready to display person cards.

Your First Capture

Now let’s capture and identify a person. We’ll use a photo to test the full pipeline.
1

Prepare a test image

Find a photo of a person (you can use a photo of yourself, a colleague, or a public figure). Save it as test-person.jpg.
For best results, use a clear frontal photo with good lighting and a visible face.
2

Submit the capture

Use the API to submit your first capture:
curl -X POST http://localhost:8000/api/capture \
  -F "[email protected]" \
  -F "source=manual"
You should get a response like:
{
  "capture_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "message": "Capture received, processing..."
}
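If you prefer Python to curl, the same request can be made with only the standard library. This is a sketch that mirrors the `-F` flags above; the endpoint and field names come from the curl example, and the multipart body is built by hand since `urllib` has no multipart helper.

```python
import json
import mimetypes
import urllib.request
import uuid

def build_multipart(fields: dict, file_field: str, path: str) -> tuple[bytes, str]:
    """Build a multipart/form-data body equivalent to the curl -F flags above."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    ctype = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as f:
        file_data = f.read()
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{path}"\r\n'
        f"Content-Type: {ctype}\r\n\r\n".encode() + file_data + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def submit_capture(path: str, url: str = "http://localhost:8000/api/capture") -> dict:
    """POST a photo to /api/capture and return the parsed JSON response."""
    body, content_type = build_multipart({"source": "manual"}, "file", path)
    req = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```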
3

Watch the corkboard come to life

Switch back to your browser at http://localhost:3000. You should see:
  1. Initial spawn (1-2 sec): A paper card slides onto the corkboard with the photo and “Identifying…” status
  2. Name appears (5-10 sec): The person’s name is identified and appears on the card
  3. Data streams in (10-60 sec): Intel fragments appear as agents research LinkedIn, Twitter, web sources
  4. Dossier complete (30-90 sec): The full dossier is synthesized and the card is marked complete
The first capture may take longer as models initialize. Subsequent captures will be faster.
4

Click to view full dossier

Click on the person card to zoom into the full dossier view. You’ll see:
  • Summary: 2-3 sentence overview
  • Current role: Title and company
  • Work history: Previous positions
  • Education: Schools and degrees
  • Social profiles: Links to LinkedIn, Twitter, etc.
  • Notable activity: Recent posts, articles, projects
  • Conversation hooks: Suggested topics based on recent activity

Understanding the Pipeline

Here’s what happened behind the scenes:

Face Detection

MediaPipe detected the face in your image and extracted facial landmarks.
File: backend/identification/detector.py:34

Face Embedding

ArcFace generated a 512-dimensional embedding vector representing the face.
File: backend/identification/embedder.py:28

Reverse Image Search

The face was searched using PimEyes or Google reverse image search to find matching profiles.
File: backend/identification/search_manager.py:67

Initial Identification

GPT-4o Vision analyzed search results to extract the person’s name and likely social profiles.
File: backend/identification/vision.py:45

Agent Swarm Launch

The orchestrator spawned multiple Browser Use agents in parallel to research:
  • LinkedIn profile
  • Twitter activity
  • Instagram presence
  • Web mentions
  • Company information (via Exa API)
File: backend/agents/orchestrator.py:112

Real-time Streaming

As each agent completed, results were streamed to Convex and immediately appeared in the frontend.
File: backend/pipeline.py:234

Synthesis

Claude aggregated all research into a coherent dossier with deduplication and confidence scoring.
File: backend/synthesis/anthropic_engine.py:89
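The fan-out in the Agent Swarm step is plain asyncio parallelism. A minimal sketch of the pattern follows; the agent body and the fragment shape are illustrative placeholders, not the real backend/agents API.

```python
import asyncio

# Hypothetical stand-in for a Browser Use research agent -- the real agents
# live under backend/agents/; the fragment shape here is illustrative.
async def research(source: str, person: str) -> dict:
    await asyncio.sleep(0)  # the real agent drives a browser session here
    return {"source": source, "person": person, "facts": []}

async def run_swarm(person: str) -> list[dict]:
    """Fan out one research task per source and wait for all of them in parallel."""
    sources = ["linkedin", "twitter", "instagram", "web", "company"]
    return await asyncio.gather(*(research(s, person) for s in sources))

fragments = asyncio.run(run_swarm("Jane Doe"))
```

Each completed task can be streamed to Convex as soon as it resolves, which is why intel fragments appear on the card one by one rather than all at once.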

Check the Logs

JARVIS logs everything for debugging and observability.

Backend Logs

# Real-time log streaming
tail -f /tmp/jarvis_backend.log
You’ll see:
  • Face detection results
  • Identification confidence scores
  • Agent task assignments
  • LLM API calls
  • Errors and warnings

Convex Dashboard

Open the Convex dashboard to see real-time data:
cd frontend
npx convex dashboard
This shows:
  • captures table: All submitted images
  • persons table: Identified people with dossiers
  • intelFragments table: Raw research data from agents
  • connections table: Relationships between people

Capture a Second Person

Let’s see relationship detection in action.
1

Submit another capture

Find a photo of someone related to your first person (coworker, co-founder, mutual connection):
curl -X POST http://localhost:8000/api/capture \
  -F "[email protected]" \
  -F "source=manual"
2

Watch for connection strings

If the two people have a relationship (worked together, mutual follows, co-authored papers), you’ll see:
  • A red string drawn between their cards on the corkboard
  • Connection details in the relationship description

Using the Live Feed

1

Open the Live Feed sidebar

Click the activity icon in the top-right corner of the corkboard.
2

See real-time updates

The live feed shows:
  • New captures submitted
  • Identification events
  • Agent completions
  • Dossier syntheses
  • Connection discoveries

Stream Mode (WebSocket)

For real-time updates during capture, use the streaming endpoint:
# Install a WebSocket client
npm install -g wscat

# Connect to stream endpoint
wscat -c "ws://localhost:8000/api/stream"

# In another terminal, submit a capture
curl -X POST http://localhost:8000/api/capture \
  -F "[email protected]" \
  -F "source=manual"
You’ll receive JSON events over the WebSocket as the pipeline progresses:
{"event": "capture_received", "capture_id": "..."}
{"event": "face_detected", "confidence": 0.98}
{"event": "identification_started"}
{"event": "person_identified", "name": "John Doe", "confidence": 0.85}
{"event": "agent_started", "agent": "linkedin", "status": "researching"}
{"event": "intel_fragment", "source": "linkedin", "data": {...}}
{"event": "synthesis_complete", "person_id": "..."}

Telegram Bot (Optional)

If you have a Telegram bot configured, you can capture directly from Telegram:
1

Send a photo to your bot

  1. Open Telegram
  2. Find your JARVIS bot (the one you configured with TELEGRAM_BOT_TOKEN)
  3. Send a photo of a person
2

Receive the dossier

The bot will reply with:
  • Person’s name and confidence score
  • Link to full dossier on the corkboard
  • Brief summary

Next Steps

Architecture Deep Dive

Understand the pipeline components and data flow

API Reference

Explore all backend endpoints

Configuration

Configure agent behavior, timeouts, and thresholds

Meta Glasses Setup

Connect Meta Ray-Ban glasses for hands-free capture

Common Issues

No face detected

Possible causes:
  • Photo quality is too low
  • Face is not frontal (profile shots don’t work well)
  • Face is obscured by sunglasses, mask, or hair
Solutions:
  • Use a clearer, higher-resolution photo
  • Ensure the face is directly facing the camera
  • Remove obstructions
Configuration: You can adjust detection sensitivity in backend/config.py:78:
FACE_DETECTION_CONFIDENCE = 0.7  # Lower = more lenient
Wrong person identified

Possible causes:
  • Reverse image search returned wrong results
  • Vision LLM misinterpreted search results
  • Insufficient context to disambiguate common names
Solutions:
  • Provide more context in the capture request:
    curl -X POST http://localhost:8000/api/capture \
      -F "[email protected]" \
      -F "context={\"company\": \"Acme Corp\", \"location\": \"San Francisco\"}"
    
  • Check Laminar traces to see search results: backend/observability/laminar.py
Agents failing or timing out

Possible causes:
  • Browser Use cloud API rate limits
  • Target website blocking automation
  • Network issues
Solutions:
  • Check your Browser Use credits at cloud.browser-use.com
  • Increase timeouts in backend/config.py:
AGENT_TIMEOUT_SECONDS = 300  # Default is 180
    
  • Enable verbose logging:
    SPECTER_LOG_LEVEL=DEBUG uv run uvicorn main:app --reload
    
Dossier is incomplete or sparse

Expected behavior: JARVIS synthesizes dossiers from public sources. Results depend on:
  • Public information availability
  • Agent success rate
  • LLM synthesis quality
Improvements:
  • Add more research agents in backend/agents/
  • Configure SuperMemory for cross-session memory:
    SUPERMEMORY_API_KEY=...
    
  • Enable Exa API for structured company data:
    EXA_API_KEY=...
    
Frontend not updating in real time

Check Convex connection:
  1. Open browser DevTools → Network tab
  2. Look for WebSocket connection to convex.cloud
  3. Should show “101 Switching Protocols”
If no WebSocket:
  • Verify NEXT_PUBLIC_CONVEX_URL in .env
  • Check Convex dashboard: npx convex dashboard
  • Restart frontend: npm run dev
Fallback: Frontend works without Convex by polling the backend:
// frontend/src/app/page.tsx:67
// Falls back to demo data if Convex unavailable

Advanced Usage

Batch Processing

Process multiple photos at once:
for file in photos/*.jpg; do
  curl -X POST http://localhost:8000/api/capture \
    -F "file=@$file" \
    -F "source=batch"
  sleep 2  # Rate limit
done
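The same loop in Python keeps the rate limiting separate from the upload itself. A sketch, where `submit` is any callable that POSTs one file (a curl subprocess, an HTTP client, etc. -- it is a placeholder, not a JARVIS API):

```python
import time
from pathlib import Path

def batch_capture(folder: str, submit, delay: float = 2.0) -> list:
    """Submit every *.jpg in folder via submit(path), pausing between requests.

    `submit` is a placeholder callable that POSTs one file to /api/capture;
    keeping it a parameter separates the rate limiting from the transport.
    """
    results = []
    for path in sorted(Path(folder).glob("*.jpg")):
        results.append(submit(path))
        time.sleep(delay)  # simple fixed-interval rate limit, as in the shell loop
    return results
```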

Custom Agent Orchestration

Create a custom research agent:
# backend/agents/custom_agent.py
from agents.base import ResearchAgent
from browser_use import Agent, Browser

class CustomAgent(ResearchAgent):
    async def research(self, person_name: str, context: dict) -> dict:
        # Your custom research logic
        browser = Browser(headless=True)
        agent = Agent(
            task=f"Research {person_name} on CustomSite.com",
            llm=self.llm,
            browser=browser
        )
        result = await agent.run()
        return self.parse_result(result)
Register it in the orchestrator:
# backend/agents/orchestrator.py:45
from agents.custom_agent import CustomAgent

agents.append(CustomAgent(settings))

Export Dossiers

Export person data as JSON:
curl http://localhost:8000/api/persons/{person_id} > dossier.json
Or export all persons:
curl http://localhost:8000/api/persons > all-dossiers.json
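Once exported, the JSON is easy to post-process. A sketch that prints one line per person; the `name` and `current_role` keys are an assumption about the export shape, so adjust them to what your /api/persons response actually contains.

```python
import json

def summarize(dossiers: list[dict]) -> list[str]:
    """One line per person: name plus current role when present.

    Assumption: records carry "name" and "current_role" keys -- adjust to
    the actual shape of your /api/persons export.
    """
    lines = []
    for person in dossiers:
        name = person.get("name", "Unknown")
        role = person.get("current_role")
        lines.append(f"{name} - {role}" if role else name)
    return lines

# Usage (after the curl export above):
#   with open("all-dossiers.json") as f:
#       print("\n".join(summarize(json.load(f))))
```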

Production Deployment

This quickstart is for development only. For production deployment:
  • Use environment-specific configs
  • Enable authentication
  • Set up proper logging and monitoring
  • Use managed Postgres/MongoDB for persistence
  • Configure CORS properly
  • See Deployment Guide for details

Get Help

If you’re stuck:
  • Documentation: Browse the full docs at /docs
  • GitHub Issues: Report bugs at github.com/your-org/jarvis/issues
  • Logs: Check /tmp/jarvis_backend.log for detailed error traces
  • Health Check: curl http://localhost:8000/api/health to verify service status
Performance tip: The first capture initializes models and can take 30-60 seconds. Subsequent captures are much faster (5-15 seconds).
