
Observability Stack

JARVIS uses multiple layers of observability:
  • Laminar - LLM call tracing and accuracy verification
  • Loguru - Structured application logging
  • FastAPI - Built-in request/response logging
  • Metrics - Performance and throughput monitoring

Logging

Loguru Configuration

JARVIS uses Loguru for structured logging:
from loguru import logger

# Log at different levels
logger.debug("Processing frame {} for capture {}", frame_idx, capture_id)
logger.info("Face detected with confidence {:.2f}", confidence)
logger.warning("API key not configured for service: {}", service_name)
logger.error("Failed to process capture {}: {}", capture_id, error)

Log Levels

Configure log level via environment variable:
SPECTER_LOG_LEVEL=DEBUG  # DEBUG, INFO, WARNING, ERROR, CRITICAL

Structured Logging

Loguru supports structured logging with context:
logger.bind(capture_id=capture_id, person_id=person_id).info(
    "Pipeline completed: {} faces detected", face_count
)

Log Output

Development mode shows colorized console output:
2026-03-05 14:23:45.123 | INFO     | pipeline:process:142 - Processing capture cap_abc123
2026-03-05 14:23:45.456 | INFO     | identification:detect:67 - Face detected: confidence=0.95
2026-03-05 14:23:46.789 | INFO     | agents:orchestrator:203 - Starting agent swarm for person_xyz789

Laminar Tracing

Laminar provides observability for LLM calls and agent behavior.

Setup

  1. Get a Laminar API key: sign up at lmnr.ai and create a project.
  2. Configure the environment variable:
     LMNR_PROJECT_API_KEY=your-api-key
  3. Initialize Laminar. It is initialized automatically at app startup:
backend/observability/laminar.py
from lmnr import Laminar
from loguru import logger

from config import get_settings

def initialize_laminar(settings):
    if not settings.laminar_api_key:
        logger.warning("Laminar tracing disabled - no API key")
        return False
    
    try:
        Laminar.initialize(project_api_key=settings.laminar_api_key)
        logger.info("Laminar tracing initialized")
        return True
    except Exception as exc:
        logger.error("Failed to initialize Laminar: {}", exc)
        return False

Tracing Functions

Use the @observe decorator to trace functions:
from lmnr import observe

@observe()
async def identify_person(image_bytes: bytes) -> dict:
    """Traced: PimEyes search + Vision LLM extraction."""
    # PimEyes search
    results = await pimeyes_search(image_bytes)
    
    # Vision LLM extraction
    identity = await vision_extract(results)
    
    return identity

@observe()
async def synthesize_report(person_id: str) -> dict:
    """Traced: Report synthesis from all intel fragments."""
    fragments = await get_fragments(person_id)
    report = await llm_synthesize(fragments)
    return report

Custom Trace Decorator

JARVIS provides a custom @traced decorator, defined in backend/observability/laminar.py, that combines logging with Laminar tracing:
from observability.laminar import traced

@traced(
    name="linkedin_research",
    metadata={"agent": "linkedin"},
    tags=["agent", "research"],
)
async def research_linkedin(person_name: str) -> dict:
    """Research person on LinkedIn."""
    # Automatically logs duration and traces to Laminar
    result = await agent.run()
    return result
The @traced decorator:
  • Logs start and end with duration
  • Sends spans to Laminar (if configured)
  • Handles both sync and async functions
  • Includes metadata and tags for filtering

Viewing Traces

Access the Laminar dashboard at app.lmnr.ai to:
  • View LLM call traces with full prompts and responses
  • Analyze token usage and costs
  • Debug failed requests
  • Track accuracy and hallucinations
  • Monitor agent execution flows
(Screenshot: Laminar tracing dashboard)

Example Trace

A typical pipeline trace shows:
Capture Pipeline (3.2s)
├── Frame Extraction (0.1s)
├── Face Detection (0.3s)
├── PimEyes Search (1.2s)
│   ├── Browser Navigation (0.8s)
│   └── Screenshot Capture (0.4s)
├── Vision LLM Extraction (0.9s)
│   ├── API Call: gpt-4-vision (0.8s, 1,234 tokens)
│   └── JSON Parsing (0.1s)
└── Agent Swarm (2.1s)
    ├── LinkedIn Agent (1.8s)
    ├── Twitter Agent (1.2s)
    └── Exa Enrichment (0.3s)

Performance Metrics

Built-in Metrics

The /health endpoint includes performance metrics:
curl http://localhost:8000/health
{
  "status": "ok",
  "service": "jarvis",
  "uptime_seconds": 3600,
  "services": {
    "convex": true,
    "openai": true,
    "browser_use": true
  }
}

Custom Metrics

Track custom metrics with the traced decorator:
from observability.laminar import traced
import time

@traced(
    name="face_detection",
    metadata={"model": "mediapipe"},
)
async def detect_faces(image: bytes) -> list:
    start = time.time()
    faces = await detector.detect(image)
    duration = time.time() - start
    
    logger.info(
        "Face detection completed: {} faces in {:.2f}s",
        len(faces), duration
    )
    return faces

Error Tracking

Exception Logging

Loguru automatically captures exception context:
try:
    result = await risky_operation()
except Exception as exc:
    logger.exception("Operation failed for {}", capture_id)
    # Logs full traceback automatically

Error Context

Add context to errors:
logger.bind(
    capture_id=capture_id,
    person_id=person_id,
    agent="linkedin",
).error("Agent failed: {}", error)

Sentry Integration (Optional)

For production deployments, integrate Sentry:
import sentry_sdk

sentry_sdk.init(
    dsn="your-sentry-dsn",
    environment=settings.environment,
    traces_sample_rate=1.0,
)

Debugging Tools

FastAPI Debug Mode

Enable debug mode for development:
app = FastAPI(
    title="JARVIS API",
    debug=True,  # Enable in development
)

Interactive API Docs

FastAPI provides interactive API documentation out of the box. Use it to:
  • Test API endpoints interactively
  • View request/response schemas
  • Debug authentication issues

Request Logging

Log all incoming requests:
from fastapi import Request
import time

@app.middleware("http")
async def log_requests(request: Request, call_next):
    start = time.time()
    
    logger.info(
        "Request: {} {}",
        request.method,
        request.url.path,
    )
    
    response = await call_next(request)
    duration = time.time() - start
    
    logger.info(
        "Response: {} {} - {} in {:.2f}s",
        request.method,
        request.url.path,
        response.status_code,
        duration,
    )
    
    return response

Production Monitoring

Health Checks

Implement comprehensive health checks:
from fastapi import HTTPException

@app.get("/health")
async def health_check():
    """Comprehensive health check with service status."""
    return {
        "status": "ok",
        "service": "jarvis",
        "timestamp": time.time(),
        "services": settings.service_flags(),
    }

@app.get("/health/ready")
async def readiness_check():
    """Check if service is ready to accept requests."""
    # Check critical dependencies
    if not settings.openai_api_key:
        raise HTTPException(503, "OpenAI not configured")
    
    return {"status": "ready"}

Alerting

Set up alerts for:
  • API errors (>5% error rate)
  • High latency (>5s p95)
  • Service unavailability
  • Rate limit exhaustion
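As a stdlib-only illustration, the first two thresholds could be evaluated over a window of recent requests. The (status_code, duration_s) record shape and the nearest-rank p95 are assumptions for the sketch:

```python
import math


def check_alerts(requests, error_rate_limit=0.05, p95_limit=5.0):
    """Return alert names triggered by a window of (status_code, duration_s) records."""
    alerts = []
    if not requests:
        return alerts

    # Error rate: fraction of 5xx responses in the window.
    errors = sum(1 for status, _ in requests if status >= 500)
    if errors / len(requests) > error_rate_limit:
        alerts.append("error_rate")

    # Nearest-rank p95: smallest duration >= 95% of the sample.
    durations = sorted(d for _, d in requests)
    p95 = durations[math.ceil(0.95 * len(durations)) - 1]
    if p95 > p95_limit:
        alerts.append("latency_p95")
    return alerts
```

For example, a window with 10% server errors and a slow tail would trigger both alerts, while a healthy window triggers none.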

Monitoring Dashboard

Create a monitoring dashboard with:
  1. Request metrics: throughput, latency, errors
  2. LLM metrics: token usage, cost, latency
  3. Agent metrics: success rate, duration
  4. System metrics: CPU, memory, disk

Best Practices

Always use structured fields instead of string interpolation:
# Good
logger.info("Processing {}", capture_id)

# Better
logger.bind(capture_id=capture_id).info("Processing capture")

# Bad
logger.info(f"Processing {capture_id}")
Use @observe or @traced on all functions that:
  • Make LLM API calls
  • Call external services
  • Are part of the critical path
  • Have significant latency
Add relevant IDs to every log message:
logger.bind(
    capture_id=capture_id,
    person_id=person_id,
    frame_idx=frame_idx,
).info("Face detected")
For very frequent operations, sample logs:
import random

if random.random() < 0.1:  # Log 10% of requests
    logger.debug("High-frequency operation")

Troubleshooting

Laminar Not Receiving Traces

  1. Verify API key is set: echo $LMNR_PROJECT_API_KEY
  2. Check initialization logs: grep "Laminar" logs.txt
  3. Test with a simple trace:
    from lmnr import observe
    
    @observe()
    def test():
        return "hello"
    
    test()
    
  4. View traces at app.lmnr.ai

Missing Log Context

If log messages are missing context:
# Wrap operations in context manager
with logger.contextualize(capture_id=capture_id):
    await process_capture()
    # All logs in this block include capture_id

Performance Impact

If observability is impacting performance:
  1. Reduce log level in production: SPECTER_LOG_LEVEL=WARNING
  2. Sample traces: only trace 10% of requests
  3. Disable debug mode: debug=False in FastAPI
  4. Use non-blocking logging: pass enqueue=True when adding Loguru sinks
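Trace sampling (item 2) can be sketched as a wrapper that applies a tracing decorator to only a fraction of calls. Here `noop` stands in for lmnr's observe() or the project's traced decorator; the helper itself is illustrative:

```python
import random


def sampled(decorator, rate=0.1):
    """Apply `decorator` to roughly `rate` of calls; pass through otherwise."""
    def wrap(func):
        traced_func = decorator(func)

        def wrapper(*args, **kwargs):
            # Per-call coin flip: traced path or the plain function.
            chosen = traced_func if random.random() < rate else func
            return chosen(*args, **kwargs)

        return wrapper
    return wrap


# Hypothetical usage with a no-op decorator standing in for observe():
def noop(func):
    return func


@sampled(noop, rate=0.1)
def handler(x):
    return x * 2
```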

Next Steps

  • Laminar Tracing - Deep dive into Laminar integration
  • Performance - Optimize for production performance
  • Deployment - Deploy with production monitoring
  • Troubleshooting - Debug common issues
