
Overview

JARVIS supports multiple camera input sources with automatic fallback. The system prioritizes Meta Ray-Ban glasses when available and falls back seamlessly to the iPhone camera for testing and development.

Primary Input

Meta Ray-Ban Glasses: hands-free POV capture via the DAT SDK

Fallback Input

iPhone Camera: manual capture using AVFoundation

Camera Architecture

┌─────────────────────────────────────────┐
│         Camera Input Layer              │
│                                         │
│  ┌──────────────┐   ┌───────────────┐  │
│  │ Meta Glasses │   │ iPhone Camera │  │
│  │  (DAT SDK)   │   │ (AVFoundation)│  │
│  └──────┬───────┘   └───────┬───────┘  │
│         │                   │          │
│         └─────────┬─────────┘          │
│                   │                    │
│         ┌─────────▼──────────┐         │
│         │  Frame Handler     │         │
│         │  - Validation      │         │
│         │  - Resize          │         │
│         │  - Format convert  │         │
│         └─────────┬──────────┘         │
│                   │                    │
└───────────────────┼────────────────────┘

         ┌──────────▼──────────┐
         │  Capture Pipeline   │
         │  - Face detection   │
         │  - Identification   │
         └─────────────────────┘
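The Frame Handler stage in the diagram above (validation, resize, format conversion) can be sketched in Python. This is an illustrative sketch, not the shipped implementation; the magic-byte checks and the 720-pixel cap are assumptions chosen to match the .medium preset used later in this guide.

```python
def validate_frame(data: bytes) -> bool:
    """Cheap validation: check JPEG/PNG magic bytes before decoding."""
    return data[:3] == b"\xff\xd8\xff" or data[:8] == b"\x89PNG\r\n\x1a\n"


def resize_dimensions(width: int, height: int, max_side: int = 720) -> tuple[int, int]:
    """Scale the longer side down to max_side, preserving aspect ratio."""
    if max(width, height) <= max_side:
        return width, height
    scale = max_side / max(width, height)
    return round(width * scale), round(height * scale)
```

A 1920x1080 frame, for example, scales to 720x405 before entering the capture pipeline.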

iPhone Camera Manager

The iPhone camera provides a fallback when glasses are unavailable.

Implementation

Create an IPhoneCameraManager class to handle AVFoundation capture:
import AVFoundation
import UIKit

class IPhoneCameraManager: NSObject {
  private let captureSession = AVCaptureSession()
  private let videoOutput = AVCaptureVideoDataOutput()
  private let sessionQueue = DispatchQueue(label: "iphone-camera-session")
  private let context = CIContext()
  private var isRunning = false

  var onFrameCaptured: ((UIImage) -> Void)?

  func start() {
    sessionQueue.async { [weak self] in
      // Read and mutate isRunning only on sessionQueue to avoid data races
      guard let self, !self.isRunning else { return }
      self.configureSession()
      self.captureSession.startRunning()
      self.isRunning = true
    }
  }

  func stop() {
    sessionQueue.async { [weak self] in
      guard let self, self.isRunning else { return }
      self.captureSession.stopRunning()
      self.isRunning = false
    }
  }
}
View the complete implementation in source/samples/CameraAccess/CameraAccess/iPhone/IPhoneCameraManager.swift

Session Configuration

Step 1: Configure Capture Session

Set up the AVCaptureSession with appropriate presets:
private func configureSession() {
  captureSession.beginConfiguration()
  captureSession.sessionPreset = .medium  // Balance quality & performance

  // Add back camera input
  guard let camera = AVCaptureDevice.default(
    .builtInWideAngleCamera,
    for: .video,
    position: .back
  ) else {
    NSLog("[iPhoneCamera] Failed to access back camera")
    captureSession.commitConfiguration()
    return
  }

  guard let input = try? AVCaptureDeviceInput(device: camera) else {
    NSLog("[iPhoneCamera] Failed to create camera input")
    captureSession.commitConfiguration()
    return
  }

  if captureSession.canAddInput(input) {
    captureSession.addInput(input)
  }

  captureSession.commitConfiguration()
}
Step 2: Add Video Output

Configure the video output with the proper pixel format:
// Add video output
videoOutput.videoSettings = [
  kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
]
videoOutput.setSampleBufferDelegate(self, queue: sessionQueue)
videoOutput.alwaysDiscardsLateVideoFrames = true

if captureSession.canAddOutput(videoOutput) {
  captureSession.addOutput(videoOutput)
}
Set alwaysDiscardsLateVideoFrames = true to prevent frame buffer buildup during processing.
Step 3: Handle Frame Rotation

Force portrait orientation for consistent output:
// Force portrait-oriented frames from the sensor
if let connection = videoOutput.connection(with: .video) {
  if connection.isVideoRotationAngleSupported(90) {
    connection.videoRotationAngle = 90
  }
}
Step 4: Process Video Frames

Implement the AVCaptureVideoDataOutputSampleBufferDelegate:
extension IPhoneCameraManager: AVCaptureVideoDataOutputSampleBufferDelegate {
  func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection
  ) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
      return
    }

    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else {
      return
    }
    let image = UIImage(cgImage: cgImage)

    onFrameCaptured?(image)
  }
}

Camera Permissions

Request camera access before starting capture:
static func requestPermission() async -> Bool {
  let status = AVCaptureDevice.authorizationStatus(for: .video)
  switch status {
  case .authorized:
    return true
  case .notDetermined:
    return await AVCaptureDevice.requestAccess(for: .video)
  default:
    return false
  }
}

Usage Example

let hasPermission = await IPhoneCameraManager.requestPermission()
if hasPermission {
  cameraManager.start()
} else {
  // Show settings alert
  showCameraPermissionAlert()
}

Backend Integration

Images are sent to the JARVIS backend through multiple channels:

Webhook Endpoint

The primary method for real-time uploads:
# webhook.py
import base64

@router.post("/webhook", response_model=CaptureResponse)
async def capture_webhook(body: WebhookRequest) -> CaptureResponse:
    """Accept a base64-encoded image and run it through the pipeline."""
    pipeline = _get_pipeline()
    capture_id = f"cap_{uuid4().hex[:12]}"

    data = base64.b64decode(body.image_base64)
    logger.info("Webhook capture {} from source={}, {} bytes",
                capture_id, body.source, len(data))

    result = await pipeline.process(
        capture_id=capture_id,
        data=data,
        content_type="image/jpeg",
        source=body.source,
    )

    return CaptureResponse(
        capture_id=capture_id,
        status="processed" if result.success else "error",
        total_frames=result.total_frames,
        faces_detected=result.faces_detected,
        persons_created=list(result.persons_created)
    )
View implementation in source/backend/capture/webhook.py:59
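A client can build the webhook body with the standard library alone. The `image_base64` and `source` field names come from the endpoint above; the `"iphone_camera"` default source value is an assumption for illustration.

```python
import base64
import json


def build_webhook_payload(image_bytes: bytes, source: str = "iphone_camera") -> str:
    """Build the JSON body for POST /webhook, matching WebhookRequest's fields."""
    return json.dumps({
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "source": source,
    })
```

POST the returned string with a `Content-Type: application/json` header using any HTTP client.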

URL Import Endpoint

For importing images from external URLs:
@router.post("/url", response_model=CaptureResponse)
async def capture_url(body: UrlRequest) -> CaptureResponse:
    """Download an image from a URL and run it through the pipeline."""
    pipeline = _get_pipeline()
    capture_id = f"cap_{uuid4().hex[:12]}"

    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.get(body.url)
        resp.raise_for_status()
        data = resp.content
        content_type = resp.headers.get("content-type", "image/jpeg")

    result = await pipeline.process(
        capture_id=capture_id,
        data=data,
        content_type=content_type,
        source=body.source,
    )

    return CaptureResponse(
        capture_id=capture_id,
        status="processed" if result.success else "error",
        total_frames=result.total_frames,
        faces_detected=result.faces_detected,
        persons_created=list(result.persons_created)
    )
View implementation in source/backend/capture/webhook.py:95

Capture Service

The capture service processes uploads through the pipeline:
class CaptureService:
    """Process incoming media through the face detection pipeline."""

    def __init__(self, pipeline: CapturePipeline | None = None) -> None:
        self._pipeline = pipeline

    async def enqueue_upload(
        self,
        file: UploadFile,
        source: str = "manual_upload",
        person_name: str | None = None,
    ) -> CaptureQueuedResponse | dict:
        capture_id = f"cap_{uuid4().hex[:12]}"
        data = await file.read()

        result = await self._pipeline.process(
            capture_id=capture_id,
            data=data,
            content_type=file.content_type or "application/octet-stream",
            source=source,
            person_name=person_name,
        )

        return {
            "capture_id": capture_id,
            "status": "processed" if result.success else "error",
            "total_frames": result.total_frames,
            "faces_detected": result.faces_detected,
            "persons_created": result.persons_created,
        }
View implementation in source/backend/capture/service.py:26
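All three entry points generate capture IDs the same way: the literal prefix `cap_` followed by the first 12 hex characters of a `uuid4`. Extracted as a helper for clarity (the function name is hypothetical; the snippets above inline the expression):

```python
from uuid import uuid4


def new_capture_id() -> str:
    """Mirror the ID scheme used by the endpoints: 'cap_' + 12 hex chars."""
    return f"cap_{uuid4().hex[:12]}"
```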

Configuration Reference

Session Presets

Preset    Resolution        Use Case                 Performance
.low      480p              Testing, low bandwidth   High FPS
.medium   720p              Recommended              Balanced
.high     1080p             Maximum quality          Lower FPS
.photo    Full resolution   Still images only        N/A
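The performance column follows directly from frame size: at the 32BGRA pixel format (4 bytes per pixel), each step up in preset multiplies the per-frame buffer the delegate must process. The resolutions below are nominal; actual sensor output varies by device.

```python
# Nominal resolutions per preset; actual sensor output varies by device.
PRESETS = {".low": (640, 480), ".medium": (1280, 720), ".high": (1920, 1080)}


def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Per-frame buffer size for 32BGRA (4 bytes per pixel)."""
    return width * height * bytes_per_pixel

# .medium is roughly 2x smaller than .high: about 3.7 MB vs 8.3 MB per frame
for name, (w, h) in PRESETS.items():
    print(f"{name:>8}: {frame_bytes(w, h) / 1e6:.1f} MB/frame")
```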

Video Settings

let videoSettings: [String: Any] = [
  kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA,
  kCVPixelBufferWidthKey as String: 1280,
  kCVPixelBufferHeightKey as String: 720
]
// Real-time streaming (the configuration IPhoneCameraManager uses above)
captureSession.sessionPreset = .medium
videoOutput.alwaysDiscardsLateVideoFrames = true
connection.videoRotationAngle = 90

// High-quality capture: keep every frame, at the cost of added latency
captureSession.sessionPreset = .high
videoOutput.alwaysDiscardsLateVideoFrames = false  // Process all frames

Testing Camera Setup

Unit Test Example

import XCTest
@testable import CameraAccess

class IPhoneCameraManagerTests: XCTestCase {
  var cameraManager: IPhoneCameraManager!

  override func setUp() {
    super.setUp()
    cameraManager = IPhoneCameraManager()
  }

  func testFrameCapture() async throws {
    let expectation = XCTestExpectation(description: "Capture frame")

    cameraManager.onFrameCaptured = { image in
      XCTAssertNotNil(image)
      XCTAssertGreaterThan(image.size.width, 0)
      expectation.fulfill()
    }

    cameraManager.start()
    await fulfillment(of: [expectation], timeout: 5.0)
    cameraManager.stop()
  }
}

Troubleshooting

Symptoms: Black screen or no frames captured
Solutions:
  • Check camera permissions: Settings → Privacy → Camera
  • Verify camera is not in use by another app
  • Restart the app
  • Check device logs: NSLog statements in console
  • Ensure captureSession.startRunning() is called on background queue
Symptoms: Stuttering video or missed frames
Solutions:
  • Set alwaysDiscardsLateVideoFrames = true
  • Lower session preset to .medium or .low
  • Optimize frame processing callback
  • Check CPU usage in Instruments
  • Reduce frame processing frequency
Symptoms: Sideways or upside-down images
Solutions:
  • Set videoRotationAngle = 90 for portrait
  • Check device orientation handling
  • Verify connection orientation support:
    if connection.isVideoRotationAngleSupported(90) {
      connection.videoRotationAngle = 90
    }
    
Symptoms: Images captured but not reaching backend
Solutions:
  • Verify backend URL is accessible
  • Check network connectivity
  • Test webhook with curl:
    curl -X POST https://your-backend.com/api/capture/webhook \
      -H "Content-Type: application/json" \
      -d '{"image_base64":"...","source":"test"}'
    
  • Check backend logs for errors
  • Verify image is properly base64-encoded

Next Steps

Meta Glasses

Set up Meta Ray-Ban smart glasses

Telegram Bot

Configure Telegram bot for remote capture
