
Overview

JARVIS supports multiple camera input sources with automatic fallback. The system prioritizes Meta Ray-Ban glasses when available and falls back seamlessly to the iPhone camera for testing and development.

Primary Input

Meta Ray-Ban Glasses: hands-free POV capture via the DAT SDK

Fallback Input

iPhone Camera: manual capture using AVFoundation

Camera Architecture

┌─────────────────────────────────────────┐
│         Camera Input Layer              │
│                                         │
│  ┌──────────────┐   ┌───────────────┐  │
│  │ Meta Glasses │   │ iPhone Camera │  │
│  │  (DAT SDK)   │   │ (AVFoundation)│  │
│  └──────┬───────┘   └───────┬───────┘  │
│         │                   │          │
│         └─────────┬─────────┘          │
│                   │                    │
│         ┌─────────▼──────────┐         │
│         │  Frame Handler     │         │
│         │  - Validation      │         │
│         │  - Resize          │         │
│         │  - Format convert  │         │
│         └─────────┬──────────┘         │
│                   │                    │
└───────────────────┼────────────────────┘

         ┌──────────▼──────────┐
         │  Capture Pipeline   │
         │  - Face detection   │
         │  - Identification   │
         └─────────────────────┘
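The Frame Handler stage in the diagram above (validation, resize, format conversion) can be sketched in Python. This is an illustrative sketch, not the shipped implementation; the magic-byte checks and the 720-pixel cap are assumptions chosen to match the .medium preset used later in this guide.

```python
def validate_frame(data: bytes) -> bool:
    """Cheap validation: check JPEG/PNG magic bytes before decoding."""
    return data[:3] == b"\xff\xd8\xff" or data[:8] == b"\x89PNG\r\n\x1a\n"


def resize_dimensions(width: int, height: int, max_side: int = 720) -> tuple[int, int]:
    """Scale the longer side down to max_side, preserving aspect ratio."""
    if max(width, height) <= max_side:
        return width, height
    scale = max_side / max(width, height)
    return round(width * scale), round(height * scale)
```

A 1920x1080 frame, for example, scales to 720x405 before entering the capture pipeline.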

iPhone Camera Manager

The iPhone camera provides a fallback when glasses are unavailable.

Implementation

Create an IPhoneCameraManager class to handle AVFoundation capture:
import AVFoundation
import UIKit

class IPhoneCameraManager: NSObject {
  private let captureSession = AVCaptureSession()
  private let videoOutput = AVCaptureVideoDataOutput()
  private let sessionQueue = DispatchQueue(label: "iphone-camera-session")
  private let context = CIContext()
  private var isRunning = false

  var onFrameCaptured: ((UIImage) -> Void)?

  func start() {
    sessionQueue.async { [weak self] in
      // Read and mutate isRunning only on sessionQueue to avoid data races
      guard let self, !self.isRunning else { return }
      self.configureSession()
      self.captureSession.startRunning()
      self.isRunning = true
    }
  }

  func stop() {
    sessionQueue.async { [weak self] in
      guard let self, self.isRunning else { return }
      self.captureSession.stopRunning()
      self.isRunning = false
    }
  }
}
View the complete implementation in source/samples/CameraAccess/CameraAccess/iPhone/IPhoneCameraManager.swift

Session Configuration

Step 1: Configure Capture Session

Set up the AVCaptureSession with appropriate presets:
private func configureSession() {
  captureSession.beginConfiguration()
  captureSession.sessionPreset = .medium  // Balance quality & performance

  // Add back camera input
  guard let camera = AVCaptureDevice.default(
    .builtInWideAngleCamera,
    for: .video,
    position: .back
  ) else {
    NSLog("[iPhoneCamera] Failed to access back camera")
    captureSession.commitConfiguration()
    return
  }

  guard let input = try? AVCaptureDeviceInput(device: camera) else {
    NSLog("[iPhoneCamera] Failed to create camera input")
    captureSession.commitConfiguration()
    return
  }

  if captureSession.canAddInput(input) {
    captureSession.addInput(input)
  }

  captureSession.commitConfiguration()
}
Step 2: Add Video Output

Configure the video output with the proper pixel format:
// Add video output
videoOutput.videoSettings = [
  kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
]
videoOutput.setSampleBufferDelegate(self, queue: sessionQueue)
videoOutput.alwaysDiscardsLateVideoFrames = true

if captureSession.canAddOutput(videoOutput) {
  captureSession.addOutput(videoOutput)
}
Set alwaysDiscardsLateVideoFrames = true to prevent frame buffer buildup during processing.
Step 3: Handle Frame Rotation

Force portrait orientation for consistent output:
// Force portrait-oriented frames from the sensor
if let connection = videoOutput.connection(with: .video) {
  if connection.isVideoRotationAngleSupported(90) {
    connection.videoRotationAngle = 90
  }
}
Step 4: Process Video Frames

Implement the AVCaptureVideoDataOutputSampleBufferDelegate:
extension IPhoneCameraManager: AVCaptureVideoDataOutputSampleBufferDelegate {
  func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection
  ) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
      return
    }

    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else {
      return
    }
    let image = UIImage(cgImage: cgImage)

    onFrameCaptured?(image)
  }
}

Camera Permissions

Request camera access before starting capture:
static func requestPermission() async -> Bool {
  let status = AVCaptureDevice.authorizationStatus(for: .video)
  switch status {
  case .authorized:
    return true
  case .notDetermined:
    return await AVCaptureDevice.requestAccess(for: .video)
  default:
    return false
  }
}

Usage Example

let hasPermission = await IPhoneCameraManager.requestPermission()
if hasPermission {
  cameraManager.start()
} else {
  // Show settings alert
  showCameraPermissionAlert()
}

Backend Integration

Images are sent to the JARVIS backend through multiple channels:

Webhook Endpoint

The primary method for real-time uploads:
# webhook.py
import base64

@router.post("/webhook", response_model=CaptureResponse)
async def capture_webhook(body: WebhookRequest) -> CaptureResponse:
    """Accept a base64-encoded image and run it through the pipeline."""
    pipeline = _get_pipeline()
    capture_id = f"cap_{uuid4().hex[:12]}"

    data = base64.b64decode(body.image_base64)
    logger.info("Webhook capture {} from source={}, {} bytes",
                capture_id, body.source, len(data))

    result = await pipeline.process(
        capture_id=capture_id,
        data=data,
        content_type="image/jpeg",
        source=body.source,
    )

    return CaptureResponse(
        capture_id=capture_id,
        status="processed" if result.success else "error",
        total_frames=result.total_frames,
        faces_detected=result.faces_detected,
        persons_created=list(result.persons_created)
    )
View implementation in source/backend/capture/webhook.py:59
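A client can build the webhook body with the standard library alone. The `image_base64` and `source` field names come from the endpoint above; the `"iphone_camera"` default source value is an assumption for illustration.

```python
import base64
import json


def build_webhook_payload(image_bytes: bytes, source: str = "iphone_camera") -> str:
    """Build the JSON body for POST /webhook, matching WebhookRequest's fields."""
    return json.dumps({
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "source": source,
    })
```

POST the returned string with a `Content-Type: application/json` header using any HTTP client.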

URL Import Endpoint

For importing images from external URLs:
@router.post("/url", response_model=CaptureResponse)
async def capture_url(body: UrlRequest) -> CaptureResponse:
    """Download an image from a URL and run it through the pipeline."""
    pipeline = _get_pipeline()
    capture_id = f"cap_{uuid4().hex[:12]}"

    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.get(body.url)
        resp.raise_for_status()
        data = resp.content
        content_type = resp.headers.get("content-type", "image/jpeg")

    result = await pipeline.process(
        capture_id=capture_id,
        data=data,
        content_type=content_type,
        source=body.source,
    )

    return CaptureResponse(
        capture_id=capture_id,
        status="processed" if result.success else "error",
        total_frames=result.total_frames,
        faces_detected=result.faces_detected,
        persons_created=list(result.persons_created)
    )
View implementation in source/backend/capture/webhook.py:95

Capture Service

The capture service processes uploads through the pipeline:
class CaptureService:
    """Process incoming media through the face detection pipeline."""

    def __init__(self, pipeline: CapturePipeline | None = None) -> None:
        self._pipeline = pipeline

    async def enqueue_upload(
        self,
        file: UploadFile,
        source: str = "manual_upload",
        person_name: str | None = None,
    ) -> CaptureQueuedResponse | dict:
        capture_id = f"cap_{uuid4().hex[:12]}"
        data = await file.read()

        result = await self._pipeline.process(
            capture_id=capture_id,
            data=data,
            content_type=file.content_type or "application/octet-stream",
            source=source,
            person_name=person_name,
        )

        return {
            "capture_id": capture_id,
            "status": "processed" if result.success else "error",
            "total_frames": result.total_frames,
            "faces_detected": result.faces_detected,
            "persons_created": result.persons_created,
        }
View implementation in source/backend/capture/service.py:26
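All three entry points generate capture IDs the same way: the literal prefix `cap_` followed by the first 12 hex characters of a `uuid4`. Extracted as a helper for clarity (the function name is hypothetical; the snippets above inline the expression):

```python
from uuid import uuid4


def new_capture_id() -> str:
    """Mirror the ID scheme used by the endpoints: 'cap_' + 12 hex chars."""
    return f"cap_{uuid4().hex[:12]}"
```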

Configuration Reference

Session Presets

Preset    Resolution        Use Case                 Performance
.low      480p              Testing, low bandwidth   High FPS
.medium   720p              Recommended              Balanced
.high     1080p             Maximum quality          Lower FPS
.photo    Full resolution   Still images only        N/A
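The performance column follows directly from frame size: at the 32BGRA pixel format (4 bytes per pixel), each step up in preset multiplies the per-frame buffer the delegate must process. The resolutions below are nominal; actual sensor output varies by device.

```python
# Nominal resolutions per preset; actual sensor output varies by device.
PRESETS = {".low": (640, 480), ".medium": (1280, 720), ".high": (1920, 1080)}


def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Per-frame buffer size for 32BGRA (4 bytes per pixel)."""
    return width * height * bytes_per_pixel

# .medium is roughly 2x smaller than .high: about 3.7 MB vs 8.3 MB per frame
for name, (w, h) in PRESETS.items():
    print(f"{name:>8}: {frame_bytes(w, h) / 1e6:.1f} MB/frame")
```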

Video Settings

let videoSettings: [String: Any] = [
  kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA,
  kCVPixelBufferWidthKey as String: 1280,
  kCVPixelBufferHeightKey as String: 720
]
// Real-time streaming (the configuration IPhoneCameraManager uses above)
captureSession.sessionPreset = .medium
videoOutput.alwaysDiscardsLateVideoFrames = true
connection.videoRotationAngle = 90

// High-quality capture: keep every frame, at the cost of added latency
captureSession.sessionPreset = .high
videoOutput.alwaysDiscardsLateVideoFrames = false  // Process all frames

Testing Camera Setup

Unit Test Example

import XCTest
@testable import CameraAccess

class IPhoneCameraManagerTests: XCTestCase {
  var cameraManager: IPhoneCameraManager!

  override func setUp() {
    super.setUp()
    cameraManager = IPhoneCameraManager()
  }

  func testFrameCapture() async throws {
    let expectation = XCTestExpectation(description: "Capture frame")

    cameraManager.onFrameCaptured = { image in
      XCTAssertNotNil(image)
      XCTAssertGreaterThan(image.size.width, 0)
      expectation.fulfill()
    }

    cameraManager.start()
    await fulfillment(of: [expectation], timeout: 5.0)
    cameraManager.stop()
  }
}

Troubleshooting

Symptoms: Black screen or no frames captured
Solutions:
  • Check camera permissions: Settings → Privacy → Camera
  • Verify camera is not in use by another app
  • Restart the app
  • Check device logs: NSLog statements in console
  • Ensure captureSession.startRunning() is called on background queue
Symptoms: Stuttering video or missed frames
Solutions:
  • Set alwaysDiscardsLateVideoFrames = true
  • Lower session preset to .medium or .low
  • Optimize frame processing callback
  • Check CPU usage in Instruments
  • Reduce frame processing frequency
Symptoms: Sideways or upside-down images
Solutions:
  • Set videoRotationAngle = 90 for portrait
  • Check device orientation handling
  • Verify connection orientation support:
    if connection.isVideoRotationAngleSupported(90) {
      connection.videoRotationAngle = 90
    }
    
Symptoms: Images captured but not reaching backend
Solutions:
  • Verify backend URL is accessible
  • Check network connectivity
  • Test webhook with curl:
    curl -X POST https://your-backend.com/api/capture/webhook \
      -H "Content-Type: application/json" \
      -d '{"image_base64":"...","source":"test"}'
    
  • Check backend logs for errors
  • Verify image is properly base64-encoded

Next Steps

Meta Glasses

Set up Meta Ray-Ban smart glasses

Telegram Bot

Configure Telegram bot for remote capture
