System Architecture
Off Grid is a React Native mobile app with native modules for on-device AI inference. The architecture follows a layered design separating UI, services, native bridges, and hardware acceleration.
Layer Breakdown
UI Layer (React Native)
Framework: React Native 0.83 with TypeScript 5.x
Design System: Brutalist terminal-inspired interface with a monochromatic palette and emerald accents. Full light/dark theme support via the useTheme() hook.
Navigation: React Navigation 7.x with bottom tabs and modal stacks.
Animations: react-native-reanimated for spring-based physics and staggered entrance effects.
Key Components:
- ChatInput — Message composition with attachment badges (src/components/ChatInput.tsx)
- ModelCard — Model display with actions (src/components/ModelCard.tsx)
- AnimatedPressable — Spring scale feedback + haptics (src/components/AnimatedPressable.tsx)
- AppSheet — Custom swipe-dismissible bottom sheets (src/components/AppSheet.tsx)
Services Layer (TypeScript)
All core services are singleton instances to prevent duplicate model loading, concurrent inference conflicts, and memory leaks.
Core Services
llmService (src/services/llm.ts:26)
Wraps llama.rn for GGUF model lifecycle and text/vision inference:
- Model loading with automatic context scaling based on device RAM
- Streaming token generation with 50ms batched UI updates
- Vision model support via mmproj (multimodal projector) files
- Tool calling detection from jinja chat templates
- KV cache management with quantization (f16/q8_0/q4_0)
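The RAM-based context scaling above can be sketched as a small helper. This is an illustrative sketch only: the function name and the RAM thresholds are assumptions, not the app's actual values.

```typescript
// Hypothetical sketch of automatic context scaling: cap the requested
// context length based on device RAM, since larger contexts mean larger
// KV caches. Thresholds are illustrative, not the app's real budget.
function scaleContext(deviceRamMb: number, requestedCtx: number): number {
  const cap =
    deviceRamMb >= 12288 ? 8192 :
    deviceRamMb >= 8192  ? 4096 :
    deviceRamMb >= 6144  ? 2048 :
    1024;
  return Math.min(requestedCtx, cap);
}
```

A 16 GB device would keep the requested 8192-token context, while a 4 GB device would be clamped to the smallest tier.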
Wraps whisper.rn for speech-to-text transcription with multiple model sizes (Tiny, Base, Small).
hardwareService (src/services/hardware.ts)
Device info retrieval: RAM, CPU cores, SoC model, storage.
Orchestration Services
generationService (src/services/generationService.ts:29)
Background-safe text generation orchestration:
- Maintains generation state independently of React component lifecycle
- Token batching: collects tokens and flushes to UI every 50ms
- Message queue: non-blocking input during active generation
- Tool loop integration (max 3 iterations, 5 tool calls)
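The token-batching behavior described above can be sketched as a small buffer that flushes on a timer. The class and method names here are illustrative, not the service's actual API.

```typescript
// Sketch of the 50 ms token-batching pattern: streamed tokens are buffered
// and flushed to the UI on an interval instead of triggering a re-render
// per token. Names are hypothetical stand-ins.
class TokenBatcher {
  private buffer: string[] = [];
  private timer: ReturnType<typeof setInterval> | null = null;

  constructor(
    private onFlush: (chunk: string) => void,
    private intervalMs = 50,
  ) {}

  start(): void {
    this.timer = setInterval(() => this.flush(), this.intervalMs);
  }

  push(token: string): void {
    this.buffer.push(token);
  }

  flush(): void {
    if (this.buffer.length === 0) return;
    const chunk = this.buffer.join('');
    this.buffer = [];
    this.onFlush(chunk);
  }

  stop(): void {
    if (this.timer) clearInterval(this.timer);
    this.timer = null;
    this.flush(); // drain any remaining tokens on shutdown
  }
}
```

Because the batcher lives in the service (not a component), tokens keep accumulating even if the chat screen unmounts mid-generation.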
Background-safe image generation with progressive preview:
- Continues generation when screens unmount
- Real-time step progress (1-50 steps)
- Preview images every N steps
- Optional LLM-based prompt enhancement
Management Services
activeModelService (src/services/activeModelService/index.ts:29)
Singleton for safe model loading/unloading:
- Guards against concurrent loads with promise deduplication
- Pre-load memory checks (60% RAM budget enforcement)
- Automatic model unload when switching models
- Synchronization with native state on app resume
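The concurrent-load guard and pre-load memory check can be sketched together. The class shape, callback signatures, and the way the 60% budget is applied are assumptions for illustration; only the two behaviors (promise deduplication and the RAM budget) come from the list above.

```typescript
// Hypothetical sketch: dedupe concurrent load calls by returning the
// in-flight promise, and refuse models that exceed a 60% RAM budget
// before touching native code.
class ModelLoader {
  private inflight: Promise<string> | null = null;

  constructor(
    private loadNative: (id: string) => Promise<string>,
    private modelSizeMb: (id: string) => number,
    private deviceRamMb: number,
  ) {}

  load(id: string): Promise<string> {
    // Guard against concurrent loads: every caller gets the same promise.
    if (this.inflight) return this.inflight;
    // Pre-load memory check: enforce the 60% RAM budget.
    if (this.modelSizeMb(id) > this.deviceRamMb * 0.6) {
      return Promise.reject(new Error('model exceeds RAM budget'));
    }
    this.inflight = this.loadNative(id).finally(() => {
      this.inflight = null;
    });
    return this.inflight;
  }
}
```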
Native Bridge Layer
JNI (Android) and Objective-C (iOS) bindings connect TypeScript to native C++ and Swift modules.
Native Module Layer
Cross-Platform
llama.rn — llama.cpp compiled for ARM64 with GPU acceleration:
- Android: OpenCL (Adreno GPUs), NEON/i8mm SIMD
- iOS: Metal GPU, Neural Engine for vision models
- Multimodal support via mmproj (CLIP vision encoder)
Android-Only
LocalDreamModule (android/app/src/main/java/ai/offgridmobile/localdream/LocalDreamModule.kt:32)
Stable Diffusion via the local-dream C++ library:
- MNN backend: CPU inference (all ARM64 devices)
- QNN backend: Qualcomm NPU (Snapdragon 8 Gen 1+)
- Subprocess architecture: spawns HTTP server on localhost:18081
- Automatic backend selection with CPU fallback
Native Android DownloadManager wrapper:
- Background downloads with system notifications
- Progress polling (500ms intervals)
- Persistent download tracking via SharedPreferences
- Cleanup of completed/stale downloads
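The cleanup pass over tracked downloads can be sketched as a pure filter. The record shape, field names, and the staleness window are illustrative assumptions; the source only states that completed and stale entries are cleaned up.

```typescript
// Hypothetical sketch of pruning persisted download records: keep only
// downloads that are still running and were updated recently. The 24-hour
// staleness window and the record fields are illustrative.
interface TrackedDownload {
  id: number;
  status: 'running' | 'successful' | 'failed';
  updatedAt: number; // epoch ms of last progress update
}

function pruneDownloads(
  tracked: TrackedDownload[],
  now: number,
  staleAfterMs = 24 * 60 * 60 * 1000,
): TrackedDownload[] {
  return tracked.filter(
    (d) => d.status === 'running' && now - d.updatedAt < staleAfterMs,
  );
}
```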
Text extraction using the PdfiumAndroid library with page-by-page processing.
iOS-Only
CoreMLDiffusionModule (ios/CoreMLDiffusionModule.swift:11)
Stable Diffusion via Apple’s ml-stable-diffusion pipeline:
- Neural Engine (ANE) + CPU compute units
- DPM-Solver scheduler for faster convergence
- Supports both SD 1.5/2.1 and SDXL models
- Palettized (6-bit) and full-precision (fp16) models
Text extraction using Apple’s PDFKit framework.
Technology Stack
| Layer | Technologies |
|---|---|
| UI Framework | React Native 0.83, TypeScript 5.x |
| State Management | Zustand 5.x with AsyncStorage persistence |
| Navigation | React Navigation 7.x |
| Animations | React Native Reanimated 4.x, haptic feedback |
| Text Inference | llama.cpp via llama.rn (GGUF format) |
| Vision Inference | llama.cpp multimodal (mmproj) |
| Voice Transcription | whisper.cpp via whisper.rn |
| Image Generation (Android) | local-dream (MNN/QNN backends) |
| Image Generation (iOS) | ml-stable-diffusion (Core ML) |
| PDF Extraction (Android) | PdfiumAndroid |
| PDF Extraction (iOS) | PDFKit |
| File Picker | @react-native-documents/picker |
| Document Viewer | @react-native-documents/viewer |
Data Flow
Text Generation Flow
- User Input → ChatScreen collects message + attachments
- Service Call → generationService.generateResponse() or .generateWithTools()
- Context Management → llmService passes all messages to llama.rn (no JS truncation)
- Native Inference → llama.cpp streams tokens via callback
- Token Batching → generationService buffers tokens, flushes every 50ms
- UI Update → chatStore updates streaming message
- Completion → Finalize message with metadata (tok/s, TTFT, generation time)
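The completion metadata in the last step can be sketched as a small calculation. The function and field names are illustrative; only the three metrics (tok/s, TTFT, generation time) come from the flow above.

```typescript
// Hypothetical sketch of computing completion metadata from timestamps.
// All times are in milliseconds; tok/s is measured over the decode phase
// (first token to last token).
interface GenerationMetrics {
  tokensPerSecond: number;
  ttftMs: number;       // time to first token
  generationMs: number; // total wall-clock generation time
}

function computeMetrics(
  startMs: number,
  firstTokenMs: number,
  endMs: number,
  tokenCount: number,
): GenerationMetrics {
  return {
    tokensPerSecond: tokenCount / ((endMs - firstTokenMs) / 1000),
    ttftMs: firstTokenMs - startMs,
    generationMs: endMs - startMs,
  };
}
```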
Vision Inference Flow
- Image Attachment → User attaches photo from camera/library
- mmproj Check → llmService verifies multimodal initialized
- OAI Message Format → Convert to OpenAI-compatible message with image URIs
- CLIP Encoding → Native CLIP processes image to embeddings
- LLM Processing → llama.cpp merges text + vision embeddings
- Response → Stream tokens as normal text generation
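The message conversion in step 3 can be sketched using the common OpenAI content-parts format. The exact shape llama.rn expects may differ; the types and function name here are illustrative.

```typescript
// Hypothetical sketch of converting a user prompt plus attached image URIs
// into an OpenAI-compatible multimodal message (step 3 above).
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

function toMultimodalMessage(text: string, imageUris: string[]) {
  const content: ContentPart[] = [
    { type: 'text', text },
    ...imageUris.map((uri) => ({
      type: 'image_url' as const,
      image_url: { url: uri },
    })),
  ];
  return { role: 'user' as const, content };
}
```

The image parts carry local file URIs rather than remote URLs, since the CLIP encoder runs on-device.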
Image Generation Flow (Android)
- Prompt Input → User sends text or enables image mode toggle
- Intent Detection → Pattern matching or LLM-based classification
- Prompt Enhancement → Optional: LLM expands prompt (“a dog” → detailed 75-word description)
- Model Load → LocalDreamModule starts subprocess server (MNN or QNN)
- HTTP Request → TypeScript sends POST to localhost:18081/generate
- SSE Stream → Server sends progress events (step/totalSteps) + preview images
- RGB → PNG → Native code decodes base64 RGB, converts to PNG
- Gallery Save → Image stored in app files, added to gallery
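The SSE progress stream in step 6 implies a small client-side parser. The payload shape below is an illustrative assumption; only the step/totalSteps fields and periodic previews are stated above.

```typescript
// Hypothetical sketch of parsing one SSE data line from the local-dream
// subprocess. SSE frames carry lines like:
//   data: {"step":10,"totalSteps":50}
// Non-data lines (comments, keep-alives) and malformed JSON yield null.
interface ProgressEvent {
  step: number;
  totalSteps: number;
  preview?: string; // base64 preview image, present every N steps
}

function parseSseData(line: string): ProgressEvent | null {
  if (!line.startsWith('data:')) return null;
  try {
    return JSON.parse(line.slice(5).trim()) as ProgressEvent;
  } catch {
    return null;
  }
}
```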
Image Generation Flow (iOS)
- Prompt Input → Same as Android
- Pipeline Load → CoreMLDiffusionModule loads StableDiffusionPipeline
- Generation → Native Swift calls pipeline.generateImages() with a progress callback
- ANE Acceleration → Neural Engine processes UNet denoising steps
- PNG Save → CGImage converted to PNG, stored in documents
- Gallery Save → Same as Android
Design Patterns
See System Design for detailed patterns:
- Singleton services
- Background-safe orchestration
- Memory-first loading strategy
- Combined asset tracking (vision models + mmproj)
- State cleanup patterns