Chat Configuration

The configure method allows you to customize how your LLM handles conversations, manages context, and generates responses.

Configuration Overview

llm.configure({
  chatConfig: { /* ... */ },
  toolsConfig: { /* ... */ },
  generationConfig: { /* ... */ },
});

Chat Configuration

The chatConfig object manages conversation behavior and context handling.

System Prompt

Define the model’s personality and instructions:
llm.configure({
  chatConfig: {
    systemPrompt: 'You are a helpful AI assistant specializing in React Native development.',
  },
});
chatConfig.systemPrompt
string
Instructions that define the model’s behavior and personality. This appears at the start of every conversation context.
Common use cases:
  • Setting personality (“You are a friendly tutor”)
  • Defining expertise (“You are an expert in TypeScript”)
  • Setting constraints (“Answer in one sentence”)
  • Formatting output (“Respond in JSON format”)

Initial Message History

Provide conversation context at initialization:
llm.configure({
  chatConfig: {
    systemPrompt: 'You are a helpful assistant.',
    initialMessageHistory: [
      { role: 'user', content: 'Hello!' },
      { role: 'assistant', content: 'Hi! How can I help you today?' },
    ],
  },
});
chatConfig.initialMessageHistory
Message[]
Pre-populate the conversation history. Useful for:
  • Resuming saved conversations
  • Providing examples (few-shot prompting)
  • Setting conversation tone
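For few-shot prompting, the pre-populated history pairs example inputs with the answers you want the model to imitate. The sketch below builds such a history; the `Message` type is a local stand-in matching the shape shown above, not imported from the library.

```typescript
// Local stand-in for the library's Message type (assumption: the real
// type comes from react-native-executorch).
type Message = { role: 'user' | 'assistant'; content: string };

// Few-shot examples teach the model the expected answer format before
// the real conversation starts: here, terse one-word translations.
const fewShotHistory: Message[] = [
  { role: 'user', content: 'Translate to French: Hello' },
  { role: 'assistant', content: 'Bonjour' },
  { role: 'user', content: 'Translate to French: Good night' },
  { role: 'assistant', content: 'Bonne nuit' },
];

// Pass this as chatConfig.initialMessageHistory in llm.configure().
```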

Context Strategy

Manage how conversation history fits within the model’s context window:
import { SlidingWindowContextStrategy } from 'react-native-executorch/utils';

llm.configure({
  chatConfig: {
    contextStrategy: new SlidingWindowContextStrategy(
      1000, // Buffer 1000 tokens for generation
      false // Don't allow orphaned assistant messages
    ),
  },
});
chatConfig.contextStrategy
ContextStrategy
Strategy for managing the conversation context window. See Context Strategies for detailed information.
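To build intuition for what a sliding-window strategy does, the sketch below trims a history to a token budget by dropping the oldest messages first. This is an illustration only, not the library's implementation — use `SlidingWindowContextStrategy` in practice, and note the ~4-characters-per-token estimate is a deliberate simplification.

```typescript
type Message = { role: string; content: string };

// Crude token estimate (~4 characters per token), for illustration only.
const estimateTokens = (m: Message) => Math.ceil(m.content.length / 4);

// Keep the newest messages that fit within maxTokens; drop the rest.
function slideWindow(history: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk from newest to oldest, keeping messages while the budget allows.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```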

Tools Configuration

Enable the model to call external functions. See Tool Calling for complete details.
llm.configure({
  toolsConfig: {
    tools: [/* tool definitions */],
    executeToolCallback: async (call) => {
      // Execute the tool and return result
    },
    displayToolCalls: false,
  },
});
toolsConfig.tools
LLMTool[]
Array of tool definitions. Format depends on your model’s chat template.
toolsConfig.executeToolCallback
function
Async function that receives a ToolCall and returns the result as a string.
executeToolCallback: (call: ToolCall) => Promise<string | null>
toolsConfig.displayToolCalls
boolean
default:"false"
Whether to include JSON tool call representations in the message history. If false, only the final answers are shown.
Tool calling only works if your model’s chat template supports it. Most instruction-tuned models like Llama 3.2 support tool calling.
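A typical callback dispatches on the tool's name and returns the result as a string (or `null` for unrecognized calls). The sketch below uses a hypothetical `get_weather` tool with local stand-in types; the real `ToolCall` type and the exact `LLMTool` schema come from the library and your model's chat template, so treat the field names here as assumptions.

```typescript
// Local stand-in for the library's ToolCall shape (assumption: field
// names may differ from react-native-executorch's actual type).
interface ToolCall {
  toolName: string;
  arguments: Record<string, unknown>;
}

// Hypothetical tool definition; the exact schema depends on the model's
// chat template.
const tools = [
  {
    name: 'get_weather',
    description: 'Get the current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
];

// Dispatch on tool name; return the result as a string, or null when
// the call is unrecognized.
async function executeToolCallback(call: ToolCall): Promise<string | null> {
  if (call.toolName === 'get_weather') {
    const city = String(call.arguments.city);
    // Replace this stub with a real API request in production.
    return JSON.stringify({ city, tempC: 21, condition: 'sunny' });
  }
  return null;
}
```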

Generation Configuration

Control how the model generates text:
llm.configure({
  generationConfig: {
    temperature: 0.7,
    topp: 0.9,
    outputTokenBatchSize: 10,
    batchTimeInterval: 100,
  },
});
generationConfig.temperature
number
default:"1.0"
Controls randomness and creativity in generation.
  • Lower values (0.1-0.5): More focused, deterministic, factual
  • Medium values (0.6-0.9): Balanced creativity and coherence
  • Higher values (1.0+): More creative, diverse, potentially less coherent
// For factual Q&A
temperature: 0.3

// For creative writing
temperature: 0.9
generationConfig.topp
number
default:"1.0"
Nucleus sampling parameter. Only samples from tokens whose cumulative probability exceeds this value.
  • 0.9: Recommended for most use cases (good balance)
  • 0.95: Slightly more diverse
  • 1.0: Consider all tokens (no filtering)
topp: 0.9 // Recommended default
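The mechanics of nucleus sampling can be sketched as a filter: keep the smallest set of highest-probability tokens whose cumulative probability reaches `topp`, then renormalize and sample only from that set. This is an educational illustration, not the library's sampler.

```typescript
// Illustrative top-p (nucleus) filter: keep tokens until their cumulative
// probability reaches topP, then renormalize the kept probabilities.
function nucleusFilter(
  probs: Map<string, number>,
  topP: number
): Map<string, number> {
  // Sort tokens by descending probability.
  const sorted = [...probs.entries()].sort((a, b) => b[1] - a[1]);
  const kept: [string, number][] = [];
  let cumulative = 0;
  for (const [token, p] of sorted) {
    kept.push([token, p]);
    cumulative += p;
    if (cumulative >= topP) break; // nucleus reached
  }
  // Renormalize so the kept probabilities sum to 1.
  const total = kept.reduce((sum, [, p]) => sum + p, 0);
  return new Map(kept.map(([t, p]) => [t, p / total]));
}
```

With `topp: 1.0` nothing is filtered; lowering it cuts off the long tail of unlikely tokens, which is why 0.9 trades a little diversity for more coherent output.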
generationConfig.outputTokenBatchSize
number
default:"1"
Soft upper limit on tokens per batch for streaming. Higher values reduce update frequency but may improve performance.
// Update every token (smooth but frequent)
outputTokenBatchSize: 1

// Update every 10 tokens (less smooth but more efficient)
outputTokenBatchSize: 10
This is a “soft” limit. In certain cases (like emoji sequences), batches may be larger to avoid breaking combined characters.
generationConfig.batchTimeInterval
number
default:"0"
Maximum time (in milliseconds) between token batch emissions. Works with outputTokenBatchSize.
batchTimeInterval: 100 // Emit at least every 100ms

Complete Configuration Example

import React, { useEffect } from 'react';
import { useLLM } from 'react-native-executorch';
import { LLAMA3_2_3B } from 'react-native-executorch/constants';
import { SlidingWindowContextStrategy } from 'react-native-executorch/utils';

function ConfiguredChat() {
  const llm = useLLM({ model: LLAMA3_2_3B });

  useEffect(() => {
    if (llm.isReady) {
      llm.configure({
        // Chat configuration
        chatConfig: {
          systemPrompt: `You are an expert React Native developer.
          Provide concise, accurate answers with code examples when relevant.
          Always explain your reasoning.`,
          
          // Optional: Provide initial context
          initialMessageHistory: [
            {
              role: 'user',
              content: 'I need help with React Native.',
            },
            {
              role: 'assistant',
              content: 'I\'d be happy to help! What specific aspect of React Native are you working on?',
            },
          ],
          
          // Use sliding window to manage context automatically
          contextStrategy: new SlidingWindowContextStrategy(
            2000, // Reserve 2000 tokens for generation
            false // Ensure user-assistant message pairs stay together
          ),
        },
        
        // Generation configuration
        generationConfig: {
          temperature: 0.7, // Balanced creativity
          topp: 0.9,        // Nucleus sampling
          outputTokenBatchSize: 5, // Update every 5 tokens
          batchTimeInterval: 50,   // Or every 50ms
        },
      });
    }
  }, [llm.isReady]);

  // Rest of component...
}

Dynamic Reconfiguration

You can call configure multiple times to adjust behavior:
// Switch to creative mode
const setCreativeMode = () => {
  llm.configure({
    chatConfig: {
      systemPrompt: 'You are a creative writing assistant.',
    },
    generationConfig: {
      temperature: 1.2,
      topp: 0.95,
    },
  });
};

// Switch to factual mode
const setFactualMode = () => {
  llm.configure({
    chatConfig: {
      systemPrompt: 'You are a precise, factual assistant.',
    },
    generationConfig: {
      temperature: 0.3,
      topp: 0.9,
    },
  });
};

Type Definitions

interface LLMConfig {
  chatConfig?: Partial<ChatConfig>;
  toolsConfig?: ToolsConfig;
  generationConfig?: GenerationConfig;
}

interface ChatConfig {
  initialMessageHistory: Message[];
  systemPrompt: string;
  contextStrategy: ContextStrategy;
}

interface GenerationConfig {
  temperature?: number;
  topp?: number;
  outputTokenBatchSize?: number;
  batchTimeInterval?: number;
}

interface ToolsConfig {
  tools: LLMTool[];
  executeToolCallback: (call: ToolCall) => Promise<string | null>;
  displayToolCalls?: boolean;
}

Best Practices

  1. Always set a system prompt - It helps guide the model’s behavior consistently
  2. Choose appropriate temperature - Lower for factual tasks, higher for creative tasks
  3. Use context strategies - Prevent context overflow errors with SlidingWindowContextStrategy
  4. Configure on mount - Set up configuration in a useEffect after isReady becomes true
  5. Batch token updates - Use outputTokenBatchSize > 1 for better performance in production
