Overview

LLMModule provides a class-based interface for managing Large Language Model instances. It handles model loading, message management, text generation, and conversation context.

When to Use

Use LLMModule when:
  • You need fine-grained control over the model lifecycle
  • You’re working outside React components
  • You need to manage multiple model instances programmatically
  • You want to integrate LLM capabilities into non-React code
Use useLLM hook when:
  • Building React components
  • You want automatic lifecycle management
  • You prefer declarative state management
  • You need React state integration

Constructor

new LLMModule({
  tokenCallback?: (token: string) => void;
  messageHistoryCallback?: (messageHistory: Message[]) => void;
})
Creates a new LLM module instance with optional callbacks.

Parameters

tokenCallback
(token: string) => void
Optional function called on every generated token with that token as its argument.
messageHistoryCallback
(messageHistory: Message[]) => void
Optional function called after every completed message, receiving the entire message history.

Example

import { LLMModule } from 'react-native-executorch';

const llm = new LLMModule({
  tokenCallback: (token) => {
    console.log('New token:', token);
  },
  messageHistoryCallback: (history) => {
    console.log('Updated history:', history);
  }
});

Methods

load()

async load(
  model: {
    modelSource: ResourceSource;
    tokenizerSource: ResourceSource;
    tokenizerConfigSource: ResourceSource;
  },
  onDownloadProgressCallback?: (progress: number) => void
): Promise<void>
Loads the LLM model and tokenizer.

Parameters

model.modelSource
ResourceSource
required
Resource location of the model binary.
model.tokenizerSource
ResourceSource
required
Resource pointing to the tokenizer JSON file.
model.tokenizerConfigSource
ResourceSource
required
Resource pointing to the tokenizer config JSON file.
onDownloadProgressCallback
(progress: number) => void
Optional callback to track download progress (value between 0 and 1).

Example

await llm.load({
  modelSource: 'https://example.com/model.pte',
  tokenizerSource: 'https://example.com/tokenizer.json',
  tokenizerConfigSource: 'https://example.com/tokenizer_config.json'
}, (progress) => {
  console.log(`Download progress: ${(progress * 100).toFixed(1)}%`);
});

configure()

configure(config: LLMConfig): void
Configures chat, tool calling, and generation settings.

Parameters

config
LLMConfig
required
Configuration object containing chatConfig, toolsConfig, and generationConfig.

Example

llm.configure({
  chatConfig: {
    systemPrompt: 'You are a helpful assistant.'
  },
  generationConfig: {
    temperature: 0.7,
    topP: 0.9,
    maxTokens: 512
  }
});

forward()

async forward(input: string): Promise<string>
Runs model inference on a raw input string. You must provide the entire conversation and prompt yourself, in the correct format with the model's special tokens. This method doesn't manage conversation context.

Parameters

input
string
required
Raw input string containing the prompt and conversation history.

Returns

The generated response as a string.

Example

const response = await llm.forward('<|begin_of_text|>Hello, how are you?<|eot_id|>');
console.log(response);
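Because forward() applies no chat template, the caller assembles the prompt string. The sketch below builds a Llama-3-style prompt; the special tokens shown are illustrative and must match whatever model you actually load (the `ChatMessage` type and `formatPrompt` helper are hypothetical, not part of the library):

```typescript
// Illustrative Llama-3-style chat template; the exact special tokens
// depend on the model and tokenizer you load.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

function formatPrompt(messages: ChatMessage[]): string {
  const body = messages
    .map(
      (m) =>
        `<|start_header_id|>${m.role}<|end_header_id|>\n\n${m.content}<|eot_id|>`
    )
    .join('');
  // The trailing assistant header cues the model to generate its reply.
  return `<|begin_of_text|>${body}<|start_header_id|>assistant<|end_header_id|>\n\n`;
}

const prompt = formatPrompt([{ role: 'user', content: 'Hello, how are you?' }]);
// const response = await llm.forward(prompt);
```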

generate()

async generate(messages: Message[], tools?: LLMTool[]): Promise<string>
Runs the model to complete the chat passed in the messages argument. It doesn’t manage conversation context.

Parameters

messages
Message[]
required
Array of messages representing the chat history.
tools
LLMTool[]
Optional array of tools that can be used during generation.

Returns

The generated response as a string.

Example

const response = await llm.generate([
  { role: 'user', content: 'What is the capital of France?' }
]);
console.log(response); // "The capital of France is Paris."
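The optional tools argument lets the model call functions during generation. The tool definition below is a hypothetical sketch; the real LLMTool type in the library may use a different schema, so treat the field names here as assumptions:

```typescript
// Hypothetical tool definition. The exact LLMTool schema is an
// assumption here; check the library's LLMTool type for the real shape.
const getWeatherTool = {
  name: 'get_weather',
  description: 'Returns the current weather for a given city',
  parameters: {
    type: 'object',
    properties: {
      city: { type: 'string', description: 'City name' },
    },
    required: ['city'],
  },
};

// With a loaded module, tools are passed as the second argument:
// const response = await llm.generate(
//   [{ role: 'user', content: "What's the weather in Paris?" }],
//   [getWeatherTool]
// );
```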

sendMessage()

async sendMessage(message: string): Promise<Message[]>
Adds a user message to the conversation and generates a model response. Once the response completes, messageHistoryCallback() is called with the updated history containing both the user message and the model response.

Parameters

message
string
required
The message string to send.

Returns

Updated message history including the new user message and model response.

Example

const history = await llm.sendMessage('Tell me a joke');
console.log(history);
// [
//   { role: 'user', content: 'Tell me a joke' },
//   { role: 'assistant', content: 'Why did the chicken cross...' }
// ]

deleteMessage()

deleteMessage(index: number): Message[]
Deletes the message at the specified index and every message after it. After deletion, it calls messageHistoryCallback() with the new history.

Parameters

index
number
required
The index at which deletion starts; this message and all later messages are removed.

Returns

Updated message history after deletion.

Example

const newHistory = llm.deleteMessage(2);
console.log(newHistory); // History with messages from index 2 onwards removed
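The truncation semantics can be pictured as plain array slicing (illustrative only, not the library's implementation):

```typescript
// Illustrative: deleteMessage(index) behaves like truncating the
// history just before that index.
const history = [
  { role: 'user', content: 'Hi' },
  { role: 'assistant', content: 'Hello! How can I help?' },
  { role: 'user', content: 'Tell me a joke' },
  { role: 'assistant', content: 'Why did the chicken cross...' },
];

// llm.deleteMessage(2) keeps only the messages before index 2.
const truncated = history.slice(0, 2);
console.log(truncated.length); // 2
```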

interrupt()

interrupt(): void
Interrupts model generation. One additional token may still be emitted after the interruption.

Example

llm.interrupt();

setTokenCallback()

setTokenCallback({ tokenCallback }: { tokenCallback: (token: string) => void }): void
Sets a new token callback, invoked on every generated token.

Parameters

tokenCallback
(token: string) => void
required
Callback function to handle new tokens.

Example

llm.setTokenCallback({
  tokenCallback: (token) => console.log('Token:', token)
});

getGeneratedTokenCount()

getGeneratedTokenCount(): number
Returns the number of tokens generated in the last response.

Returns

The count of generated tokens.

getPromptTokensCount()

getPromptTokensCount(): number
Returns the number of prompt tokens used in the most recent generation.

Returns

The count of prompt tokens.

getTotalTokensCount()

getTotalTokensCount(): number
Returns the total number of tokens from the previous generation (sum of prompt and generated tokens).

Returns

The count of prompt and generated tokens.
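The three counters satisfy total = prompt + generated. A minimal sketch with illustrative numbers (in practice the values come from the module's getters after a generation has finished):

```typescript
// Illustrative values; in practice these come from
// llm.getPromptTokensCount() and llm.getGeneratedTokenCount().
const promptTokens = 42;
const generatedTokens = 128;

// getTotalTokensCount() equals the sum of the two:
const totalTokens = promptTokens + generatedTokens;
console.log(`prompt=${promptTokens} generated=${generatedTokens} total=${totalTokens}`);
```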

delete()

delete(): void
Deletes the model from memory. The model cannot be deleted while it is generating; call interrupt() first and wait for generation to stop.

Example

llm.interrupt();
// Wait for generation to stop
llm.delete();

Complete Example

import { LLMModule } from 'react-native-executorch';

// Create instance
const llm = new LLMModule({
  tokenCallback: (token) => {
    process.stdout.write(token);
  },
  messageHistoryCallback: (history) => {
    console.log('\nConversation updated:', history.length, 'messages');
  }
});

// Load model
await llm.load({
  modelSource: 'https://example.com/llama-3.2-1B.pte',
  tokenizerSource: 'https://example.com/tokenizer.json',
  tokenizerConfigSource: 'https://example.com/tokenizer_config.json'
}, (progress) => {
  console.log(`Loading: ${(progress * 100).toFixed(0)}%`);
});

// Configure
llm.configure({
  chatConfig: {
    systemPrompt: 'You are a helpful coding assistant.'
  },
  generationConfig: {
    temperature: 0.7,
    maxTokens: 256
  }
});

// Send messages
const history = await llm.sendMessage('Explain React hooks');

// Check token usage
console.log('Tokens used:', llm.getTotalTokensCount());

// Clean up
llm.delete();
