
Overview

The useLLM hook manages a Large Language Model (LLM) instance for text generation and chat applications. It handles model loading, conversation management, token generation, and provides methods for configuration and inference.

Import

import { useLLM } from 'react-native-executorch';

Hook Signature

const llm = useLLM({ model, preventLoad }: LLMProps): LLMType

Parameters

model (object, required)
  Object containing the model sources.

preventLoad (boolean, default: false)
  If true, prevents automatic model loading and downloading when the hook mounts.

Return Value

Returns an object with the following properties and methods:

State Properties

messageHistory (Message[])
  Array of all messages in the conversation. Updated after each model response.

response (string)
  The response generated so far. Updated with each token the model emits.

token (string)
  The most recently generated token.

isReady (boolean)
  Indicates whether the model is loaded and ready for inference.

isGenerating (boolean)
  Indicates whether the model is currently generating a response.

downloadProgress (number)
  Download progress as a value between 0 and 1.

error (RnExecutorchError | null)
  Contains error details if the model fails to load or encounters an error during inference.

Methods

configure (function)
  Configures chat and tool-calling settings.
  configure(config: LLMConfig): void

generate (function)
  Generates a text completion for the provided messages without touching the managed conversation context.
  generate(messages: Message[], tools?: LLMTool[]): Promise<string>
  Returns a promise that resolves to the generated text.

sendMessage (function)
  Sends a user message and manages the conversation context automatically.
  sendMessage(message: string): Promise<string>
  Returns a promise that resolves to the model's response. Updates messageHistory with both the user message and the model's response.

deleteMessage (function)
  Deletes all messages from the specified index onward.
  deleteMessage(index: number): void
  Updates messageHistory after deletion.

interrupt (function)
  Interrupts the current text generation.
  interrupt(): void

getGeneratedTokenCount (function)
  Returns the number of tokens generated in the current generation.
  getGeneratedTokenCount(): number

getPromptTokenCount (function)
  Returns the number of prompt tokens in the last message.
  getPromptTokenCount(): number

getTotalTokenCount (function)
  Returns the total number of tokens (prompt + generated) from the previous generation.
  getTotalTokenCount(): number
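
To illustrate how interrupt and the token-count methods fit together, here is a minimal sketch of a "stop" handler. The LLMControls interface and the stopAndReport helper are hypothetical names introduced for this example; they model only the subset of the hook's return value used here.

```typescript
// Hypothetical subset of the useLLM return value, mirroring the methods above.
interface LLMControls {
  isGenerating: boolean;
  interrupt: () => void;
  getPromptTokenCount: () => number;
  getGeneratedTokenCount: () => number;
  getTotalTokenCount: () => number;
}

// Stop any in-flight generation, then summarize token usage for logging.
function stopAndReport(llm: LLMControls): string {
  if (llm.isGenerating) {
    llm.interrupt();
  }
  return `prompt=${llm.getPromptTokenCount()} generated=${llm.getGeneratedTokenCount()} total=${llm.getTotalTokenCount()}`;
}
```

In an app, you would wire stopAndReport to a "Stop" button's onPress and pass it the object returned by useLLM.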

Usage Examples

Basic Chat Application

import { useLLM } from 'react-native-executorch';
import { useState } from 'react';
import { View, Text, TextInput, Button, ScrollView } from 'react-native';

function ChatScreen() {
  const [input, setInput] = useState('');
  
  const llm = useLLM({
    model: {
      modelSource: 'https://huggingface.co/.../model.pte',
      tokenizerSource: 'https://huggingface.co/.../tokenizer.json',
    },
  });
  
  const handleSend = async () => {
    if (!input.trim() || !llm.isReady) return;
    
    try {
      await llm.sendMessage(input);
      setInput('');
    } catch (error) {
      console.error('Generation failed:', error);
    }
  };
  
  return (
    <View>
      <Text>Status: {llm.isReady ? 'Ready' : 'Loading...'}</Text>
      <Text>Progress: {(llm.downloadProgress * 100).toFixed(0)}%</Text>
      
      <ScrollView>
        {llm.messageHistory.map((msg, idx) => (
          <View key={idx}>
            <Text>{msg.role}: {msg.content}</Text>
          </View>
        ))}
        
        {llm.isGenerating && (
          <View>
            <Text>assistant: {llm.response}</Text>
          </View>
        )}
      </ScrollView>
      
      <TextInput
        value={input}
        onChangeText={setInput}
        placeholder="Type a message..."
      />
      <Button title="Send" onPress={handleSend} disabled={!llm.isReady} />
    </View>
  );
}

Configured LLM with System Prompt

import { useLLM } from 'react-native-executorch';
import { useEffect } from 'react';
import { View } from 'react-native';

function TranslatorApp() {
  const llm = useLLM({
    model: {
      modelSource: require('./models/llama-3.2-1b.pte'),
      tokenizerSource: require('./models/tokenizer.json'),
    },
  });
  
  useEffect(() => {
    if (llm.isReady) {
      llm.configure({
        chatConfig: {
          systemPrompt: 'You are a helpful translator. Translate user messages to French.',
          initialMessageHistory: [],
        },
        generationConfig: {
          temperature: 0.7,
          topp: 0.9,
        },
      });
    }
  }, [llm.isReady]);
  
  return (
    <View>
      {/* UI implementation */}
    </View>
  );
}

Direct Generation (No Context)

import { useLLM, Message } from 'react-native-executorch';
import { View } from 'react-native';

function SummarizationTool() {
  const llm = useLLM({
    model: {
      modelSource: 'https://example.com/model.pte',
      tokenizerSource: 'https://example.com/tokenizer.json',
    },
  });

  const summarize = async (text: string) => {
    // Annotate as Message[] so the role fields narrow to the expected union.
    const messages: Message[] = [
      { role: 'system', content: 'Summarize the following text concisely.' },
      { role: 'user', content: text },
    ];

    const summary = await llm.generate(messages);
    return summary;
  };
  
  return (
    <View>
      {/* UI implementation */}
    </View>
  );
}

Streaming Tokens

import { useLLM } from 'react-native-executorch';
import { useEffect } from 'react';
import { View, Text } from 'react-native';

function StreamingChat() {
  const llm = useLLM({
    model: {
      modelSource: require('./models/model.pte'),
      tokenizerSource: require('./models/tokenizer.json'),
    },
  });
  
  // Display each token as it's generated
  useEffect(() => {
    if (llm.token) {
      console.log('New token:', llm.token);
    }
  }, [llm.token]);
  
  return (
    <View>
      <Text>Current response: {llm.response}</Text>
    </View>
  );
}

Types

Message

interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

ToolCall

interface ToolCall {
  toolName: string;
  arguments: Object;
}

ContextStrategy

interface ContextStrategy {
  buildContext(
    systemPrompt: string,
    history: Message[],
    maxContextLength: number,
    getTokenCount: (messages: Message[]) => number
  ): Message[];
}
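
One way to satisfy this interface is a sliding-window strategy that always keeps the system prompt and drops the oldest history messages until the context fits. This is a sketch, not the library's built-in behavior: the windowing policy and the slidingWindowStrategy name are assumptions, and the Message interface is reproduced inline to keep the example self-contained.

```typescript
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

// Sliding-window ContextStrategy sketch: keep the system prompt, drop the
// oldest history entries until the token count fits maxContextLength.
const slidingWindowStrategy = {
  buildContext(
    systemPrompt: string,
    history: Message[],
    maxContextLength: number,
    getTokenCount: (messages: Message[]) => number
  ): Message[] {
    const system: Message = { role: 'system', content: systemPrompt };
    let window = history.slice();
    while (window.length > 0 && getTokenCount([system, ...window]) > maxContextLength) {
      window = window.slice(1); // drop the oldest message first
    }
    return [system, ...window];
  },
};
```

The strategy degrades gracefully: if even a single message exceeds the budget, it still returns the system prompt plus the most recent message rather than an empty context.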

Notes

- The hook automatically loads the model when mounted unless preventLoad is set to true.
- The model and tokenizer files can be large. Monitor downloadProgress to provide user feedback during the initial download.
- Use the token-count methods (getPromptTokenCount, getGeneratedTokenCount, getTotalTokenCount) to monitor token usage and tune context management for your use case.
