useLLM Hook
The useLLM hook provides a comprehensive interface for managing Large Language Model instances in React Native.
Import
import { useLLM } from 'react-native-executorch';
import { LLAMA3_2_1B } from 'react-native-executorch/constants';
Parameters
The hook accepts an object of type LLMProps:
model: Model configuration object with the following sources:
model.modelSource: Location of the model binary file (.pte)
model.tokenizerSource: Location of the tokenizer JSON file
model.tokenizerConfigSource: Location of the tokenizer config JSON file (optional)
preventLoad: Prevents automatic model loading and downloading on mount. Useful when you want to defer loading. (optional)
Example
const llm = useLLM({
model: LLAMA3_2_1B,
preventLoad: false, // Load immediately
});
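Because preventLoad defers both the download and the load, one way to trigger loading later is to drive the flag from component state. A minimal sketch, assuming only the documented preventLoad behavior (the consent state and button are illustrative, not part of the library):

```typescript
import React, { useState } from 'react';
import { Button, Text, View } from 'react-native';
import { useLLM } from 'react-native-executorch';
import { LLAMA3_2_1B } from 'react-native-executorch/constants';

export function DeferredModelLoader() {
  // Keep the model idle until the user opts in, e.g. to avoid a large
  // download on a metered connection.
  const [consented, setConsented] = useState(false);

  // While preventLoad is true, the hook neither downloads nor loads the model;
  // flipping it to false starts the normal load flow.
  const llm = useLLM({ model: LLAMA3_2_1B, preventLoad: !consented });

  return (
    <View>
      <Button title="Download model" onPress={() => setConsented(true)} />
      {consented && <Text>{Math.round(llm.downloadProgress * 100)}%</Text>}
    </View>
  );
}
```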
Return Values
The hook returns an object of type LLMType with the following properties and methods:
State Properties
messageHistory: Message[]
Array containing all messages in the conversation. Updated after the model responds to sendMessage. Each message has:
role: 'user' | 'assistant' | 'system'
content: string
response: string
The accumulated generated response. Updated with each token during generation.
token: string
The most recently generated token. Updates in real time during generation.
isReady: boolean
Indicates whether the model has finished loading and is ready for inference.
isGenerating: boolean
Indicates whether the model is currently generating a response.
downloadProgress: number
Download progress as a value between 0 and 1. Useful for showing progress indicators during the initial model download.
error: RnExecutorchError | null
Contains error information if the model failed to load or an error occurred during generation.
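The token property is useful for side effects that should run once per generated token, such as logging or token-by-token animation. A minimal sketch as a custom hook (the logging effect is illustrative, not part of the library):

```typescript
import { useEffect } from 'react';
import { useLLM } from 'react-native-executorch';
import { LLAMA3_2_1B } from 'react-native-executorch/constants';

export function useTokenLogger() {
  const llm = useLLM({ model: LLAMA3_2_1B });

  // Runs when a new token arrives; llm.response holds the text so far.
  // Note: React skips the effect if two consecutive tokens are identical
  // strings, so this is a sketch rather than an exact per-token counter.
  useEffect(() => {
    if (llm.isGenerating && llm.token) {
      console.log('new token:', llm.token);
    }
  }, [llm.token]);

  return llm;
}
```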
Methods
configure: (config: LLMConfig) => void
Configures the LLM's chat behavior, tool calling, and generation parameters.
See Chat Configuration for details.
sendMessage: (message: string) => Promise<string>
Sends a user message and generates a response. Automatically manages conversation history.
Parameters:
message: The user’s message string
Returns: Promise resolving to the model's complete response
Example:
const response = await llm.sendMessage('Hello, how are you?');
console.log(response); // "I'm doing well, thank you for asking!"
generate: (messages: Message[], tools?: LLMTool[]) => Promise<string>
Runs inference on a custom message array. Does not manage conversation context automatically.
Parameters:
messages: Array of Message objects representing the chat history
tools: Optional array of tool definitions for tool calling
Returns: Promise resolving to the generated text
Example:
const messages = [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'What is 2+2?' },
];
const response = await llm.generate(messages);
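When passing tools, the exact LLMTool shape is not specified here; the sketch below assumes an OpenAI-style function definition, which is an assumption rather than a documented contract (see Chat Configuration for the supported format):

```typescript
// `llm` is the object returned by useLLM, as above.
// Hypothetical tool definition; the field layout assumes an OpenAI-style schema.
const tools = [
  {
    name: 'get_weather',
    description: 'Get the current weather for a city.',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
];

const output = await llm.generate(
  [{ role: 'user', content: 'What is the weather in Krakow?' }],
  tools,
);
```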
deleteMessage: (index: number) => void
Deletes all messages starting from the specified index. Updates messageHistory.
Parameters:
index: The index of the message to start deleting from
Example:
// Delete the last 2 messages
llm.deleteMessage(llm.messageHistory.length - 2);
interrupt: () => void
Interrupts the current inference generation.
Example:
<Button onPress={() => llm.interrupt()} title="Stop" />
getGeneratedTokenCount: () => number
Returns the number of tokens generated in the current/last generation.
getPromptTokenCount: () => number
Returns the number of tokens in the prompt of the last generation.
getTotalTokenCount: () => number
Returns the total number of tokens (prompt plus generated) from the last generation.
Example:
useEffect(() => {
if (!llm.isGenerating && llm.response) {
console.log('Prompt tokens:', llm.getPromptTokenCount());
console.log('Generated tokens:', llm.getGeneratedTokenCount());
console.log('Total tokens:', llm.getTotalTokenCount());
}
}, [llm.isGenerating]);
Complete Example
import React, { useEffect, useState } from 'react';
import { View, Text, TextInput, Button, FlatList } from 'react-native';
import { useLLM } from 'react-native-executorch';
import { LLAMA3_2_1B } from 'react-native-executorch/constants';
export default function ChatApp() {
const [input, setInput] = useState('');
const llm = useLLM({ model: LLAMA3_2_1B });
useEffect(() => {
if (llm.isReady) {
llm.configure({
chatConfig: {
systemPrompt: 'You are a helpful and friendly AI assistant.',
},
generationConfig: {
temperature: 0.7,
topP: 0.9,
},
});
}
}, [llm.isReady]);
const handleSend = async () => {
if (input.trim() && !llm.isGenerating) {
const message = input;
setInput('');
await llm.sendMessage(message);
}
};
if (llm.error) {
return <Text>Error: {llm.error.message}</Text>;
}
if (!llm.isReady) {
return (
<View>
<Text>Loading model...</Text>
<Text>{Math.round(llm.downloadProgress * 100)}%</Text>
</View>
);
}
return (
<View style={{ flex: 1 }}>
<FlatList
data={llm.messageHistory}
keyExtractor={(_, idx) => idx.toString()}
renderItem={({ item }) => (
<View>
<Text style={{ fontWeight: 'bold' }}>{item.role}:</Text>
<Text>{item.content}</Text>
</View>
)}
/>
{llm.isGenerating && (
<View>
<Text style={{ fontWeight: 'bold' }}>Assistant:</Text>
<Text>{llm.response}</Text>
</View>
)}
<View style={{ flexDirection: 'row' }}>
<TextInput
value={input}
onChangeText={setInput}
placeholder="Type a message..."
style={{ flex: 1 }}
/>
<Button
title="Send"
onPress={handleSend}
disabled={llm.isGenerating}
/>
</View>
</View>
);
}
Type Definitions
interface Message {
role: 'user' | 'assistant' | 'system';
content: string;
}
type ResourceSource = string | { uri: string } | number;
interface LLMProps {
model: {
modelSource: ResourceSource;
tokenizerSource: ResourceSource;
tokenizerConfigSource?: ResourceSource;
};
preventLoad?: boolean;
}
interface LLMType {
messageHistory: Message[];
response: string;
token: string;
isReady: boolean;
isGenerating: boolean;
downloadProgress: number;
error: RnExecutorchError | null;
configure: (config: LLMConfig) => void;
generate: (messages: Message[], tools?: LLMTool[]) => Promise<string>;
sendMessage: (message: string) => Promise<string>;
deleteMessage: (index: number) => void;
interrupt: () => void;
getGeneratedTokenCount: () => number;
getPromptTokenCount: () => number;
getTotalTokenCount: () => number;
}
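A ResourceSource can point at a remote URL, a local file URI, or a bundled asset. A sketch of the three shapes (the URL and path are hypothetical; the type is re-declared so the snippet stands alone):

```typescript
// Re-declared here for self-containment; matches the definition above.
type ResourceSource = string | { uri: string } | number;

// Remote URL: downloaded on first load (hypothetical address).
const remote: ResourceSource = 'https://example.com/models/llama3_2_1b.pte';

// Local file URI on the device (hypothetical path).
const localFile: ResourceSource = { uri: 'file:///data/models/llama3_2_1b.pte' };

// Metro asset ID, i.e. the number returned by require('./assets/tokenizer.json').
const bundled: ResourceSource = 7;
```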