Chat Configuration
The configure method allows you to customize how your LLM handles conversations, manages context, and generates responses.
Configuration Overview
Chat Configuration
The chatConfig object manages conversation behavior and context handling.
System Prompt
Define the model’s personality and instructions. The system prompt appears at the start of every conversation context and shapes the model’s behavior throughout. Common use cases:
- Setting personality (“You are a friendly tutor”)
- Defining expertise (“You are an expert in TypeScript”)
- Setting constraints (“Answer in one sentence”)
- Formatting output (“Respond in JSON format”)
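To make this concrete, here is a sketch of what setting a system prompt might look like. The field names (ChatConfig, systemPrompt) are assumptions for illustration; check your library’s actual types:

```typescript
// Hypothetical configuration shape; field names are assumptions
// for illustration, not the library's exact API.
interface ChatConfig {
  systemPrompt: string;
}

const chatConfig: ChatConfig = {
  // Sets personality, expertise, constraints, and output format in one place.
  systemPrompt: "You are an expert TypeScript tutor. Answer in one sentence.",
};
```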
Initial Message History
Provide conversation context at initialization by pre-populating the conversation history. Useful for:
- Resuming saved conversations
- Providing examples (few-shot prompting)
- Setting conversation tone
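For example, a few-shot history can teach the model the expected answer format before the user’s first real turn. The message shape below (role/content fields) is an assumption modeled on common chat APIs:

```typescript
// Illustrative message shape; role/content fields are an assumption
// modeled on common chat APIs.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Few-shot examples pre-populated before the user's first real turn.
const initialMessageHistory: Message[] = [
  { role: "user", content: "Translate to French: hello" },
  { role: "assistant", content: "bonjour" },
  { role: "user", content: "Translate to French: goodbye" },
  { role: "assistant", content: "au revoir" },
];
```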
Context Strategy
Manage how conversation history fits within the model’s context window. See Context Strategies for detailed information.
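To illustrate the idea, here is a minimal sketch of what a sliding-window strategy does; this is an illustration of the concept, not the library’s implementation. It keeps a leading system message and only the most recent messages that fit a budget:

```typescript
type Msg = { role: string; content: string };

// Illustrative sliding window: always preserve a leading system
// message, then keep only the most recent messages that fit within
// a total budget of maxMessages.
function slidingWindow(history: Msg[], maxMessages: number): Msg[] {
  const system = history.length > 0 && history[0].role === "system" ? [history[0]] : [];
  const rest = history.slice(system.length);
  const budget = Math.max(0, maxMessages - system.length);
  const kept = budget > 0 ? rest.slice(-budget) : [];
  return [...system, ...kept];
}
```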
Tools Configuration
Enable the model to call external functions. See Tool Calling for complete details. The tools array holds tool definitions; their format depends on your model’s chat template.
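As a sketch of how a tool definition and its execution callback might fit together (the definition format and field names here are illustrative assumptions, not the library’s exact API):

```typescript
// Illustrative shapes only; the real tool definition format depends
// on your model's chat template, and these field names are assumptions.
type ToolCall = { toolName: string; arguments: Record<string, unknown> };

const tools = [
  {
    name: "get_weather",
    description: "Get the current weather for a city",
    parameters: { city: "string" },
  },
];

// Async callback that executes a tool call and returns the result
// as a string for the model to read on its next turn.
async function runTool(call: ToolCall): Promise<string> {
  if (call.toolName === "get_weather") {
    // A real implementation would call a weather API here.
    return JSON.stringify({ city: call.arguments.city, tempC: 21 });
  }
  return `Unknown tool: ${call.toolName}`;
}
```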
An async function that receives a ToolCall and returns the result as a string.
Whether to include JSON tool call representations in the message history; if false, only the final answers are shown.

Generation Configuration
Control how the model generates text. Temperature controls randomness and creativity in generation:
- Lower values (0.1-0.5): More focused, deterministic, factual
- Medium values (0.6-0.9): Balanced creativity and coherence
- Higher values (1.0+): More creative, diverse, potentially less coherent
Nucleus sampling parameter. Only samples from tokens whose cumulative probability exceeds this value.
- 0.9: Recommended for most use cases (good balance)
- 0.95: Slightly more diverse
- 1.0: Consider all tokens (no filtering)
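The temperature and nucleus-sampling guidance above can be captured as presets. The field names (temperature, topP) follow the parameters described in this section; the preset values are illustrative:

```typescript
// Illustrative generation presets based on the guidance above.
interface GenerationConfig {
  temperature: number; // lower = more focused and deterministic
  topP: number; // nucleus sampling cutoff
}

// Focused preset for factual question answering.
const factual: GenerationConfig = { temperature: 0.3, topP: 0.9 };

// Looser preset for creative writing.
const creative: GenerationConfig = { temperature: 1.0, topP: 0.95 };
```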
Soft upper limit on tokens per batch for streaming. Higher values reduce update frequency but may improve performance.
This is a “soft” limit. In certain cases (like emoji sequences), batches may be larger to avoid breaking combined characters.
Maximum time (in milliseconds) between token batch emissions. Works with outputTokenBatchSize.

Complete Configuration Example
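Pulling the pieces together, a full configuration might look like the sketch below. Every field name here is assembled from the sections above as an assumption; verify against your library’s actual types before use:

```typescript
// All field names below are illustrative, assembled from the
// sections above; check your library's actual types.
const config = {
  chatConfig: {
    systemPrompt: "You are a helpful assistant. Answer concisely.",
    initialMessageHistory: [
      { role: "user", content: "Hi!" },
      { role: "assistant", content: "Hello! How can I help?" },
    ],
  },
  generationConfig: {
    temperature: 0.7, // balanced creativity and coherence
    topP: 0.9, // recommended nucleus sampling value
    outputTokenBatchSize: 4, // batch tokens to reduce UI update frequency
  },
};

// In a real app this object would be passed to the configure method.
```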
Dynamic Reconfiguration
You can call configure multiple times to adjust behavior:
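For instance, you might switch generation settings mid-session. The stand-in object below is illustrative only; note that this sketch merges new options over old ones, while whether the real configure merges or replaces prior settings depends on the library:

```typescript
// Minimal stand-in object with a configure method; illustrative only.
// This sketch merges new options over previous ones.
const llm = {
  settings: {} as Record<string, unknown>,
  configure(options: Record<string, unknown>): void {
    this.settings = { ...this.settings, ...options };
  },
};

llm.configure({ temperature: 0.3 }); // start in a focused mode
llm.configure({ temperature: 1.0 }); // later, switch to creative output
```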
Type Definitions
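The shapes below are an illustrative reconstruction from the descriptions on this page, not the library’s actual declarations; consult its exported types for the authoritative versions:

```typescript
// Illustrative reconstructions of the configuration types described
// above; not the library's actual declarations.
type MessageRole = "system" | "user" | "assistant";

interface Message {
  role: MessageRole;
  content: string;
}

interface ChatConfig {
  systemPrompt?: string;
  initialMessageHistory?: Message[];
}

interface GenerationConfig {
  temperature?: number;
  topP?: number;
  outputTokenBatchSize?: number;
}
```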
Best Practices
- Always set a system prompt - It helps guide the model’s behavior consistently
- Choose appropriate temperature - Lower for factual tasks, higher for creative tasks
- Use context strategies - Prevent context overflow errors with SlidingWindowContextStrategy
- Configure on mount - Set up configuration in a useEffect after isReady becomes true
- Batch token updates - Use outputTokenBatchSize > 1 for better performance in production
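The “configure on mount” practice boils down to running configuration exactly once, after the model reports readiness. The sketch below captures that guard in plain TypeScript; in React the same logic would live in a useEffect keyed on isReady:

```typescript
// Plain-TypeScript sketch of the "configure on mount" pattern: run
// configuration exactly once, after the model reports it is ready.
// (In React, this guard would live in a useEffect with [isReady] deps.)
let configured = false;

function onRender(isReady: boolean, configure: () => void): void {
  if (isReady && !configured) {
    configured = true;
    configure();
  }
}
```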