## Overview
The basic example shows how to:

- Create a model and tokenizer
- Set up a generator with custom parameters
- Process user input and generate responses
- Stream output tokens in real-time
## Prerequisites
Before running the examples, you need to:

- Install ONNX Runtime GenAI headers and libraries
- Download a compatible model
- Set up your build environment with CMake
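A model already converted for GenAI can be fetched from the Hugging Face Hub; the repository and subfolder below are one concrete option (an int4 CPU build of Phi-3 mini), not a requirement:

```shell
# Download an ONNX model prepared for GenAI (repo and subfolder are examples)
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx \
  --include "cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/*" \
  --local-dir models/phi-3
```

Any model folder containing a `genai_config.json` produced by the GenAI model builder should work the same way.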
## Simple Question-Answering Example
This example demonstrates streaming text generation with the C++ API.

### Key Components
#### Model Initialization

The example starts by creating the core components: the model, the tokenizer, and a tokenizer stream for incremental decoding.
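These components come from the `ort_genai.h` C++ header; a minimal sketch (the model path is a placeholder, and exact call signatures may vary between GenAI releases):

```cpp
#include <ort_genai.h>

// Load the model from a folder containing the GenAI config (path is illustrative)
auto model = OgaModel::Create("models/phi-3");

// The tokenizer converts text to token IDs; the stream decodes tokens one at a time
auto tokenizer = OgaTokenizer::Create(*model);
auto tokenizer_stream = OgaTokenizerStream::Create(*tokenizer);
```

The `Create` factories return smart pointers, so no explicit cleanup is needed.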
#### Generator Setup

Set up the generator with custom search parameters.
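A sketch of the setup, assuming the prompt has already been assembled into a `std::string prompt` (the `max_length` value is an example, not a required setting):

```cpp
// Configure search parameters before creating the generator
auto params = OgaGeneratorParams::Create(*model);
params->SetSearchOption("max_length", 1024);

auto generator = OgaGenerator::Create(*model, *params);

// Encode the combined system/user prompt and hand it to the generator
auto sequences = OgaSequences::Create();
tokenizer->Encode(prompt.c_str(), *sequences);
generator->AppendTokenSequences(*sequences);
```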
#### Streaming Output

The generation loop streams tokens as they are generated.
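The loop asks the generator for one token at a time and decodes only the newest token, so text appears as it is produced; a sketch under the same API assumptions as above:

```cpp
// Generate until an end-of-sequence token or max_length is reached
while (!generator->IsDone()) {
  generator->GenerateNextToken();

  // Decode just the newest token in sequence 0 and flush it to the console
  const size_t count = generator->GetSequenceCount(0);
  const auto token = generator->GetSequenceData(0)[count - 1];
  std::cout << tokenizer_stream->Decode(token) << std::flush;
}
std::cout << std::endl;
```

Decoding through the tokenizer stream, rather than re-decoding the whole sequence each iteration, is what makes the output incremental.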
### Building the Example

Use CMake to build the example.
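A typical out-of-source build, assuming the example ships a `CMakeLists.txt` and the GenAI headers and libraries are discoverable by CMake:

```shell
cmake -S . -B build
cmake --build build --config Release
```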
### Running the Example

Run the compiled example with your model.
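An invocation might look like this (the binary name and model path are illustrative):

```shell
./build/model_qa -m models/phi-3 -e cpu
```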
#### Command-Line Options

- `-m, --model_path`: Path to the model folder containing the GenAI config
- `-e, --execution_provider`: Execution provider (`cpu`, `cuda`, `dml`, etc.)
- `-s, --system_prompt`: System prompt for the model (default: "You are a helpful AI assistant.")
- `-u, --user_prompt`: User prompt (default: "What color is the sky?")
- `-v, --verbose`: Enable verbose logging
- `--interactive`: Run in interactive mode for multi-turn conversations
## Example Output
## Next Steps
- Explore advanced features like multi-turn chat and custom configurations
- Learn about execution providers
- Understand model configuration