Endpoint
The /generate endpoint provides a simplified text generation interface for quick testing and development. For production use, prefer the OpenAI-compatible /v1/chat/completions endpoint.
Request Body
The input text prompt to generate from.
The maximum number of tokens to generate.
Whether to ignore the end-of-sequence token and continue generation.
Response Format
The endpoint returns a streaming response using Server-Sent Events (SSE). Each event carries an incremental chunk of generated text, and the stream ends with a data: [DONE] message.
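As a rough illustration of consuming such a stream, the sketch below pulls the payload out of each data: line and stops at the [DONE] sentinel. The helper name iter_sse_data and the sample lines are hypothetical. The payload is treated as an opaque string because the exact event body format is not specified here, and multi-line SSE events are not merged.

```python
from typing import Iterable, Iterator

DONE_SENTINEL = "[DONE]"  # end-of-stream marker described above


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each `data:` line until the [DONE] sentinel.

    Simplified sketch: each `data:` line is treated as its own event.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):]
        if payload.startswith(" "):
            payload = payload[1:]  # SSE drops a single leading space
        if payload == DONE_SENTINEL:
            return  # stream finished; ignore anything after the sentinel
        yield payload


# Hypothetical sample stream, for illustration only.
sample = ["data: Hello\n", "\n", "data:  world\n", "\n", "data: [DONE]\n"]
text = "".join(iter_sse_data(sample))
print(text)
```

Only one leading space is stripped after the colon, so whitespace that belongs to the generated text is preserved.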
Example
Basic Generation
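A minimal client sketch, using only the Python standard library, might look like the following. Several details are assumptions and not confirmed by this page: the JSON field names prompt, max_tokens, and ignore_eos are inferred from the parameter descriptions above, the HTTP method is assumed to be POST, and the base URL in the usage comment is a placeholder. Adjust them to match your server.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, max_tokens=64, ignore_eos=False):
    """Build a POST request for /generate (field names are assumed, see above)."""
    body = json.dumps(
        {"prompt": prompt, "max_tokens": max_tokens, "ignore_eos": ignore_eos}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )


def stream_generate(base_url, prompt, **kwargs):
    """Send the request and print each streamed chunk until data: [DONE]."""
    req = build_generate_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # the response object iterates line by line
            line = raw.decode("utf-8").rstrip("\r\n")
            if not line.startswith("data:"):
                continue  # skip blank separator lines
            payload = line[len("data:"):]
            if payload.startswith(" "):
                payload = payload[1:]
            if payload == "[DONE]":
                break
            print(payload, end="", flush=True)
    print()


# Usage (placeholder address; requires a running server):
# stream_generate("http://localhost:8000", "Hello, my name is", max_tokens=32)
```

Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.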
Notes
The /generate endpoint is primarily for testing and debugging. For production applications, use the /v1/chat/completions endpoint, which provides more features and OpenAI API compatibility.