
Endpoint

POST /v1/images/generations

Prerequisites

Start h2oGPT with image generation enabled and at least one image model pre-loaded:
python generate.py \
  --enable_image=True \
  --pre_load_image_audio_models=True \
  "--visible_image_models=['sdxl_turbo']"
Multiple models can be passed as a list:
"--visible_image_models=['sdxl_turbo', 'sdxl', 'flux']"

Request parameters

model
string
default:""
The image generation model to use. When empty or omitted, h2oGPT uses the first available loaded model. See supported models below.
prompt
string
required
Text description of the image to generate.
n
number
default:"1"
Number of images to generate.
size
string
default:"1024x1024"
Dimensions of the generated image in "WxH" format. Accepted values depend on the model; "1024x1024" works for most.
quality
string
default:"standard"
Quality preset. Accepted value: "standard".
response_format
string
default:"b64_json"
How to return the image data. "b64_json" returns the image bytes as a base64-encoded string; "url" returns a data URI.
guidance_scale
number
Classifier-free guidance scale. Controls how closely the output follows the prompt. Defaults are model-dependent.
num_inference_steps
number
Number of denoising steps. Higher values increase quality at the cost of latency. Defaults are model-dependent.
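The parameters above map directly onto the JSON request body. A minimal sketch of a raw HTTP call using only the Python standard library (the base URL and API key match the example below; the guidance_scale and num_inference_steps values are illustrative, not defaults):

```python
import json
import urllib.request

# Request body for POST /v1/images/generations.
# guidance_scale and num_inference_steps are optional, model-dependent
# knobs; the values here are illustrative assumptions.
payload = {
    "model": "sdxl_turbo",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024",
    "response_format": "b64_json",
    "guidance_scale": 3.0,
    "num_inference_steps": 4,
}

def generate(base_url="http://localhost:5000/v1", api_key="EMPTY"):
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

This is equivalent to the OpenAI-client example below; the raw form is useful when you need to pass model-specific fields the client does not expose as keyword arguments.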

Response

created
number
Unix timestamp of when the images were generated.
data
object[]
Array of generated image objects.
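For orientation, an illustrative sketch of the response shape (field values are placeholders, not real output):

```python
# Illustrative only: the shape of a successful response.
example_response = {
    "created": 1700000000,                      # Unix timestamp (number)
    "data": [
        {"b64_json": "<base64-encoded image>"}  # one object per generated image
    ],
}
```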

Examples

from openai import OpenAI
import base64
import io
from PIL import Image

client = OpenAI(
    base_url="http://localhost:5000/v1",
    api_key="EMPTY",
)

response = client.images.generate(
    model="sdxl_turbo",  # leave empty to use the first loaded model
    prompt="A cute baby sea otter",
    n=1,
    size="1024x1024",
    response_format="b64_json",
)

image_data = base64.b64decode(response.data[0].b64_json.encode("utf-8"))
image = Image.open(io.BytesIO(image_data))
image.save("output_image.png")
image.show()
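When n > 1, response.data contains one object per generated image. A small helper to decode and save each entry (the function name and file prefix are our own choices; it accepts either attribute-style objects from the OpenAI client or plain dicts):

```python
import base64

def save_images(data, prefix="output"):
    """Decode each b64_json entry in `data` and write it to <prefix>_<i>.png.

    `data` is the response's data list: objects with a .b64_json
    attribute (OpenAI client) or dicts with a "b64_json" key.
    Returns the list of written file paths.
    """
    paths = []
    for i, item in enumerate(data):
        b64 = item["b64_json"] if isinstance(item, dict) else item.b64_json
        raw = base64.b64decode(b64)
        path = f"{prefix}_{i}.png"
        with open(path, "wb") as f:
            f.write(raw)
        paths.append(path)
    return paths
```

For example, `save_images(response.data, prefix="otter")` would write `otter_0.png`, `otter_1.png`, and so on.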

Supported models

The following model identifiers are recognized. Pass the identifier as the model parameter or use it in --visible_image_models at startup.
Model        Notes
sdxl_turbo   Fast single-step generation. Good for rapid iteration.
sdxl         Full SDXL pipeline. Higher quality, slower than sdxl_turbo.
SD3          Stable Diffusion 3. Requires a HuggingFace access token.
playv2       PlaygroundAI v2.
flux         Flux diffusion model.
If you do not know which model is loaded, omit the model field or set it to an empty string. h2oGPT will select the first available image model.

Decoding the response

When using response_format="b64_json":
import base64, io
from PIL import Image

b64_string = response.data[0].b64_json
image_bytes = base64.b64decode(b64_string.encode("utf-8"))
image = Image.open(io.BytesIO(image_bytes))
image.save("result.png")
When using response_format="url", the value is a data:image/jpg;base64,... URI, not a fetchable URL. It can be set directly as an <img> src attribute, or decoded in code by stripping everything up to and including the first comma and base64-decoding the remainder.
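A small helper to split such a data URI back into its MIME type and raw image bytes (the function name is ours, not part of the API):

```python
import base64

def data_uri_to_bytes(uri):
    """Split a data:<mime>;base64,<payload> URI into (mime, raw_bytes)."""
    header, b64_payload = uri.split(",", 1)
    mime = header[len("data:"):].split(";", 1)[0]
    return mime, base64.b64decode(b64_payload)
```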
