
Endpoint

POST /v1/images/generations

Prerequisites

Start h2oGPT with image generation enabled and at least one image model pre-loaded:
python generate.py \
  --enable_image=True \
  --pre_load_image_audio_models=True \
  "--visible_image_models=['sdxl_turbo']"
Multiple models can be passed as a list:
"--visible_image_models=['sdxl_turbo', 'sdxl', 'flux']"

Request parameters

model
string
default:""
The image generation model to use. When empty or omitted, h2oGPT uses the first available loaded model. See supported models below.
prompt
string
required
Text description of the image to generate.
n
number
default:"1"
Number of images to generate.
size
string
default:"1024x1024"
Dimensions of the generated image in "WxH" format. Accepted values depend on the model; "1024x1024" works for most.
quality
string
default:"standard"
Quality preset. Accepted value: "standard".
response_format
string
default:"b64_json"
How to return the image data. "b64_json" returns the image bytes as a base64-encoded string; "url" returns a data URI.
guidance_scale
number
Classifier-free guidance scale. Controls how closely the output follows the prompt. Defaults are model-dependent.
num_inference_steps
number
Number of denoising steps. Higher values increase quality at the cost of latency. Defaults are model-dependent.
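The parameters above map directly onto the JSON request body. A minimal sketch of a raw HTTP call using only the Python standard library (the base URL and API key match the example below; the guidance_scale and num_inference_steps values are illustrative, not defaults):

```python
import json
import urllib.request

# Request body for POST /v1/images/generations.
# guidance_scale and num_inference_steps are optional, model-dependent
# knobs; the values here are illustrative assumptions.
payload = {
    "model": "sdxl_turbo",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024",
    "response_format": "b64_json",
    "guidance_scale": 3.0,
    "num_inference_steps": 4,
}

def generate(base_url="http://localhost:5000/v1", api_key="EMPTY"):
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

This is equivalent to the OpenAI-client example below; the raw form is useful when you need to pass model-specific fields the client does not expose as keyword arguments.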

Response

created
number
Unix timestamp of when the images were generated.
data
object[]
Array of generated image objects.
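For orientation, an illustrative sketch of the response shape (field values are placeholders, not real output):

```python
# Illustrative only: the shape of a successful response.
example_response = {
    "created": 1700000000,                      # Unix timestamp (number)
    "data": [
        {"b64_json": "<base64-encoded image>"}  # one object per generated image
    ],
}
```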

Examples

from openai import OpenAI
import base64
import io
from PIL import Image

client = OpenAI(
    base_url="http://localhost:5000/v1",
    api_key="EMPTY",
)

response = client.images.generate(
    model="sdxl_turbo",  # leave empty to use the first loaded model
    prompt="A cute baby sea otter",
    n=1,
    size="1024x1024",
    response_format="b64_json",
)

image_data = base64.b64decode(response.data[0].b64_json.encode("utf-8"))
image = Image.open(io.BytesIO(image_data))
image.save("output_image.png")
image.show()
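When n > 1, response.data contains one object per generated image. A small helper to decode and save each entry (the function name and file prefix are our own choices; it accepts either attribute-style objects from the OpenAI client or plain dicts):

```python
import base64

def save_images(data, prefix="output"):
    """Decode each b64_json entry in `data` and write it to <prefix>_<i>.png.

    `data` is the response's data list: objects with a .b64_json
    attribute (OpenAI client) or dicts with a "b64_json" key.
    Returns the list of written file paths.
    """
    paths = []
    for i, item in enumerate(data):
        b64 = item["b64_json"] if isinstance(item, dict) else item.b64_json
        raw = base64.b64decode(b64)
        path = f"{prefix}_{i}.png"
        with open(path, "wb") as f:
            f.write(raw)
        paths.append(path)
    return paths
```

For example, `save_images(response.data, prefix="otter")` would write `otter_0.png`, `otter_1.png`, and so on.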

Supported models

The following model identifiers are recognized. Pass the identifier as the model parameter or use it in --visible_image_models at startup.
Model        Notes
sdxl_turbo   Fast single-step generation. Good for rapid iteration.
sdxl         Full SDXL pipeline. Higher quality, slower than sdxl_turbo.
SD3          Stable Diffusion 3. Requires a HuggingFace access token.
playv2       PlaygroundAI v2.
flux         Flux diffusion model.
If you do not know which model is loaded, omit the model field or set it to an empty string. h2oGPT will select the first available image model.

Decoding the response

When using response_format="b64_json":
import base64, io
from PIL import Image

b64_string = response.data[0].b64_json
image_bytes = base64.b64decode(b64_string.encode("utf-8"))
image = Image.open(io.BytesIO(image_bytes))
image.save("result.png")
When using response_format="url", the value is a data:image/jpg;base64,... URI, not a fetchable URL. It can be set directly as an <img> src attribute, or decoded in code by stripping everything up to and including the first comma and base64-decoding the remainder.
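A small helper to split such a data URI back into its MIME type and raw image bytes (the function name is ours, not part of the API):

```python
import base64

def data_uri_to_bytes(uri):
    """Split a data:<mime>;base64,<payload> URI into (mime, raw_bytes)."""
    header, b64_payload = uri.split(",", 1)
    mime = header[len("data:"):].split(";", 1)[0]
    return mime, base64.b64decode(b64_payload)
```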
