gen()

Overview

The gen() function generates text from the language model at the current position in your prompt program.

Syntax

sgl.gen(name, max_tokens=128, temperature=1.0, ...)

Parameters

name

str

Variable name to store the generated text. Access with state[name].

max_tokens

int

default: 128

Maximum number of tokens to generate.

min_tokens

int

Minimum number of tokens to generate.

temperature

float

default: 1.0

Sampling temperature. Higher values (e.g., 1.5) make output more random, lower values (e.g., 0.2) make it more deterministic.

top_p

float

default: 1.0

Nucleus sampling threshold. Sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches top_p. 1.0 disables the filter.

top_k

int

default: -1

Top-k sampling. Only the top k most likely tokens are considered. -1 means disabled.

min_p

float

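default: 0.0

Minimum probability threshold for token sampling. Tokens whose probability falls below min_p times that of the most likely token are excluded. 0.0 disables the filter.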

stop

str | List[str]

Stop sequences. Generation stops as soon as any one of these strings is generated.

stop_token_ids

List[int]

Token IDs that trigger generation to stop.

stop_regex

str | List[str]

Regular expressions that trigger generation to stop when matched.

frequency_penalty

float

default: 0.0

Penalty for token frequency. Tokens are penalized in proportion to how many times they have already been generated. Positive values reduce repetition.

presence_penalty

float

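default: 0.0

Penalty for token presence. Tokens that have already been generated at least once are penalized, regardless of how often they appeared. Positive values encourage topic diversity.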

ignore_eos

bool

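default: False

Whether to ignore end-of-sequence tokens. If true, generation continues past an end-of-sequence token until max_tokens or another stop condition is reached.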

regex

str

Regular expression constraint. Generated text must match this pattern.

json_schema

str

JSON schema constraint. Generated text must be valid JSON matching this schema.

choices

List[str]

If provided, gen() behaves like select() and chooses from these options.

return_logprob

bool

Whether to return log probabilities for generated tokens.

logprob_start_len

int

Start position for computing log probabilities.

top_logprobs_num

int

Number of top log probabilities to return per token.
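
To make the interaction of top_k, top_p, and min_p concrete, here is a minimal pure-Python sketch of how such filters could narrow a token distribution before sampling. It illustrates the parameter descriptions above and is not SGLang's sampler. The function name `filter_candidates`, the toy distribution, and the choice to apply min_p relative to the most likely token are all assumptions made for this example.

```python
def filter_candidates(probs, top_k=-1, top_p=1.0, min_p=0.0):
    """Return the renormalized distribution left after top-k, top-p and min-p filtering.

    `probs` maps token -> probability. Illustrative only; not SGLang's sampler.
    """
    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # top_k: keep only the k most likely tokens (-1 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # min_p: drop tokens below min_p times the probability of the top token.
    # (Assumption: the threshold is relative to the most likely token.)
    threshold = min_p * kept[0][1]
    kept = [(token, p) for token, p in kept if p >= threshold]

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


probs = {"Paris": 0.6, "Lyon": 0.25, "Nice": 0.1, "Rome": 0.05}
print(filter_candidates(probs, top_k=3))
print(filter_candidates(probs, top_p=0.8))
print(filter_candidates(probs, min_p=0.2))
```

With the default values (top_k=-1, top_p=1.0, min_p=0.0) every token survives, which matches the "disabled" defaults listed in the parameter descriptions.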

Usage

Basic Generation

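import sglang as sgl

# Assumes an SGLang server is already running at this address; adjust as needed.
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

@sgl.function
def simple_gen(s):
    s += "The capital of France is"
    s += sgl.gen("answer", max_tokens=10)

state = simple_gen.run()
print(state["answer"])  # e.g. " Paris"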

With Stop Sequences

@sgl.function
def generate_list(s):
    s += "List three colors:\n"
    s += sgl.gen("colors", max_tokens=50, stop="\n\n")

state = generate_list.run()
print(state["colors"])
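
As a rough mental model of what a stop sequence does to the returned text, the sketch below truncates a raw completion at the earliest stop string. It is a plain-Python illustration, not SGLang code. The helper name `truncate_at_stop`, the sample completion, and the assumption that the stop string itself is dropped from the result are all hypothetical.

```python
def truncate_at_stop(text, stop):
    """Cut `text` at the earliest occurrence of any stop sequence."""
    # Accept either a single string or a list of strings, like the stop parameter.
    stops = [stop] if isinstance(stop, str) else list(stop)
    # Find where each stop sequence first appears (skip ones that never appear).
    positions = [text.find(s) for s in stops if s in text]
    # Assumption: the stop sequence itself is dropped from the result.
    return text[: min(positions)] if positions else text


raw = "1. Red\n2. Green\n3. Blue\n\nThese are primary colors."
print(truncate_at_stop(raw, "\n\n"))
```

Single newlines inside the list are untouched because only the full two-character sequence "\n\n" counts as a match.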

Temperature Control

@sgl.function
def creative_writing(s, prompt):
    s += prompt
    s += sgl.gen("story", max_tokens=200, temperature=1.5)  # More creative

@sgl.function
def factual_qa(s, question):
    s += question
    s += sgl.gen("answer", max_tokens=50, temperature=0.0)  # Greedy decoding, effectively deterministic
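
To see why a higher temperature gives more varied output, it may help to look at how temperature reshapes a probability distribution. The sketch below applies the standard temperature-scaled softmax to a made-up list of logits. The function name and the logit values are illustrative assumptions, and treating a temperature of 0 as greedy selection is a convention adopted for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability goes to the highest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.2, 1.0, 1.5):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Lower temperatures concentrate probability on the top token, while higher temperatures flatten the distribution so that less likely tokens are sampled more often.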

Constrained Generation with Regex

@sgl.function
def generate_email(s):
    s += "Generate an email address:\n"
    s += sgl.gen(
        "email",
        max_tokens=30,
        regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
    )
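
When a regex constraint is combined with a small max_tokens, the output may be cut off before the pattern completes, so a defensive check on the result can be useful. The following sketch validates a string against the same email pattern using Python's re module. The helper name `matches_constraint` and the sample strings are hypothetical.

```python
import re

# Same pattern as passed to the regex parameter in the example above.
EMAIL_PATTERN = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"


def matches_constraint(text, pattern=EMAIL_PATTERN):
    """Return True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


print(matches_constraint("alice@example.com"))
print(matches_constraint("not an email"))
```

In a prompt program, the string to check would be the stored value, for example state["email"].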

JSON Schema Constraint

import json

@sgl.function
def generate_person(s):
    schema = json.dumps({
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "email": {"type": "string"}
        },
        "required": ["name", "age"]
    })
    
    s += "Generate a person:\n"
    s += sgl.gen("person", max_tokens=100, json_schema=schema)

state = generate_person.run()
person = json.loads(state["person"])
print(person["name"], person["age"])
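
Calling json.loads() directly on the result assumes the output is complete. If generation stops early, for instance because max_tokens is reached, parsing can fail. The sketch below is a small, hand-rolled check for the flat schema used above. It is not a general JSON Schema validator, and the helper name `check_person` and the sample inputs are assumptions for illustration.

```python
import json

SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"},
    },
    "required": ["name", "age"],
}

# Maps the two JSON Schema types used here to Python types.
TYPE_MAP = {"string": str, "integer": int}


def check_person(raw, schema=SCHEMA):
    """Parse `raw` and return a list of problems (an empty list means it passed)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = [f"missing required field: {k}" for k in schema["required"] if k not in data]
    for key, spec in schema["properties"].items():
        expected = TYPE_MAP[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if key in data and (isinstance(data[key], bool) or not isinstance(data[key], expected)):
            problems.append(f"wrong type for field: {key}")
    return problems


print(check_person('{"name": "Alice", "age": 30}'))
print(check_person('{"name": "Alice"}'))
```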

Specialized Variants

gen_int()

Generates an integer value.

sgl.gen_int(name, max_tokens=10, ...)

Automatically constrains generation to match integer format (digits with optional +/- prefix). Example:

@sgl.function
def math_problem(s):
    s += "What is 25 + 17? Answer: "
    s += sgl.gen_int("result", max_tokens=5)

state = math_problem.run()
print(int(state["result"]))  # 42
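
The integer format described above (digits with an optional +/- prefix) can be expressed as a short regular expression. The sketch below uses one to parse a result defensively. The pattern, its tolerance of surrounding whitespace, and the helper name `parse_int_result` are assumptions for illustration, not the constraint SGLang applies internally.

```python
import re

# Optional sign followed by digits; surrounding whitespace is tolerated.
INT_PATTERN = re.compile(r"\s*([+-]?\d+)\s*")


def parse_int_result(text):
    """Return the integer encoded in `text`, or None if it is not in integer format."""
    match = INT_PATTERN.fullmatch(text)
    return int(match.group(1)) if match else None


print(parse_int_result("42"))
print(parse_int_result(" -7\n"))
print(parse_int_result("forty-two"))
```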

gen_string()

Generates a string value.

sgl.gen_string(name, max_tokens=50, ...)

Automatically constrains generation to match quoted string format. Example:

@sgl.function
def extract_name(s, text):
    s += f"Extract the name from: {text}\nName: "
    s += sgl.gen_string("name", max_tokens=20)

state = extract_name.run(text="Hello, I'm Alice.")
print(state["name"])  # "Alice"
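
Because generation is constrained to a quoted string format, the stored value may still include the surrounding double quotes. If so, a small helper can strip them before further use. This is a sketch under that assumption, and the helper name `unquote` is hypothetical.

```python
def unquote(text):
    """Strip one pair of surrounding double quotes, if present."""
    text = text.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


print(unquote('"Alice"'))
print(unquote("Alice"))
```

The helper leaves already-unquoted values unchanged, so it is safe to apply whether or not the quotes are present.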

Accessing Generated Content

The generated text is stored in the state object and can be accessed by name:

state = my_function.run()
generated_text = state["variable_name"]
