Gradio supports passing batch functions, which are functions that take in a list of inputs and return a list of predictions. Batching can significantly improve performance when your demo is handling many concurrent requests.

What are batch functions?

Batch functions take in a list of inputs and return a list of predictions, processing several requests in a single call instead of one at a time. For example, here's a batched function that takes in two lists of inputs (a list of words and a list of ints) and returns a list of trimmed words as output:
import time

def trim_words(words, lens):
    trimmed_words = []
    time.sleep(5)  # simulate a fixed per-batch processing cost
    for w, l in zip(words, lens):
        trimmed_words.append(w[:int(l)])
    return [trimmed_words]  # one inner list per output component

Why use batch functions?

The advantage of using batched functions is that if you enable queuing, the Gradio server can automatically batch incoming requests and process them in parallel, potentially speeding up your demo. In the example above, 16 requests could be processed in parallel (for a total inference time of 5 seconds), instead of each request being processed separately (for a total inference time of 80 seconds).
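The arithmetic above can be sketched directly. This is a hypothetical illustration (the names PER_BATCH_SECONDS, NUM_REQUESTS, and MAX_BATCH_SIZE are ours, not Gradio's), assuming each call to the batch function costs a fixed 5 seconds regardless of batch size:

```python
# Illustrative model: one call to the batch function takes 5 seconds,
# whether it processes 1 request or 16.
PER_BATCH_SECONDS = 5
NUM_REQUESTS = 16
MAX_BATCH_SIZE = 16

# Without batching, each request triggers its own call.
sequential_time = NUM_REQUESTS * PER_BATCH_SECONDS       # 16 * 5 = 80 seconds

# With batching, requests are grouped into batches of up to MAX_BATCH_SIZE.
num_batches = -(-NUM_REQUESTS // MAX_BATCH_SIZE)         # ceiling division: 1 batch
batched_time = num_batches * PER_BATCH_SECONDS           # 1 * 5 = 5 seconds

print(sequential_time, batched_time)  # 80 5
```

The speedup only materializes when the per-call cost is roughly independent of batch size, which is typical for GPU inference but not for purely sequential CPU work.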

Using batch functions with Interface

With the gr.Interface class, you can enable batching by setting batch=True and specifying a max_batch_size:
import gradio as gr

def trim_words(words, lens):
    trimmed_words = []
    for w, l in zip(words, lens):
        trimmed_words.append(w[:int(l)])
    return [trimmed_words]

demo = gr.Interface(
    fn=trim_words,
    inputs=["textbox", "number"],
    outputs=["textbox"],
    batch=True,
    max_batch_size=16
)

demo.launch()

Using batch functions with Blocks

With the gr.Blocks class, you can specify batch parameters in the event listener:
import gradio as gr

def trim_words(words, lens):
    trimmed_words = []
    for w, l in zip(words, lens):
        trimmed_words.append(w[:int(l)])
    return [trimmed_words]

with gr.Blocks() as demo:
    with gr.Row():
        word = gr.Textbox(label="Word")
        length = gr.Number(label="Length")
        output = gr.Textbox(label="Output")
    with gr.Row():
        run = gr.Button("Process")

    event = run.click(
        trim_words,
        [word, length],
        output,
        batch=True,
        max_batch_size=16
    )

demo.launch()

Batch processing with Hugging Face models

Many Hugging Face transformers and diffusers models work very naturally with Gradio’s batch mode. Here’s an example using a diffusion model to generate images in batches:
import torch
from diffusers import DiffusionPipeline
import gradio as gr

generator = DiffusionPipeline.from_pretrained("CompVis/ldm-text2im-large-256")
if torch.cuda.is_available():
    generator = generator.to("cuda")

def generate(prompts):
    # The pipeline accepts a list of prompts and returns one image per prompt
    images = generator(list(prompts)).images
    return [images]  # one inner list for the single image output

demo = gr.Interface(
    generate,
    "textbox",
    "image",
    batch=True,
    max_batch_size=4  # Set based on your CPU/GPU memory
)

demo.launch()

Set max_batch_size based on your available CPU/GPU memory: larger batch sizes can improve throughput but require more memory.

How batching works

When batching is enabled:
  1. Gradio collects incoming requests in a queue
  2. When a request is ready to be processed, Gradio checks if there are other pending requests
  3. If there are pending requests, Gradio batches them together (up to max_batch_size)
  4. The batch is sent to your function as lists of inputs
  5. Your function processes the entire batch and returns a list of outputs
  6. Gradio distributes the outputs back to the individual requests
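The steps above can be sketched as a simplified queue-processing loop. This is a hypothetical model for intuition, not Gradio's actual implementation (the function process_queue and the request-id bookkeeping are ours):

```python
from collections import deque

def trim_words(words, lens):
    # Same batched function as above: lists in, list of output lists out.
    return [[w[:int(l)] for w, l in zip(words, lens)]]

def process_queue(requests, batch_fn, max_batch_size):
    """Toy model of the six steps: requests is a list of (request_id, (word, length))."""
    queue = deque(requests)                              # 1. collect requests in a queue
    results = {}
    while queue:
        batch = []
        while queue and len(batch) < max_batch_size:     # 2-3. gather pending requests
            batch.append(queue.popleft())                #      up to max_batch_size
        ids = [req_id for req_id, _ in batch]
        words = [w for _, (w, _) in batch]               # 4. regroup into per-parameter lists
        lens = [l for _, (_, l) in batch]
        [outputs] = batch_fn(words, lens)                # 5. one call processes the whole batch
        for req_id, out in zip(ids, outputs):            # 6. route outputs back to requests
            results[req_id] = out
    return results

print(process_queue([(1, ("gradio", 4)), (2, ("batch", 3))], trim_words, 16))
# {1: 'grad', 2: 'bat'}
```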

Important considerations

Your function must accept lists of inputs and return lists of outputs when batching is enabled. Make sure your function signature matches this pattern.
  • The function receives each input parameter as a list
  • The function must return a list containing one output list per output component (even with a single output component, wrap the results in an outer list, as in return [trimmed_words] above)
  • The length of each output list should match the length of the input lists
  • Batching works best with functions that can process multiple inputs more efficiently than processing them one at a time
Batch processing is particularly effective for GPU-accelerated models, where the overhead of transferring data to the GPU can be amortized across multiple inputs.
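To make the return-format rule concrete, here is a hypothetical batch function with two output components (the name trim_and_count and the second "count" output are illustrative, not from the guide above): it returns one inner list per output, both the same length as the inputs.

```python
def trim_and_count(words, lens):
    # Batched function with two outputs: the trimmed words and their lengths.
    trimmed = [w[:int(l)] for w, l in zip(words, lens)]
    counts = [len(t) for t in trimmed]
    return [trimmed, counts]  # one inner list per output component

out = trim_and_count(["gradio", "batching"], [4, 5])
print(out)  # [['grad', 'batch'], [4, 5]]
```

Wired into gr.Interface, this function would pair with outputs=["textbox", "number"] and batch=True.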