Python client

The Gradio Python client makes it easy to use any Gradio app as an API. You can call Gradio apps hosted on Hugging Face Spaces, your own servers, or anywhere else with just a few lines of Python code.

Installation

The lightweight gradio_client package can be installed from pip and works with Python 3.10 or higher:

pip install --upgrade gradio_client

If you already have a recent version of gradio, then gradio_client is included as a dependency.

Quick start

Here’s a simple example using the Whisper transcription Space:

from gradio_client import Client, handle_file

client = Client("abidlabs/whisper")

result = client.predict(
    audio=handle_file("audio_sample.wav")
)

print(result)
# "This is a test of the whisper speech recognition model."

Connecting to apps

Connect to Hugging Face Spaces

Connect to a Gradio app by passing the Space name to the Client constructor:

from gradio_client import Client

client = Client("abidlabs/en2fr")  # English to French translator

Connect to private Spaces

For private Spaces, pass your Hugging Face token:

from gradio_client import Client

client = Client("abidlabs/my-private-space", token="hf_...")

You can get your HF token at https://huggingface.co/settings/tokens.

Connect to custom URLs

If your app is running on your own server, provide the full URL:

from gradio_client import Client

client = Client("https://bec81a83-5b5c-471e.gradio.live")

Connect with authentication

If the app requires username and password authentication:

from gradio_client import Client

client = Client(
    "username/space-name",
    auth=("username", "password")
)

Duplicate a Space for unlimited usage

While you can use any public Space as an API, you may get rate limited if you make too many requests. For unlimited usage, duplicate the Space to create a private copy:

import os
from gradio_client import Client, handle_file

HF_TOKEN = os.environ.get("HF_TOKEN")

client = Client.duplicate("abidlabs/whisper", token=HF_TOKEN)
result = client.predict(handle_file("audio_sample.wav"))

print(result)
# "This is a test of the whisper speech recognition model."

If you’ve previously duplicated a Space, duplicate() will attach to the existing Space instead of creating a new one.

If the original Space uses GPUs, your duplicated Space will also use GPUs and your Hugging Face account will be billed. Your Space will automatically sleep after 1 hour of inactivity. You can customize the hardware using the hardware parameter.
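If you stay on a public Space instead of duplicating it, occasional rate-limit failures can be retried on the client side. The following is a minimal sketch, not part of gradio_client: predict_with_retry is a hypothetical helper, and because the source does not say which exception a rate-limited request raises, it assumes any exception is worth retrying.

```python
import time


def predict_with_retry(client, *args, retries=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call client.predict(), retrying with exponential backoff.

    Assumption: any exception is treated as retryable. Narrow the
    except clause once you know what your Space actually raises.
    """
    for attempt in range(retries + 1):
        try:
            return client.predict(*args, **kwargs)
        except Exception:
            if attempt == retries:
                raise  # Out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # Waits 1s, 2s, 4s, ...
```

With a real client this might be called as predict_with_retry(client, "Hello", api_name="/predict"). The sleep parameter exists only so the delay can be swapped out in tests.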

Inspect API endpoints

Use view_api() to see available endpoints and their parameters:

from gradio_client import Client

client = Client("abidlabs/whisper")
client.view_api()

Client.predict() Usage Info
---------------------------
Named API endpoints: 1

 - predict(audio, api_name="/predict") -> output
    Parameters:
     - [Audio] audio: filepath (required)  
    Returns:
     - [Textbox] output: str

Alternatively, click the “Use via API” link in the footer of any Gradio app to view the API page in your browser.
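The same information can be inspected programmatically: view_api() accepts return_format="dict" to return the description instead of printing it. The sketch below assumes the returned dictionary has a "named_endpoints" key that maps each endpoint name to an entry with a "parameters" list. summarize_endpoints is a hypothetical helper, and the sample dictionary is hand-written for illustration.

```python
def summarize_endpoints(api_info):
    """Return {endpoint name: number of parameters} for named endpoints."""
    named = api_info.get("named_endpoints", {})
    return {name: len(spec.get("parameters", [])) for name, spec in named.items()}


# Hand-written sample in the assumed shape. With a live client you might use:
#   api_info = client.view_api(print_info=False, return_format="dict")
sample = {
    "named_endpoints": {
        "/predict": {
            "parameters": [{"label": "audio"}],
            "returns": [{"label": "output"}],
        }
    },
    "unnamed_endpoints": {},
}

print(summarize_endpoints(sample))
# {'/predict': 1}
```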

Make predictions

Basic prediction

Call .predict() with the appropriate arguments:

from gradio_client import Client

client = Client("abidlabs/en2fr")
result = client.predict("Hello", api_name="/predict")

print(result)
# "Bonjour"

Multiple parameters

For endpoints with multiple parameters, use keyword arguments:

from gradio_client import Client

client = Client("gradio/calculator")
result = client.predict(num1=4, operation="add", num2=5)

print(result)
# 9.0

File inputs

For file or URL inputs, use handle_file():

from gradio_client import Client, handle_file

client = Client("abidlabs/whisper")
result = client.predict(
    audio=handle_file("https://audio-samples.github.io/samples/mp3/blizzard_unconditional/sample-0.mp3")
)

print(result)
# "My thought I have nobody by a beauty and will as you poured..."

Async operations

Submit jobs asynchronously

The .predict() method blocks until the operation completes. Use .submit() to run jobs in the background:

from gradio_client import Client

client = Client("abidlabs/en2fr")
job = client.submit("Hello", api_name="/predict")  # Non-blocking

# Do other work...

result = job.result()  # Blocks until result is ready
print(result)
# "Bonjour"
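
Because .submit() returns immediately, several inputs can be queued before waiting on any of them. The helper below is a sketch under that assumption: submit_all is a hypothetical name, and it relies only on .submit() and job.result(), which accepts an optional timeout in seconds.

```python
def submit_all(client, inputs, api_name="/predict", timeout=None):
    """Submit every input first, then collect results in input order."""
    jobs = [client.submit(value, api_name=api_name) for value in inputs]
    # Each result() call blocks, but all jobs are already queued by now.
    return [job.result(timeout=timeout) for job in jobs]
```

With the translation Space above, submit_all(client, ["Hello", "Goodbye"]) would return the translations in the same order as the inputs.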

Add callbacks

Execute functions when jobs complete:

from gradio_client import Client

def print_result(result):
    print(f"Translation: {result}")

client = Client("abidlabs/en2fr")
job = client.submit(
    "Hello",
    api_name="/predict",
    result_callbacks=[print_result]
)

# Do other work...
# "Translation: Bonjour" (printed when job completes)
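
A callback can also store results instead of printing them. The sketch below builds such a callback. make_collector is a hypothetical helper, and it makes two assumptions: callbacks may run on a background thread (hence the lock), and an endpoint with several outputs may pass them as separate arguments (hence *values).

```python
import threading


def make_collector():
    """Return (callback, results); the callback appends each result to the list."""
    results = []
    lock = threading.Lock()

    def on_result(*values):
        # A single output is stored as-is; several outputs are stored as a tuple.
        with lock:
            results.append(values[0] if len(values) == 1 else values)

    return on_result, results


on_result, results = make_collector()
on_result("Bonjour")  # Simulates the client invoking the callback
print(results)
# ['Bonjour']
```

In real use, on_result would be passed as result_callbacks=[on_result] to .submit().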

Check job status

Monitor job progress with .status():

from gradio_client import Client

client = Client("gradio/calculator")
job = client.submit(5, "add", 4, api_name="/predict")

status = job.status()  # Returns a StatusUpdate object
print(status.code)
# e.g. Status.STARTING

# Check if job is complete
if job.done():
    print(job.result())

The StatusUpdate object includes:

code: Status code (e.g., STARTING, PROCESSING, FINISHED)
rank: Position in queue
queue_size: Total queue size
eta: Estimated completion time
success: Whether job completed successfully
time: When status was generated
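
The two calls above, .done() and .status(), can be combined into a polling loop with a timeout. This is a minimal sketch: wait_for_job and its parameters are hypothetical names, and it uses only .done(), .status(), and .result() on the job.

```python
import time


def wait_for_job(job, timeout=30.0, poll_interval=0.1, on_status=None):
    """Poll a job until it finishes, then return its result.

    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while not job.done():
        if on_status is not None:
            on_status(job.status())  # e.g. log the queue position or ETA
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not finished after {timeout} seconds")
        time.sleep(poll_interval)
    return job.result()
```

For example, wait_for_job(job, on_status=lambda s: print(s.code, s.rank)) would print the status code and queue position on each poll.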

Cancel jobs

Cancel queued jobs that haven’t started:

from gradio_client import Client, handle_file

client = Client("abidlabs/whisper")
job1 = client.submit(handle_file("audio_sample1.wav"))
job2 = client.submit(handle_file("audio_sample2.wav"))

job1.cancel()  # Returns False (job already started)
job2.cancel()  # Returns True (job cancelled successfully)
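
Since .cancel() reports success through its return value, a batch of jobs can be sorted into those that were cancelled and those that had already started. The helper below is a sketch: cancel_pending is a hypothetical name, and it relies only on the True/False return value described above.

```python
def cancel_pending(jobs):
    """Try to cancel every job; return (cancelled, not_cancelled) lists."""
    cancelled, not_cancelled = [], []
    for job in jobs:
        if job.cancel():  # True only if the job had not started yet
            cancelled.append(job)
        else:
            not_cancelled.append(job)
    return cancelled, not_cancelled
```

With the two jobs above, cancel_pending([job1, job2]) would return ([job2], [job1]).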

Generator endpoints

Some endpoints return multiple values over time. Access all outputs with .outputs():

import time
from gradio_client import Client

client = Client("gradio/count_generator")
job = client.submit(3, api_name="/count")

while not job.done():
    time.sleep(0.1)

print(job.outputs())
# ['0', '1', '2']

Iterate over results

Use the job as an iterator to process results as they arrive:

from gradio_client import Client

client = Client("gradio/count_generator")
job = client.submit(3, api_name="/count")

for output in job:
    print(output)
# 0
# 1
# 2

Cancel iterative jobs

Cancel generator endpoints mid-stream:

import time
from gradio_client import Client

client = Client("abidlabs/test-yield")
job = client.submit("abcdef")

time.sleep(3)
job.cancel()  # Cancels after current iteration completes
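
Iteration and cancellation can be combined to stop a generator endpoint as soon as a condition is met, instead of after a fixed delay. This is a sketch under the assumption that a job can be iterated and cancelled as shown above. take_until is a hypothetical helper, not part of gradio_client.

```python
def take_until(job, stop_when):
    """Collect outputs from an iterable job until stop_when(output) is true.

    The job is cancelled only if iteration stopped early.
    """
    collected = []
    stopped_early = False
    for output in job:
        collected.append(output)
        if stop_when(output):
            stopped_early = True
            break
    if stopped_early:
        job.cancel()
    return collected
```

For the counting Space above, take_until(job, lambda out: out == "1") would collect the outputs up to and including "1" and then cancel the job.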

Session state

The Python client automatically handles session state for you. When an endpoint uses gr.State, the state is stored internally and passed automatically in subsequent requests. Here’s an example with a stateful word counter:

# Server app with state (run in its own process, e.g. as app.py)
import gradio as gr

def count(word, list_of_words):
    return list_of_words.count(word), list_of_words + [word]

with gr.Blocks() as demo:
    words = gr.State([])
    textbox = gr.Textbox()
    number = gr.Number()
    textbox.submit(count, inputs=[textbox, words], outputs=[number, words])

demo.launch()  # Blocks while serving, so the client code must run separately

# Client usage (run in a second process while the server is up)
from gradio_client import Client

client = Client("http://localhost:7860")

# State is automatically maintained between calls
print(client.predict("hello", api_name="/count"))  # 0
print(client.predict("hello", api_name="/count"))  # 1
print(client.predict("world", api_name="/count"))  # 0
print(client.predict("hello", api_name="/count"))  # 2

# Reset state when needed
client.reset_session()
print(client.predict("hello", api_name="/count"))  # 0

You don’t need to pass state parameters yourself; the client tracks them between calls.
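
To see where the outputs 0, 1, 0, 2 come from, the same count function can be exercised without Gradio by threading the state by hand. This is only a conceptual model: FakeSession is a hypothetical stand-in and does not reflect how gradio_client stores state internally.

```python
def count(word, list_of_words):
    """Same function as the server app: prior occurrences plus the updated list."""
    return list_of_words.count(word), list_of_words + [word]


class FakeSession:
    """Keeps the state value between calls, as the client does for gr.State."""

    def __init__(self):
        self.state = []

    def predict(self, word):
        result, self.state = count(word, self.state)
        return result

    def reset_session(self):
        self.state = []


session = FakeSession()
print([session.predict(w) for w in ["hello", "hello", "world", "hello"]])
# [0, 1, 0, 2]

session.reset_session()
print(session.predict("hello"))
# 0
```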

Next steps

JavaScript client

LLM agents
