Use any Gradio app as an API with the gradio_client Python library
The Gradio Python client makes it easy to use any Gradio app as an API. You can call Gradio apps hosted on Hugging Face Spaces, your own servers, or anywhere else with just a few lines of Python code.
    from gradio_client import Client, handle_file

    client = Client("abidlabs/whisper")
    result = client.predict(
        audio=handle_file("audio_sample.wav")
    )
    print(result)
    # "This is a test of the whisper speech recognition model."
While you can use any public Space as an API, you may get rate limited if you make too many requests. For unlimited usage, duplicate the Space to create a private copy:
    import os
    from gradio_client import Client, handle_file

    HF_TOKEN = os.environ.get("HF_TOKEN")

    client = Client.duplicate("abidlabs/whisper", token=HF_TOKEN)
    result = client.predict(handle_file("audio_sample.wav"))
    print(result)
    # "This is a test of the whisper speech recognition model."
If you’ve previously duplicated a Space, duplicate() will attach to the existing Space instead of creating a new one.
If the original Space uses GPUs, your duplicated Space will also use GPUs and your Hugging Face account will be billed. Your Space will automatically sleep after 1 hour of inactivity. You can customize the hardware using the hardware parameter.
    from gradio_client import Client, handle_file

    client = Client("abidlabs/whisper")
    result = client.predict(
        audio=handle_file("https://audio-samples.github.io/samples/mp3/blizzard_unconditional/sample-0.mp3")
    )
    print(result)
    # "My thought I have nobody by a beauty and will as you poured..."
The .predict() method blocks until the operation completes. Use .submit() to run jobs in the background:
    from gradio_client import Client

    client = Client("abidlabs/en2fr")
    job = client.submit("Hello", api_name="/predict")  # Non-blocking

    # Do other work...

    result = job.result()  # Blocks until result is ready
    print(result)
    # "Bonjour"
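The submit-then-collect pattern can be tried without a running Space. The sketch below is an analogy only: it uses the standard library's `concurrent.futures` instead of `gradio_client`, and `slow_translate` is a hypothetical stand-in for a remote endpoint. The shape of the code is the same as above: start the work, do something else, then call `.result()` when the value is needed.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_translate(text: str) -> str:
    """Hypothetical stand-in for a remote endpoint that takes a while."""
    time.sleep(0.2)
    return {"Hello": "Bonjour"}.get(text, text)


with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns a future right away instead of waiting for the answer
    future = pool.submit(slow_translate, "Hello")

    # Other work can run here while the call is in progress
    other_work = sum(range(1000))

    # result() blocks until the background call has finished
    result = future.result()

print(result)
```

With a real client, `client.submit(...)` would take the place of `pool.submit(...)`, and `job.result()` would take the place of `future.result()`.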
The Python client automatically handles session state for you. When an endpoint uses gr.State, the state is stored internally and passed automatically in subsequent requests.
Here’s an example with a stateful word counter:
    # Server app with state (run this in its own script or process)
    import gradio as gr

    def count(word, list_of_words):
        return list_of_words.count(word), list_of_words + [word]

    with gr.Blocks() as demo:
        words = gr.State([])
        textbox = gr.Textbox()
        number = gr.Number()
        textbox.submit(count, inputs=[textbox, words], outputs=[number, words])

    demo.launch()

    # Client usage (run separately once the server is up;
    # in a plain script, demo.launch() blocks)
    from gradio_client import Client

    client = Client("http://localhost:7860")

    # State is automatically maintained between calls
    print(client.predict("hello", api_name="/count"))  # 0
    print(client.predict("hello", api_name="/count"))  # 1
    print(client.predict("world", api_name="/count"))  # 0
    print(client.predict("hello", api_name="/count"))  # 2

    # Reset state when needed
    client.reset_session()
    print(client.predict("hello", api_name="/count"))  # 0
You don’t need to manage state parameters manually - the client handles this automatically.
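To see why the calls above return 0, 1, 0, 2, it can help to thread the state by hand. The following is a minimal sketch that runs without Gradio: `SessionSketch` is a hypothetical class written for illustration, not part of `gradio_client`, and it does not model where the real client or server keeps the state. It only shows the bookkeeping that is done for you: the state returned by one call is fed into the next, and resetting the session starts over from the initial value.

```python
def count(word, list_of_words):
    """Same logic as the server function above."""
    return list_of_words.count(word), list_of_words + [word]


class SessionSketch:
    """Hypothetical illustration only; not how gradio_client is implemented."""

    def __init__(self, fn, initial_state):
        self._fn = fn
        self._initial_state = list(initial_state)
        self._state = list(initial_state)

    def predict(self, value):
        # Pass the stored state in, keep the updated state for the next call
        output, self._state = self._fn(value, self._state)
        return output

    def reset_session(self):
        # Forget everything accumulated so far
        self._state = list(self._initial_state)


session = SessionSketch(count, [])
results = [session.predict(w) for w in ["hello", "hello", "world", "hello"]]
print(results)  # [0, 1, 0, 2]

session.reset_session()
after_reset = session.predict("hello")
print(after_reset)  # 0
```

The caller never touches the word list directly, which mirrors how the client calls above pass only the visible input.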