Deploy Gradio Apps on Hugging Face Spaces
In our previous tutorial, Vision-Language Model: PaliGemma for Image Description Generator and More, we explored interactive applications of PaliGemma using Gradio.
While running a Gradio app in Google Colab is straightforward, it requires re-running all the cells each time we use it — a time-consuming process. Deploying the app on Hugging Face Spaces offers a more convenient alternative. It provides an always-available, interactive platform for anyone to access and use the model effortlessly.
In this tutorial, we will walk through the steps to deploy Gradio applications on Hugging Face Spaces, making them accessible and user-friendly.
This lesson is the 3rd of a 4-part series on Vision-Language Models:
- Fine Tune PaliGemma with QLoRA for Visual Question Answering
- Vision-Language Model: PaliGemma for Image Description Generator and More
- Deploy Gradio Apps on Hugging Face Spaces (this tutorial)
- Object Detection with PaliGemma
To learn how to deploy Gradio apps on Hugging Face Spaces, just keep reading.
Looking for the source code to this post?
Jump Right To The Downloads Section
Need Help Configuring Your Development Environment?
All that said, are you:
- Short on time?
- Learning on your employer’s administratively locked system?
- Wanting to skip the hassle of fighting with the command line, package managers, and virtual environments?
- Ready to run the code immediately on your Windows, macOS, or Linux system?
Then join PyImageSearch University today!
Gain access to Jupyter Notebooks for this tutorial and other PyImageSearch guides pre-configured to run on Google Colab’s ecosystem right in your web browser! No installation required.
And best of all, these Jupyter Notebooks will run on Windows, macOS, and Linux!
What Is Hugging Face Spaces?
Hugging Face Spaces is a platform for deploying and sharing machine learning (ML) applications with the community. It offers an interactive interface, enabling users to explore ML models directly in their browser without the need for local setup.
Spaces supports two primary SDKs (software development kits), Gradio and Streamlit, for building interactive ML demo apps in Python. Additionally, it allows hosting custom environments using arbitrary Dockerfiles and creating static Spaces with JavaScript and HTML.
This tutorial focuses on using Gradio to deploy the PaliGemma model on Hugging Face Spaces. Gradio simplifies the creation of interactive web-based interfaces for ML models, enabling users to upload inputs, view predictions, and interact with your model in real-time.
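To make the pattern concrete before we deploy anything, here is a minimal, self-contained Gradio sketch (an illustration of the fn/inputs/outputs pattern, not part of the PaliGemma app itself):

```python
# A minimal illustrative sketch: any Python function can be wrapped
# in an interactive web UI with a few lines of Gradio code.
import gradio as gr

def greet(name):
    return f"Hello, {name}!"

# Map the function's argument to a textbox input and its return value
# to a textbox output.
demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch()
```

The PaliGemma app we deploy later in this tutorial follows exactly this structure, just with an image input and a model-backed inference function.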
Setup
First, we navigate to Hugging Face and click on “Spaces” in the top header.
This takes us to the Hugging Face Spaces page. Next, we click on “Create new Space” to set up a new space.
On the Create a new Space page, we fill out a few essential fields:
- Owner: Select either the username or the organization under which we want the new space to be created.
- Space name: Enter a name for the space (e.g., test_space).
- Short description: Provide a brief description of the space. Here, we’ll use test_description to summarize its purpose.
Then, we’ll choose a License for the space, which defines usage permissions. We can select from options (e.g., MIT or Apache-2.0) based on our project’s needs. Here, we’ll go with the MIT license.
Next, we choose the Space SDK (app template) to set the type of app we’ll create. Available options include Gradio, Streamlit, and Static. Since we’ll be using a Gradio interface, we’ll select Gradio as the template.
In the Space Hardware section, we select the hardware environment for our model:
- CPU for smaller models or lighter tasks.
- GPU or TPU for enhanced performance with heavier models.
For this setup, we’ll use the free CPU option with basic hardware: a 2-core CPU and 16 GB of RAM, sufficient for a lightweight PaliGemma model.
Finally, we set the Visibility of the Space. Choosing Public allows others to interact with your model, while Private restricts access.
After filling in these options, we click Create Space to finalize. In the following figure, we can see the above steps.
With these steps, we’ve successfully created a Gradio Space. The URL for our newly created Space will look like this: https://huggingface.co/spaces/pyimagesearch/test_space.
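As a side note, the same Space can also be created programmatically rather than through the web UI. Below is a minimal sketch using the huggingface_hub library, assuming it is installed and we are authenticated (e.g., via huggingface-cli login); the repo_id simply reuses the example Space name from above:

```python
# A minimal sketch (assumption: huggingface_hub is installed and an
# access token is configured via `huggingface-cli login`).
from huggingface_hub import create_repo

create_repo(
    repo_id="pyimagesearch/test_space",  # owner/space-name from the example above
    repo_type="space",                   # create a Space rather than a model repo
    space_sdk="gradio",                  # the app template we selected
    private=False,                       # public visibility
)
```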
Creating Files in Hugging Face Spaces
Now that our Space is set up, we can add the code to build our app. For this example, we’ll use the Visual Question Answering code from the tutorial, Vision-Language Model: PaliGemma for Image Description Generator and More.
To set up the code, we need two files:
- requirements.txt: Here, we’ll specify the Python dependencies our app requires.
- app.py: This file will contain the main app logic.
Once both files are created and populated, the Space will automatically start downloading dependencies, and then build and launch our app.
To create these files directly in Spaces:
Navigate to the “Files” tab:
- This tab is available in the newly created Space’s top header.
- Here, we can see an option to add files directly.
Create each file:
- Click Add File, then choose Create new file.
- Enter the filename (requirements.txt or app.py) in the “File name” field.
- Add the code or dependencies needed in the editor box provided.
- Finally, commit the newly created file.
Alternatively, if we prefer to work locally and push updates:
- Clone the Space: Use the terminal command git clone <Space URL> to clone it to our local system.
- Edit the files locally in our preferred IDE (integrated development environment).
- Push changes back to Hugging Face Spaces with git add ., git commit -m "Add app files", and git push.
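If we’d rather avoid Git altogether, the huggingface_hub library can also push individual files to the Space from Python. A minimal sketch, again assuming we are authenticated and reusing the example Space name:

```python
# A minimal sketch (assumption: huggingface_hub is installed and we
# are authenticated); uploads the two app files to the example Space.
from huggingface_hub import HfApi

api = HfApi()
for filename in ["requirements.txt", "app.py"]:
    api.upload_file(
        path_or_fileobj=filename,            # local file to upload
        path_in_repo=filename,               # destination path in the Space
        repo_id="pyimagesearch/test_space",  # example Space from above
        repo_type="space",
    )
```

Each upload creates a commit, which triggers a rebuild of the Space just like committing through the web editor.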
Adding Code to the Files
requirements.txt
In requirements.txt, we list the libraries needed for the Visual Question Answering task, one per line:

```
transformers
torch
peft
bitsandbytes
gradio
```
app.py
In app.py, we add the main logic for our Gradio app, including the model loading, inference function, and interface setup.
```python
import gradio as gr
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
import torch

# Load model and processor
model_id = "pyimagesearch/finetuned_paligemma_vqav2_small"
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained("google/paligemma-3b-pt-224")

# Define inference function
def process_image(image, prompt):
    # Process the image and prompt using the processor
    inputs = processor(image.convert("RGB"), prompt, return_tensors="pt")
    try:
        # Generate output from the model
        output = model.generate(**inputs, max_new_tokens=20)
        # Decode and return the output
        decoded_output = processor.decode(output[0], skip_special_tokens=True)
        # Return the answer (exclude the prompt part from output)
        return decoded_output[len(prompt):]
    except IndexError as e:
        print(f"IndexError: {e}")
        return "An error occurred during processing."

# Define the Gradio interface
inputs = [
    gr.Image(type="pil"),
    gr.Textbox(label="Prompt", placeholder="Enter your question")
]
outputs = gr.Textbox(label="Answer")

# Create the Gradio app
demo = gr.Interface(fn=process_image,
                    inputs=inputs,
                    outputs=outputs,
                    title="Visual Question Answering with Fine-tuned PaliGemma Model",
                    description="Upload an image and ask questions to get answers.")

# Launch the app
demo.launch()
```
Finalizing the App
Once both files are saved, Spaces will automatically install the dependencies from requirements.txt, build the app, and launch it for users to interact with. The app will now be accessible at the Space’s URL for anyone to try!
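Beyond the browser UI, a deployed Gradio Space also exposes a programmatic API. As a hedged sketch (the Space id reuses our example, and the default api_name for a gr.Interface is assumed), the gradio_client library can query the deployed app from Python:

```python
# A hypothetical sketch using the gradio_client library
# (pip install gradio_client); the Space id, local image file, and
# api_name are assumptions based on the example Space above.
from gradio_client import Client, handle_file

client = Client("pyimagesearch/test_space")
answer = client.predict(
    handle_file("example.jpg"),    # image input (hypothetical local file)
    "What is in this image?",      # prompt
    api_name="/predict",           # default endpoint for a gr.Interface
)
print(answer)
```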
In the figure below, we can see the Spaces demo for the Visual Question Answering task.
Check out each task in action on Hugging Face Spaces:
- Visual Question Answering on Hugging Face Spaces
- Document Understanding on Hugging Face Spaces
- Image Captioning on Hugging Face Spaces
- Video Captioning on Hugging Face Spaces
What's next? We recommend PyImageSearch University.
86 total classes • 115+ hours of on-demand code walkthrough videos • Last updated: October 2024
★★★★★ 4.84 (128 Ratings) • 16,000+ Students Enrolled
I strongly believe that if you had the right teacher you could master computer vision and deep learning.
Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?
That’s not the case.
All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that’s exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.
If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you’ll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.
Inside PyImageSearch University you'll find:
- ✓ 86 courses on essential computer vision, deep learning, and OpenCV topics
- ✓ 86 Certificates of Completion
- ✓ 115+ hours of on-demand video
- ✓ Brand new courses released regularly, ensuring you can keep up with state-of-the-art techniques
- ✓ Pre-configured Jupyter Notebooks in Google Colab
- ✓ Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
- ✓ Access to centralized code repos for all 540+ tutorials on PyImageSearch
- ✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
- ✓ Access on mobile, laptop, desktop, etc.
Summary
Hugging Face Spaces provides a powerful platform to deploy and share machine learning applications with an interactive, browser-based interface. This tutorial demonstrates how to use Gradio to deploy the PaliGemma model on Spaces, enabling real-time user interaction with the model. By the end, you’ll have a live, accessible ML demo that showcases your model’s capabilities effectively.
As an added benefit, Gradio Spaces can be embedded into external websites, allowing seamless integration into your existing platforms. For more information on embedding Gradio Spaces, you can explore this link.
Citation Information
Thakur, P. “Deploy Gradio Apps on Hugging Face Spaces,” PyImageSearch, P. Chugh, S. Huot, and G. Kudriavtsev, eds., 2024, https://pyimg.co/lkuqi
```
@incollection{Thakur_2024_deploy-gradio-apps-on-hugging-face-spaces,
  author = {Piyush Thakur},
  title = {{Deploy Gradio Apps on Hugging Face Spaces}},
  booktitle = {PyImageSearch},
  editor = {Puneet Chugh and Susan Huot and Georgii Kudriavtsev},
  year = {2024},
  url = {https://pyimg.co/lkuqi},
}
```
To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), simply enter your email address in the form below!
Download the Source Code and FREE 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!
Comment section
Hey, Adrian Rosebrock here, author and creator of PyImageSearch. While I love hearing from readers, a couple years ago I made the tough decision to no longer offer 1:1 help over blog post comments.
At the time I was receiving 200+ emails per day and another 100+ blog post comments. I simply did not have the time to moderate and respond to them all, and the sheer volume of requests was taking a toll on me.
Instead, my goal is to do the most good for the computer vision, deep learning, and OpenCV community at large by focusing my time on authoring high-quality blog posts, tutorials, and books/courses.
If you need help learning computer vision and deep learning, I suggest you refer to my full catalog of books and courses — they have helped tens of thousands of developers, students, and researchers just like yourself learn Computer Vision, Deep Learning, and OpenCV.
Click here to browse my full catalog.