Hugging Face Archives - PyImageSearch

SmolVLM to SmolVLM2: Compact Models for Multi-Image VQA

June 23, 2025

Table of Contents SmolVLM to SmolVLM2: Compact Models for Multi-Image VQA SmolVLM 1: A Compact Yet Capable Vision-Language Model What Is SmolVLM? Why SmolVLM? The Three Variants of SmolVLM Architecture Overview Vision Encoder: SigLIP Variants Pixel Shuffle (Space-to-Depth) for Image…

Gradio

Hugging Face Spaces

Interactive Applications

Machine Learning Deployment

Tutorial

Deploy Gradio Apps on Hugging Face Spaces

December 30, 2024

Table of Contents Deploy Gradio Apps on Hugging Face Spaces What Is Hugging Face Spaces? Setup Creating Files in Hugging Face Spaces Adding Code to the Files requirements.txt app.py Finalizing the App Summary Citation Information Deploy Gradio Apps on Hugging…

Vision-Language Model

Fine Tune PaliGemma with QLoRA for Visual Question Answering

December 2, 2024

Table of Contents Fine Tune PaliGemma with QLoRA for Visual Question Answering What Is PaliGemma? What Is a Vision-Language Model? Architecture of PaliGemma How Is PaliGemma Trained? Available Model Checkpoints Use Cases of PaliGemma Why PaliGemma? Inference with PaliGemma Setup…

Advanced AI Configurations

AI in Healthcare

AI Tool Integration

AI Training and Inference

Edge Computing with AI

Fine-Tuning Models

Foundational Models

Large Language Models

LLM Configuration

Local LLM Frameworks

Text Generation Web UI

Tutorial

Exploring Oobabooga Text Generation Web UI: Installation, Features, and Fine-Tuning Llama Model with LoRA

July 1, 2024

Table of Contents Exploring Oobabooga Text Generation Web UI: Installation, Features, and Fine-Tuning Llama Model with LoRA Introduction What’s in Store for You? Overview of Oobabooga Text Generation Web UI Interface Overview User Interaction Model Response Action Buttons Why Is…

Computer Vision

Machine Learning

Optical Character Recognition

Traffic Monitoring

Web Applications

Automatic License Plate Reader Using OCR in Python

June 10, 2024

Table of Contents Automatic License Plate Reader Using OCR in Python License Plate Reader A Small Survey of License Plate Reader Methods Modern-Day Object Detectors Owlv2 PaddleOCR Architecture of PaddleOCR Configuring Your Development Environment Setup and Imports Object Detection OCR…

Artificial Intelligence

Sharpen Your Vision: Super-Resolution of CCTV Images Using Hugging Face Diffusers

June 3, 2024

Table of Contents Sharpen Your Vision: Super-Resolution of CCTV Images Using Hugging Face Diffusers Configuring Your Development Environment Problem Statement How Does Super-Resolution Solve This? State-of-the-Art Approaches Generative Adversarial Networks (GANs) Diffusion Models Implementing Diffusion-Based Upscaler Using Hugging Face 🤗…

Artificial Intelligence

Canny

Computer Vision

ControlNet

ControlNet Conditioning

Segment Anything Model (SAM)

Text-to-Image

Tutorial

Understanding Tasks in Diffusers: Part 3

May 27, 2024

Table of Contents Understanding Tasks in Diffusers: Part 3 Introduction Why Not Image-to-Image? ControlNet Models Configuring Your Development Environment Setup and Imports Installation Imports Utility Functions Canny ControlNet Setting Up Loading the Model Optimizing the Pipeline Image Generation Cleaning Up…

Large Language Models

Local Large Language Models

Model Management

Software Installation

Tutorial

Inside Look: Exploring Ollama for On-Device AI

May 20, 2024

Table of Contents Inside Look: Exploring Ollama for On-Device AI Introduction to Ollama Overview of Ollama Installing Ollama on a MacOS Ollama’s Model Registry: A Treasure Trove of LLMs Ollama as a Command Line Interface Tool History and Contextual Awareness…

Artificial Intelligence

Segment Anything Model (SAM)

Text-to-Image

Tutorial

Step-by-Step Guide to Open-Source Implementation of Generative Fill: Part 2

March 25, 2024

Table of Contents Step-by-Step Guide to Open-Source Implementation of Generative Fill: Part 2 Configuring Your Development Environment Need Help Configuring Your Development Environment? Implementation Essential Setup: Libraries and Imports for Generative Image Editing How to Prompt Language Models for Image…

Topics

Books & Courses

PyImageSearch

You can learn Computer Vision, Deep Learning, and OpenCV.
