BLIP Image Captioning Transformer Models

This category covers end-to-end image captioning systems built on BLIP models, including web interfaces, fine-tuning, batch processing, and caption generation. It does not include general vision-language models, CLIP embeddings, or non-captioning vision tasks such as classification or object detection.

There are 25 BLIP image captioning models tracked. One scores above 50 (Established tier). The highest-rated is label-sleuth/label-sleuth at 54/100, with 271 stars and 146 monthly downloads.

Get all 25 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=blip-image-captioning&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
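The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, the query parameters are copied from the curl command above, and the shape of the returned JSON is not documented here, so the sketch returns the parsed response without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="blip-image-captioning", limit=20):
    """Build the query URL (hypothetical helper; defaults mirror the curl command)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and return the parsed JSON body as-is (schema not assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example usage (performs a network request, so it is left commented out):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Unauthenticated calls count against the 100 requests/day quota, so it may be worth caching the response locally rather than refetching it on every run.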

| # | Model | Description | Score | Tier |
|---|---|---|---|---|
| 1 | label-sleuth/label-sleuth | Open source no-code system for text annotation and building of text classifiers | 54 | Established |
| 2 | CVHub520/X-AnyLabeling-Server | A Simple, Lightweight, and Extensible Serving Framework for X-AnyLabeling | 47 | Emerging |
| 3 | antoninodimaggio/Hugging-Captions | Generate realistic Instagram captions using transformers 🤗 | 27 | Experimental |
| 4 | VisioSphereAI/labelvim | This is a python based standalone image annotation tool designed for tasks... | 25 | Experimental |
| 5 | hem9984/Dataset-label | This will allow you to choose your labels, and then label every image in a... | 24 | Experimental |
| 6 | Merserk/Caption-Creator | Caption Creator is a fast and portable tool for generating high-quality... | 23 | Experimental |
| 7 | FuxiaoLiu/VisualNews-Repository | [EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning | 21 | Experimental |
| 8 | tharun-ship-it/image-to-text-generator | 🖼️BLIP-powered Image-to-Text Generator achieving 136.7 CIDEr score on... | 19 | Experimental |
| 9 | dmdin/SceneDescriptor | 🎞 Video editor with description generation for MTS TrueTech Hack | 19 | Experimental |
| 10 | d-senyaka/pix-scribe | AI Image-to-Story Generator. Upload an image → describe it using BLIP Then,... | 17 | Experimental |
| 11 | eray-yuztyurk/python-ai-image-captioning | AI-powered image captioning app using the BLIP model. Instantly generate... | 17 | Experimental |
| 12 | GregoryKogan/gigachat-vision-task | Test assignment for the GigaChat Vision team | 16 | Experimental |
| 13 | Asimo-o/blipren_release | 🚀 Train any LLM with BLIPren, a flexible architecture that adapts to your... | 16 | Experimental |
| 14 | filipe-braiman/cv-aircraft-inspection | A deep‑learning system that classifies aircraft surface damage as dent or... | 15 | Experimental |
| 15 | mahalrs/newsgen | Multi-Modal Image Generation for News Stories | 15 | Experimental |
| 16 | mozartsempiano/psykos | Bot that fetches random images from Tumblr, analyzes their aesthetics, and... | 15 | Experimental |
| 17 | MNJMARIA/blip_image_captioning | BLIP image captioning model with Gradio interface. Upload an image and get... | 14 | Experimental |
| 18 | ash-01xor/Imgcap | A CLI to generate captions for images | 12 | Experimental |
| 19 | ShrijithSM/Image-Captioning-AI | An image captioning app using BLIP and Gradio to generate AI-based captions... | 12 | Experimental |
| 20 | devtitus/Image-Caption-with-Pretrained-model | A simple yet powerful image captioning application that uses Salesforce's... | 11 | Experimental |
| 21 | enigmatronix13/Neural-Style-Transfer | Flask-based web app that performs Neural Style Transfer (NST) using... | 11 | Experimental |
| 22 | ai-art-dev99/vision-language-caption-vqa | End-to-end BLIP + LLaVA project for image captioning and VQA with... | 11 | Experimental |
| 23 | spongedsc/SpongeML | SocketIO server providing a CharacterAI proxy and Image Captioning | 11 | Experimental |
| 24 | coffeedrunkpanda/multimodal-api | A FastAPI service that leverages BLIP-2 transformer models for image... | 11 | Experimental |
| 25 | eren23/blipren_release | BLIP-2 implementation for training vision-language models. Q-Former + frozen... | 10 | Experimental |