coffeedrunkpanda/multimodal-api
A FastAPI service that leverages BLIP-2 transformer models for image understanding. Features include automatic image captioning and visual question answering (VQA), all containerized with Docker for easy deployment.
No commits in the last 6 months.
Stars: n/a
Forks: n/a
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Oct 03, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/coffeedrunkpanda/multimodal-api"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
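The same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `/quality/{category}/{owner}/{repo}` path pattern is inferred from the single curl example above, and the response is assumed (not confirmed) to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a category and an 'owner/name' repo.

    Assumption: the path pattern generalizes from the one example shown.
    """
    # quote() leaves '/' unescaped by default, so 'owner/name' stays two path segments.
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and decode the body.

    Assumption: the body is JSON; its fields are not documented here.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a network request and counts against the daily limit):
# data = fetch_quality("transformers", "coffeedrunkpanda/multimodal-api")
# print(data)
```

Keeping URL construction separate from the network call makes the path logic easy to check without spending any of the 100 keyless requests per day.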
Higher-rated alternatives
label-sleuth/label-sleuth
Open source no-code system for text annotation and building of text classifiers
CVHub520/X-AnyLabeling-Server
A Simple, Lightweight, and Extensible Serving Framework for X-AnyLabeling
antoninodimaggio/Hugging-Captions
Generate realistic Instagram captions using transformers 🤗
FuxiaoLiu/VisualNews-Repository
[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning
VisioSphereAI/labelvim
This is a python based standalone image annotation tool designed for tasks such as image...