coffeedrunkpanda/multimodal-api

A FastAPI service that uses BLIP-2 transformer models for image understanding. It provides automatic image captioning and visual question answering (VQA), and is containerized with Docker for deployment.

/ 100

Experimental

No commits in the last 6 months.

Stale 6m · No Package · No Dependents

Maintenance 2 / 25

Adoption 0 / 25

Maturity 9 / 25

Community 0 / 25

Stars

—

Forks

—

Language

Jupyter Notebook

License

MIT

Category

blip-image-captioning

Last pushed

Oct 03, 2025

Commits (30d)

GitHub

BLIP Image Captioning · 25 models

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/coffeedrunkpanda/multimodal-api"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
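The same request can be sketched in Python using only the standard library. This is a minimal sketch, not documented client code: the helper names `quality_url` and `fetch_quality` are hypothetical, the reading of the path segments as category, owner, and repository is inferred from the curl URL above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality URL for one repository.

    Assumption: the three path segments are category, owner, and repo,
    as inferred from the curl example.
    """
    parts = (category, owner, repo)
    # Percent-encode each segment so unusual characters cannot break the path.
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response.

    Assumption: the endpoint returns a JSON body; this is not documented here.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("transformers", "coffeedrunkpanda", "multimodal-api")
# print(data)
```

Unauthenticated use is limited to 100 requests/day, so a script that polls many repositories may want to cache responses locally.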

Higher-rated alternatives

label-sleuth/label-sleuth

Open source no-code system for text annotation and building of text classifiers

CVHub520/X-AnyLabeling-Server

A Simple, Lightweight, and Extensible Serving Framework for X-AnyLabeling

antoninodimaggio/Hugging-Captions

Generate realistic Instagram captions using transformers 🤗

FuxiaoLiu/VisualNews-Repository

[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning

VisioSphereAI/labelvim

This is a python based standalone image annotation tool designed for tasks such as image...
