raj-tyagi/4CLIP-Image-Captioning
This repository presents 4CLIP, a novel approach to image captioning that enhances traditional models by dividing images into four quadrants and processing them individually. By leveraging a pretrained ViT-GPT2 model from Hugging Face, 4CLIP generates more detailed and comprehensive captions, making it suitable for fine-grained visual tasks.
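The quadrant idea described above can be sketched in a few lines. This is a minimal sketch, not code from the repository: the Pillow-based split is generic, and the captioning step, including the `nlpconnect/vit-gpt2-image-captioning` checkpoint name, is an assumption about which pretrained ViT-GPT2 model is meant.

```python
from PIL import Image


def split_quadrants(img: Image.Image) -> list[Image.Image]:
    """Split an image into four quadrants: top-left, top-right,
    bottom-left, bottom-right."""
    w, h = img.size
    return [
        img.crop((0, 0, w // 2, h // 2)),        # top-left
        img.crop((w // 2, 0, w, h // 2)),        # top-right
        img.crop((0, h // 2, w // 2, h)),        # bottom-left
        img.crop((w // 2, h // 2, w, h)),        # bottom-right
    ]


def caption_quadrants(img: Image.Image, captioner) -> list[str]:
    """Caption each quadrant separately with a Hugging Face
    image-to-text pipeline, e.g. (assumed checkpoint name):

        captioner = transformers.pipeline(
            "image-to-text",
            model="nlpconnect/vit-gpt2-image-captioning",
        )
    """
    return [captioner(q)[0]["generated_text"] for q in split_quadrants(img)]
```

The four per-quadrant captions could then be concatenated or post-processed into a single, more detailed description of the full image.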
No commits in the last 6 months.
Stars: —
Forks: 2
Language: Python
License: MIT
Category: —
Last pushed: Sep 29, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/raj-tyagi/4CLIP-Image-Captioning"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Higher-rated alternatives
ntrang086/image_captioning
generate captions for images using a CNN-RNN model that is trained on the Microsoft Common...
fregu856/CS224n_project
Neural Image Captioning in TensorFlow.
vacancy/SceneGraphParser
A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic...
ltguo19/VSUA-Captioning
Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019
Abdelrhman-Yasser/video-content-description
Video content description model for generating descriptions for unconstrained videos