whwu95/GPT4Vis
GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?
Leverages GPT-4's multimodal capabilities for zero-shot classification by generating rich semantic descriptions for class labels and processing visual inputs (images, videos, point clouds) through GPT-4V's vision API. The approach evaluates performance across 16 benchmark datasets spanning three modalities, providing pre-generated prompts and reproducible inference scripts alongside comprehensive results and ground-truth annotations.
185 stars. No commits in the last 6 months.
Stars
185
Forks
18
Language
Python
License
MIT
Category
Last pushed
May 22, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/whwu95/GPT4Vis"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
promptslab/openai-detector
AI classifier for indicating AI-written text
vorniches/snap2txt
Convert your project into a text prompt.
awekrx/ChatGPT-MidJourney-prompt
This is a ChatGPT based prompt generation model for MidJorney. The purpose of this model is to...
narenaryan/Vidura
A beautiful and elegant chat GPT prompt management system
Bradybry/chatXML
A proposal for a structured LLM prompt method.