MichiganNLP/Scalable-VLM-Probing

Probe Vision-Language Models

Score: 32 / 100 (Emerging)

This project helps AI researchers and practitioners evaluate how well vision-language models (VLMs) like CLIP understand the relationship between images and text. It takes an existing dataset of image-sentence pairs and VLM output scores, then correlates these scores with various linguistic features to identify what the model is actually 'seeing' or 'understanding'. You would use this if you are developing or applying VLMs and need to understand their semantic strengths and weaknesses without extensive manual annotation.
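The core idea above — correlating VLM output scores with linguistic features of the paired sentences — can be sketched in a few lines. This is a hedged illustration, not the repository's actual code: the data is toy data, "sentence length" stands in for whatever linguistic feature you probe, and the Spearman correlation is computed by hand for self-containment.

```python
# Illustrative sketch (not the repo's API): correlate hypothetical VLM
# image-sentence match scores with a simple linguistic feature.

def rank(values):
    # Rank values from 1..n (this toy data has no ties).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman(xs, ys):
    # Spearman rank correlation via the classic 1 - 6*sum(d^2)/(n(n^2-1)) formula.
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rank(xs), rank(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Toy data: one VLM score per image-sentence pair, plus a per-sentence feature.
vlm_scores = [0.91, 0.34, 0.78, 0.12, 0.66, 0.45]
sentence_lengths = [5, 14, 7, 18, 9, 12]  # hypothetical feature: token count

rho = spearman(vlm_scores, sentence_lengths)
print(f"Spearman rho = {rho:.2f}")  # -1.00: longer sentences score lower here
```

A strong correlation (positive or negative) between scores and a feature suggests the model is sensitive to that feature, which is the kind of insight this project surfaces without manual annotation.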

No commits in the last 6 months.

Use this if you want to gain deeper insights into why a vision-language model performs well or poorly on specific image-text combinations by analyzing linguistic patterns.

Not ideal if you are looking for a tool to train new vision-language models or for a general-purpose image classification or captioning application.

AI-evaluation model-interpretability natural-language-processing computer-vision semantic-analysis
Stale (6 months) · No Package · No Dependents

Maintenance: 0 / 25
Adoption: 4 / 25
Maturity: 16 / 25
Community: 12 / 25

How are scores calculated?
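The overall 32 / 100 shown above matches the sum of the four 25-point components listed in the breakdown. The actual scoring rubric is not documented on this page, so the sketch below only verifies that arithmetic relationship:

```python
# Hedged sketch: the overall score appears to be the sum of four 25-point
# components; the real rubric behind each component is not shown here.
components = {"Maintenance": 0, "Adoption": 4, "Maturity": 16, "Community": 12}
overall = sum(components.values())
print(f"{overall} / {25 * len(components)}")  # 32 / 100
```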

Stars: 5
Forks: 1
Language: Python
License: Apache-2.0
Last pushed: Jul 27, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/MichiganNLP/Scalable-VLM-Probing"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
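The same endpoint can be queried from Python using only the standard library. This is a minimal sketch: the response schema is not documented on this page, so the code just prints whatever JSON the API returns.

```python
# Hedged sketch: fetch the quality data shown on this page via the API.
# The response fields are assumed to be JSON; inspect them before relying on any.
import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/MichiganNLP/Scalable-VLM-Probing")

with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))
```

No API key is required at the free tier (100 requests/day); with a free key the limit rises to 1,000/day.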