Vision Agent Platforms AI Agents
Tools and frameworks for building production AI agents that process visual data from cameras, videos, or images in real time. Includes multi-modal vision APIs, streaming video analysis, and visual perception systems. Does NOT include general computer vision libraries, image processing utilities, or non-agentic vision applications.
34 vision agent platform projects are tracked. One scores above 70 (Verified tier). The highest-rated is GetStream/Vision-Agents at 88/100, with 7,366 stars and 19,360 monthly downloads. One of the top 10 is actively maintained.
Get all 34 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=agents&subcategory=vision-agent-platforms&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
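The same request can be made from a script. The following is a minimal Python sketch, not an official client: it rebuilds the URL from the curl example above and decodes the response as JSON. The helper names `build_url` and `fetch_projects` are hypothetical, and the response schema is not documented here, so the code makes no assumptions about its fields. The `limit=20` value is copied from the curl example; whether a larger value is accepted is not stated in the source.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="agents", subcategory="vision-agent-platforms", limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Send a GET request and decode the body as JSON.

    The response structure is not assumed; the decoded value is returned as-is.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs the network request. Each call counts against the 100 requests/day keyless allowance, so it may be worth caching the result locally.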
| # | Agent | Score | Tier |
|---|---|---|---|
| 1 | GetStream/Vision-Agents: Open Vision Agents by Stream. Build Vision Agents quickly with any model or... | 88 | Verified |
| 2 | video-db/videodb-capture-quickstart: Give your agents real time desktop perception. Stream screen, microphone,... | | Emerging |
| 3 | grctest/g3n-fastapi-webcam-docker: Utilizing multiple Gemma 3n agents to analyze webcam footage | | Experimental |
| 4 | Karmacoke/chargen: AI-powered character generator built with React. Create detailed TRPG/Novel... | | Experimental |
| 5 | leukaemiamedtech/hias-tassai-facial-recognition: HIAS TassAI Facial Recognition Agent processes streams from local or remote... | | Experimental |
| 6 | TheSethRose/AI-File-Organizer-Agent: Uses an AI agent (powered by Google Gemini via the Agno framework) to... | | Experimental |
| 7 | Arshveen-singh/Vision-CLI: Please contact me at Arshveensingh@proton.me | | Experimental |
| 8 | mohammad-oghli/Wildlife-Agentic-Vision: Google Gemini Agentic AI Vision for Wildlife Analytics | | Experimental |
| 9 | SSusantAchary/video-point-tracker: Local-first multimodal video point tracking tool for people monitoring,... | | Experimental |
| 10 | jhhfut/eyefriend: Real-time AI vision assistant for visually impaired users. | | Experimental |
| 11 | Eatosin/Structura: Turn Chaos Into Structure. A Type-Safe AI Agent that extracts valid JSON... | | Experimental |
| 12 | eric-ai-lab/Screen-Point-and-Read: Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with... | | Experimental |
| 13 | rupac4530-creator/vision-agent: Production-grade multi-modal AI platform — 17 real-time vision & audio tabs,... | | Experimental |
| 14 | prodev717/vitaura: AI-powered civic issue management system that classifies citizen reports... | | Experimental |
| 15 | sijeeshmiziha/visionagent: Multi-provider AI agent framework with vision capabilities and tool calling.... | | Experimental |
| 16 | yusef1975/SortAI: SortAI is a minimalist desktop automation tool designed for students and... | | Experimental |
| 17 | Linda5823/Magic-Point-to-Read-V3: 🪄 Magic Point-to-Read: An interactive AI reading assistant using Google... | | Experimental |
| 18 | Senju14/focus-bounty-ai: An autonomous AI Agent that uses Computer Vision and LLM reasoning to... | | Experimental |
| 19 | LatinScribe/siloam-public: Siloam helps visually impaired individuals gain real-time awareness of their... | | Experimental |
| 20 | KazKozDev/vision-agent-analyst: Vision Agent Analyst is a professional web application for automatic... | | Experimental |
| 21 | nrbnayon/Stream-Lab: Stream Lab lets you watch movies online anytime, anywhere. Create your own... | | Experimental |
| 22 | GPTBOTS/gptbots-chrome-extension: Based on the LINE WEB backend, it provides features such as automatic... | | Experimental |
| 23 | neurobot-ai/neurobot-vision: Train and validate computer vision models with integrated tools supporting... | | Experimental |
| 24 | imediacorp/file-organizer: Open-source AI-powered file organizer with scientific rigour. Built by... | | Experimental |
| 25 | willytop8/Live-Environment-Streams: A master GeoJSON repository of 1,500+ live outdoor webcams globally.... | | Experimental |
| 26 | burgerman/vision_ai_insight: Vision AI-high-security and smart image analysis | | Experimental |
| 27 | Chihuah/AgentsThinkWrite: An example of using GPT to build custom roles and a workflow of divided generation and combined review. Example of GPT-driven role-based prompt... | | Experimental |
| 28 | Dewiin/blind-spot: CUNY Tech Prep 2025 Project | | Experimental |
| 29 | biswajit-debnath/IntelliJournal: Modern journal app powered by OpenAI GPT-4, featuring real-time analysis,... | | Experimental |
| 30 | Mahboob-A/drishti-ai: Eye Disease Detection Using Vision Agents \| https://youtu.be/8LUT89UYnSc | | Experimental |
| 31 | sarveshgupta89/pots_image_extractor: pots_image_extractor | | Experimental |
| 32 | techySPHINX/AetheriaScribe: Unleash your imagination: Next.js streams dynamic tales crafted by Gemini AI... | | Experimental |
| 33 | burgerman/robotics_skills: Vision AI agent powered by VLM | | Experimental |
| 34 | mikhailusov/askGPT: AskGPT is a real-time AI assistant that enhances your calls, interviews,... | | Experimental |