Vision Agent Platforms AI Agents

Tools and frameworks for building production AI agents that process visual data from cameras, videos, or images in real-time. Includes multi-modal vision APIs, streaming video analysis, and visual perception systems. Does NOT include general computer vision libraries, image processing utilities, or non-agentic vision applications.

34 vision agent platform projects are tracked. One scores above 70 (Verified tier). The highest-rated is GetStream/Vision-Agents at 88/100 with 7,366 stars and 19,360 monthly downloads. One of the top 10 is actively maintained.

Get all 34 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=agents&subcategory=vision-agent-platforms&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
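The same request can be made from a script. The sketch below uses only the Python standard library and rebuilds the URL from the curl command above. The helper names `build_url` and `fetch_projects` are illustrative, not part of any published client, and the shape of the JSON response is not documented here, so the code prints the response rather than assuming field names. The command as shown passes `limit=20`; whether a larger value returns all 34 projects is an assumption to verify against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and query parameters copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="agents", subcategory="vision-agent-platforms", limit=20):
    """Build the dataset URL (hypothetical helper, not an official client)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response schema is not documented in this page, so the parsed
    value is returned as-is without assuming any field names.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_projects(build_url())
    # Print a truncated preview to inspect the structure before relying on it.
    print(json.dumps(data, indent=2)[:2000])
```

Without a key, the stated quota is 100 requests/day, so it is worth caching the response locally while exploring it.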

| # | Agent | Description | Score | Tier |
|---|---|---|---|---|
| 1 | GetStream/Vision-Agents | Open Vision Agents by Stream. Build Vision Agents quickly with any model or... | 88 | Verified |
| 2 | video-db/videodb-capture-quickstart | Give your agents real time desktop perception. Stream screen, microphone,... | 42 | Emerging |
| 3 | grctest/g3n-fastapi-webcam-docker | Utilizing multiple Gemma 3n agents to analyze webcam footage | 29 | Experimental |
| 4 | Karmacoke/chargen | AI-powered character generator built with React. Create detailed TRPG/Novel... | 28 | Experimental |
| 5 | leukaemiamedtech/hias-tassai-facial-recognition | HIAS TassAI Facial Recognition Agent processes streams from local or remote... | 26 | Experimental |
| 6 | TheSethRose/AI-File-Organizer-Agent | Uses an AI agent (powered by Google Gemini via the Agno framework) to... | 24 | Experimental |
| 7 | Arshveen-singh/Vision-CLI | Please contact me at- Arshveensingh@proton.me | 23 | Experimental |
| 8 | mohammad-oghli/Wildlife-Agentic-Vision | Google Gemini Agentic AI Vision for Wildlife Analytics | 22 | Experimental |
| 9 | SSusantAchary/video-point-tracker | Local-first multimodal video point tracking tool for people monitoring,... | 22 | Experimental |
| 10 | jhhfut/eyefriend | Real-time AI vision assistant for visually impaired users. | 22 | Experimental |
| 11 | Eatosin/Structura | Turn Chaos Into Structure. A Type-Safe AI Agent that extracts valid JSON... | 21 | Experimental |
| 12 | eric-ai-lab/Screen-Point-and-Read | Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with... | 20 | Experimental |
| 13 | rupac4530-creator/vision-agent | Production-grade multi-modal AI platform — 17 real-time vision & audio tabs,... | 20 | Experimental |
| 14 | prodev717/vitaura | AI-powered civic issue management system that classifies citizen reports... | 20 | Experimental |
| 15 | sijeeshmiziha/visionagent | Multi-provider AI agent framework with vision capabilities and tool calling.... | 20 | Experimental |
| 16 | yusef1975/SortAI | SortAI: is a minimalist desktop automation tool designed for students and... | 19 | Experimental |
| 17 | Linda5823/Magic-Point-to-Read-V3 | 🪄 Magic Point-to-Read: An interactive AI reading assistant using Google... | 19 | Experimental |
| 18 | Senju14/focus-bounty-ai | An autonomous AI Agent that uses Computer Vision and LLM reasoning to... | 19 | Experimental |
| 19 | LatinScribe/siloam-public | Siloam helps visually impaired individuals gain real-time awareness of their... | 18 | Experimental |
| 20 | KazKozDev/vision-agent-analyst | Vision Agent Analyst is a professional web application for automatic... | 17 | Experimental |
| 21 | nrbnayon/Stream-Lab | Stream Lab lets you watch movies online anytime, anywhere. Create your own... | 17 | Experimental |
| 22 | GPTBOTS/gptbots-chrome-extension | Based on the LINE WEB backend, it provides features such as automatic... | 17 | Experimental |
| 23 | neurobot-ai/neurobot-vision | Train and validate computer vision models with integrated tools supporting... | 16 | Experimental |
| 24 | imediacorp/file-organizer | Open-source AI-powered file organizer with scientific rigour. Built by... | 15 | Experimental |
| 25 | willytop8/Live-Environment-Streams | A master GeoJSON repository of 1,500+ live outdoor webcams globally.... | 14 | Experimental |
| 26 | burgerman/vision_ai_insight | Vision AI-high-security and smart image analysis | 14 | Experimental |
| 27 | Chihuah/AgentsThinkWrite | Example of using GPT to build custom roles, with a workflow for divided generation and combined review. Example of GPT-driven role-based prompt... | 13 | Experimental |
| 28 | Dewiin/blind-spot | CUNY Tech Prep 2025 Project | 13 | Experimental |
| 29 | biswajit-debnath/IntelliJournal | Modern journal app powered by OpenAI GPT-4, featuring real-time analysis,... | 12 | Experimental |
| 30 | Mahboob-A/drishti-ai | Eye Disease Detection Using Vision Agents \| https://youtu.be/8LUT89UYnSc | 12 | Experimental |
| 31 | sarveshgupta89/pots_image_extractor | pots_image_extractor | 11 | Experimental |
| 32 | techySPHINX/AetheriaScribe | Unleash your imagination: Next.js streams dynamic tales crafted by Gemini AI... | 11 | Experimental |
| 33 | burgerman/robotics_skills | Vision AI agent powered by VLM | 11 | Experimental |
| 34 | mikhailusov/askGPT | AskGPT is a real-time AI assistant that enhances your calls, interviews,... | 11 | Experimental |