unum-cloud/UForm
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
UForm combines Matryoshka-style embeddings, which can be truncated down to 64 dimensions, with quantization-aware training for fast semantic search via USearch. Its generative models pair ViT encoders with compact language models (Qwen, LLaMA) for image captioning and visual question answering (VQA). It exports native ONNX models with bfloat16 support across Python, JavaScript, and Swift for edge deployment, from servers to mobile devices.
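The Matryoshka-style truncation mentioned above can be illustrated with a minimal sketch. It does not call UForm or USearch: the vectors are random toy data, the brute-force scan stands in for a real index, and the function names (truncate_and_normalize, search) are hypothetical. The assumption is that the leading 64 components of a full embedding remain usable after re-normalization.

```python
import math
import random


def truncate_and_normalize(vec, dims=64):
    """Keep the first `dims` components and rescale them to unit L2 norm."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def search(query, corpus, dims=64, top_k=3):
    """Brute-force nearest neighbours over truncated embeddings.

    A real deployment would hand the truncated vectors to an index
    such as USearch; this linear scan is only a stand-in.
    """
    q = truncate_and_normalize(query, dims)
    scored = [
        (cosine(q, truncate_and_normalize(v, dims)), i)
        for i, v in enumerate(corpus)
    ]
    scored.sort(reverse=True)
    return [(i, score) for score, i in scored[:top_k]]


# Toy data: five random 256-dimensional "embeddings" (an assumed size).
rng = random.Random(0)
corpus = [[rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(5)]
query = list(corpus[2])  # an exact copy of item 2, so it should rank first

results = search(query, corpus, dims=64, top_k=3)
```

Truncating before indexing cuts memory and comparison cost roughly in proportion to the dimensions dropped, at some cost in ranking quality that depends on the model.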
1,221 stars and 2,280 monthly downloads. Available on PyPI.
Stars: 1,221
Forks: 76
Language: Python
License: Apache-2.0
Category
Last pushed: Oct 30, 2025
Monthly downloads: 2,280
Commits (30d): 0
Dependencies: 4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/unum-cloud/UForm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
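The same request can be made from Python with only the standard library. This is a sketch under assumptions: the path is read as category/owner/repository based on the single URL shown above, the response is assumed to be JSON, and the helper names (quality_url, fetch_quality) are hypothetical. No API key handling is shown because the listing does not say how a key is passed.

```python
import json
import urllib.request

# Base path taken from the curl example; its segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("embeddings", "unum-cloud", "UForm")
```

With the unauthenticated limit of 100 requests per day, caching the decoded result locally is a reasonable precaution.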
Related tools
rom1504/clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
mazzzystar/Queryable
Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.
Ubaida-M-Yusuf/Makimus-AI
AI-powered media search: find images and videos using natural language or visual queries
s-emanuilov/litepali
LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing,...
HEGOM61ita/OffGallery
AI cataloger for photographic images · Compatible with Lightroom · Automatic tags, metadata...