marqo-ai/marqo-ecommerce-embeddings
State-of-the-art embedding models fine-tuned for the ecommerce domain, with a reported +67% improvement on evaluation metrics versus ViT-B-16-SigLIP.
Built on OpenCLIP with a multimodal vision-language architecture, these models enable both text-to-image and image-to-text retrieval for product search. Two size variants (203M and 652M parameters) offer deployment flexibility, with inference optimized for GPU acceleration (5-11ms per batch on an A10G). The repo includes evaluation datasets (GoogleShopping-1m, AmazonProducts-3m) and integrates with both HuggingFace Transformers and OpenCLIP loaders for easy adoption.
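The retrieval described above boils down to comparing unit-normalized embeddings with cosine similarity. A minimal sketch, using toy NumPy vectors in place of real model outputs (the `rank_products` helper and the embedding values are illustrative, not from the repo):

```python
import numpy as np

def rank_products(query_emb, product_embs):
    """Rank product embeddings by cosine similarity to a query embedding.

    CLIP-style embeddings are compared with cosine similarity, so both
    sides are L2-normalized before the dot product.
    """
    q = query_emb / np.linalg.norm(query_emb)
    p = product_embs / np.linalg.norm(product_embs, axis=1, keepdims=True)
    scores = p @ q
    return np.argsort(-scores), scores

# Toy 4-dim embeddings standing in for real model outputs.
query = np.array([1.0, 0.0, 0.0, 0.0])
products = np.array([
    [0.9, 0.1, 0.0, 0.0],   # close to the query
    [0.0, 1.0, 0.0, 0.0],   # orthogonal to the query
    [0.7, 0.7, 0.0, 0.0],   # in between
])
order, scores = rank_products(query, products)
print(list(order))  # indices of products, most similar first
```

The same ranking works in either direction (text query against image embeddings, or image query against text embeddings), since both modalities are projected into the shared embedding space.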
No commits in the last 6 months.
Stars: 45
Forks: 2
Language: Python
License: —
Category: —
Last pushed: Nov 13, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/marqo-ai/marqo-ecommerce-embeddings"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
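The same request can be made from Python with only the standard library. A minimal sketch: the URL shape is taken from the curl command above, while the `quality_url` helper is illustrative and the response schema is not documented here, so parsing is left generic:

```python
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality endpoint URL for a repository (path shape from the curl example)."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

url = quality_url("marqo-ai", "marqo-ecommerce-embeddings")
print(url)

# To fetch (counts against the 100 requests/day free tier):
# import json, urllib.request
# data = json.load(urllib.request.urlopen(url))
```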
Higher-rated alternatives
ssrajadh/sentrysearch
Semantic search over videos using Gemini Embedding 2.
hayabhay/frogbase
Transform audio-visual content into navigable knowledge.
zilliz-bootcamp/audio_search
This project uses PANNs for audio tagging and sound event detection, and finally gets audio...
kyegomez/Pegasus
PegasusX: The Future of Multimodal Embeddings 🦄 🦄
ashvardanian/SwiftSemanticSearch
Real-time on-device text-to-image and image-to-image Semantic Search with video stream camera...