rom1504/clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them

Score: 56 / 100 (Established)

# Technical Summary

Provides modular components for CLIP-based semantic search at scale: high-speed inference (1,500 samples/s on an RTX 3080 GPU), efficient vector indexing via FAISS, and a Flask backend with a Python client API for remote querying. The pipeline integrates with img2dataset for data acquisition and supports filtering, deduplication, and safety and aesthetic scoring of retrieved results. It is designed for billion-scale deployment, with end-to-end orchestration from raw image URLs through to an indexed retrieval UI.
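The retrieval step described above can be illustrated with a minimal sketch. This is not clip-retrieval's API: it is a brute-force, pure-Python stand-in for the nearest-neighbor search that FAISS performs, and the three-dimensional vectors are toy values rather than real CLIP embeddings. The helper names `normalize` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query: Sequence[float],
          index: Sequence[Sequence[float]],
          k: int = 2) -> List[Tuple[int, float]]:
    """Return (item_id, cosine_similarity) for the k most similar indexed vectors.

    Assumes every vector in `index` is already unit-normalized.
    """
    q = normalize(query)
    scored = [(sum(a * b for a, b in zip(q, item)), item_id)
              for item_id, item in enumerate(index)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(item_id, score) for score, item_id in scored[:k]]


# Toy "image embeddings" (assumed values, not produced by a CLIP model).
index = [normalize(v) for v in ([1.0, 0.0, 0.0],
                                [0.9, 0.1, 0.0],
                                [0.0, 1.0, 0.0])]

# Toy "text embedding" used as the query.
results = top_k([1.0, 0.05, 0.0], index, k=2)
print(results)
```

A linear scan like this costs one dot product per indexed item, which is why a billion-scale deployment relies on an approximate index such as FAISS instead.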

2,733 stars. No commits in the last 6 months. Available on PyPI.

Status: stale (no commits in 6 months)
Maintenance 2 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 19 / 25


Stars: 2,733
Forks: 240
Language: Jupyter Notebook
License: MIT
Last pushed: Aug 15, 2025
Commits (30d): 0
Dependencies: 28

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/rom1504/clip-retrieval"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
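The same request can be made from Python using only the standard library. The sketch below rests on two assumptions: that the URL path is laid out as category, owner, repository (inferred from the single curl example above), and that the endpoint returns JSON, which this page does not state. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    parts = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str,
                  timeout: float = 10.0) -> Any:
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("embeddings", "rom1504", "clip-retrieval")` issues the same GET request as the curl command and counts against the daily request limit.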