shibing624/similarities

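Similarities: a toolkit for similarity calculation and semantic search. It supports text-to-text, text-to-image, and image-to-image search over datasets on the order of hundreds of millions of records, is written in Python 3, and works out of the box.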

66 / 100 (Established)

Implements multiple semantic matching architectures including CoSENT and CLIP models with support for various embedding-based search backends (Faiss, Annoy, HNSW) optimized for billion-scale retrieval. Provides unified APIs for computing text, image, and cross-modal similarities using pre-trained transformer models from Hugging Face, with additional literal matching methods (BM25, Word2Vec, SimHash) for cold-start scenarios. Includes CLI tools, FastAPI backend services, and Gradio frontends for production deployment of search and clustering pipelines.
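To make the embedding-based search idea concrete, here is a minimal sketch of brute-force cosine-similarity retrieval over a few toy vectors. It is not the library's API: the function names `cosine_similarity` and `top_matches`, the document ids, and the 3-dimensional vectors are all illustrative assumptions. In practice the vectors would come from a pre-trained model, and an index such as Faiss, Annoy, or HNSW would replace the exhaustive scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, corpus_vecs, topn=2):
    """Hypothetical helper: score every corpus vector and return the topn (id, score) pairs."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in corpus_vecs.items()
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy 3-dimensional "embeddings" (assumed values, not model output)
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

results = top_matches([1.0, 0.0, 0.0], corpus, topn=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.4f}")
```

The exhaustive scan costs one similarity computation per corpus item, which is why approximate nearest-neighbor indexes become attractive as the corpus grows.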

899 stars and 404 monthly downloads. Actively maintained with 1 commit in the last 30 days. Available on PyPI.

Maintenance 13 / 25
Adoption 16 / 25
Maturity 18 / 25
Community 19 / 25


Stars: 899
Forks: 90
Language: Python
License: Apache-2.0
Last pushed: Mar 05, 2026
Monthly downloads: 404
Commits (30d): 1
Dependencies: 7

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/shibing624/similarities"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
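The same request as the curl command above can be sketched in Python with only the standard library. Two things here are assumptions rather than documented facts: that the path segments after `quality/` mean category, owner, and repository, and that the endpoint returns JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Assemble the endpoint URL (assumed layout: category/owner/repo)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record; assumes the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access
url = build_quality_url("embeddings", "shibing624", "similarities")
print(url)

# Uncomment to perform the request (counts against the daily request limit):
# data = fetch_quality("embeddings", "shibing624", "similarities")
# print(data)
```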