fastembed and open-text-embeddings
FastEmbed is a lightweight library that runs embedding models in-process, while open-text-embeddings wraps embedding models in an OpenAI-compatible API server; this makes them complementary tools for different deployment patterns (in-process vs. remote service).
About fastembed
qdrant/fastembed
Fast, Accurate, Lightweight Python library to make State of the Art Embedding
Leverages ONNX Runtime instead of PyTorch to minimize dependencies and enable deployment in serverless environments like AWS Lambda. Supports dense embeddings, sparse embeddings (SPLADE++), late-interaction models (ColBERT), image embeddings, and cross-encoder reranking—with extensibility for custom models. Integrates directly with Qdrant vector database for end-to-end semantic search workflows.
About open-text-embeddings
rag-wtf/open-text-embeddings
Open Source Text Embedding Models with OpenAI Compatible API
Implements a FastAPI server that wraps HuggingFace sentence-transformer and BGE/E5 models behind an OpenAI-compatible `/embeddings` endpoint, enabling drop-in replacement for OpenAI's embeddings API. Intelligently handles model-specific prefixing strategies—automatically applying query vs. document prefixes based on input type (string vs. list)—critical for optimal performance with instruction-tuned models like BAAI/bge and intfloat/e5 series. Supports both on-premise deployment (CPU/GPU modes) and cloud hosting via AWS Lambda or Modal, with LangChain integration for seamless adoption in RAG pipelines.
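The prefixing behavior described above can be sketched in a few lines. This is a simplified, hypothetical illustration (the function name and exact prefixes follow the E5 convention, where queries get `"query: "` and documents get `"passage: "`), not the server's actual implementation: a bare string is treated as a search query, a list of strings as documents.

```python
def apply_e5_prefix(input_data):
    """Hypothetical sketch: prefix inputs the way E5-style models expect.

    A single string is assumed to be a query; a list of strings is
    assumed to be a batch of documents/passages.
    """
    if isinstance(input_data, str):
        return "query: " + input_data
    return ["passage: " + text for text in input_data]

print(apply_e5_prefix("what is a vector database?"))
print(apply_e5_prefix(["Qdrant stores vectors.", "FAISS is a library."]))
```

Handling this server-side matters because clients of the OpenAI-style `/embeddings` endpoint typically send raw text and are unaware of model-specific prompt conventions.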