Muennighoff/sgpt
SGPT: GPT Sentence Embeddings for Semantic Search
Implements both bi-encoder and cross-encoder architectures for GPT models. The bi-encoder uses parameter-efficient fine-tuning (BitFit, which updates only the bias tensors) and position-weighted mean pooling, and covers both symmetric and asymmetric search tasks. Integrates with Hugging Face Transformers and Sentence Transformers. Pre-trained models are fine-tuned on the MS MARCO and NLI datasets and evaluated on the BEIR and USEB benchmarks. The newer successor, GritLM, unifies bi-encoder, cross-encoder, and generative capabilities in a single model.
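As a rough illustration of position-weighted mean pooling, the sketch below weights each token's hidden state by its 1-based position, so later tokens (which have attended to more context in a causal model) contribute more to the sentence embedding. It is a dependency-free sketch written with plain Python lists, not the repository's code; the function name `position_weighted_mean_pool` and the list-based inputs are assumptions made for this example.

```python
from typing import List, Optional, Sequence


def position_weighted_mean_pool(
    hidden_states: Sequence[Sequence[float]],
    attention_mask: Optional[Sequence[int]] = None,
) -> List[float]:
    """Pool per-token vectors into one vector, weighting token i (1-based) by i.

    hidden_states: one vector per token, all of the same dimension.
    attention_mask: 1 for real tokens, 0 for padding (hypothetical input format).
    """
    n_tokens = len(hidden_states)
    if n_tokens == 0:
        raise ValueError("hidden_states must contain at least one token")
    if attention_mask is None:
        attention_mask = [1] * n_tokens
    if len(attention_mask) != n_tokens:
        raise ValueError("attention_mask length must match the number of tokens")

    # Raw weight is the 1-based position; padding tokens get weight 0.
    weights = [(i + 1) * m for i, m in enumerate(attention_mask)]
    total = sum(weights)
    if total == 0:
        raise ValueError("attention_mask masks out every token")

    dim = len(hidden_states[0])
    pooled = [0.0] * dim
    for w, vec in zip(weights, hidden_states):
        for d in range(dim):
            # Normalizing by the weight sum makes this a weighted mean.
            pooled[d] += w * vec[d] / total
    return pooled
```

In practice this would operate on batched tensors from a Transformers model; the list version only shows the weighting scheme.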
873 stars. No commits in the last 6 months.
Stars: 873
Forks: 52
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Feb 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/Muennighoff/sgpt"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
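The same request can be made from Python with the standard library. This is a minimal sketch: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path pattern is inferred only from the single curl URL above, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com"


def build_quality_url(owner: str, repo: str, category: str = "embeddings") -> str:
    """Build the endpoint URL; the /quality/<category>/<owner>/<repo> layout is an assumption."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/api/v1/quality/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "embeddings",
                  timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text (format not assumed)."""
    with urlopen(build_quality_url(owner, repo, category), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request and counts against the daily limit):
# body = fetch_quality("Muennighoff", "sgpt")
```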
Higher-rated alternatives
artitw/text2text
Text2Text Language Modeling Toolkit
Azure-Samples/azure-ai-document-processing-samples
A collection of samples demonstrating techniques for processing documents with Azure AI...
build-on-aws/langchain-embeddings
This repository demonstrates the construction of a state-of-the-art multimodal search engine,...
aiplanethub/beyondllm
Build, evaluate and observe LLM apps
qianniuspace/llm_notebooks
A collection of AI application examples