IMSUVEN/wubba
Wubba learns layout-invariant embeddings from raw HTML using contrastive learning. Convert any HTML document into a fixed-size vector for similarity search, clustering, or classification.
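To illustrate what a fixed-size vector makes possible, here is a minimal sketch of similarity search by cosine similarity. It does not call Wubba: the vectors are hand-written placeholders standing in for embeddings, and the names `cosine_similarity`, `rank_by_similarity`, and the document IDs are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, corpus):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder vectors (assumption): stand-ins for embeddings of HTML pages,
# not real Wubba output. Real embeddings would have many more dimensions.
corpus = {
    "product_page_a": [0.9, 0.1, 0.0, 0.1],
    "product_page_b": [0.8, 0.2, 0.1, 0.0],
    "blog_post": [0.0, 0.1, 0.9, 0.3],
}
query = [0.85, 0.15, 0.05, 0.05]

ranking = rank_by_similarity(query, corpus)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Because every document maps to a vector of the same length, the same comparison works for any pair of pages, which is what makes clustering and nearest-neighbor lookup straightforward.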
Stars: —
Forks: —
Language: Python
License: —
Category:
Last pushed: Feb 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/IMSUVEN/wubba"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
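The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that other repositories follow the same `category/owner/repo` path pattern as the URL above. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Build the endpoint URL (path pattern inferred from the single example)."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the listing data; assumes the response body is JSON."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires network access, so it is left commented out):
# data = fetch_quality("embeddings", "IMSUVEN", "wubba")
# print(data)
```

With the keyless limit of 100 requests/day, it is worth caching responses locally rather than re-fetching the same repository repeatedly.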
Higher-rated alternatives
ContextualAI/gritlm
Generative Representational Instruction Tuning
xlang-ai/instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
liuqidong07/LLMEmb
[AAAI'25 Oral] The official implementation code of LLMEmb
hpcaitech/CachedEmbedding
A memory efficient DLRM training solution using ColossalAI
shobrook/weightgain
Train an adapter for any embedding model in under a minute