Embeddings Are Easier Than Whatever You're Doing Instead
You're writing regex, tuning Elasticsearch, or building keyword indexes. Meanwhile, 20 lines of code and a $0.02 API call would give you better results. Here's how to stop overcomplicating it.
Every week, developers build elaborate keyword search systems, tune Elasticsearch clusters, or write increasingly brittle regex pipelines to match text. And every week, they could have replaced the whole thing with an embedding API call and a Postgres query.
Embeddings sound like ML research. They're not. In 2026, generating an embedding is an API call that costs fractions of a cent. Storing and searching embeddings is a Postgres extension. The entire workflow is 20 lines of code and the results are dramatically better than keyword matching for almost any text similarity, search, or classification task.
If you're building anything that involves matching, searching, or comparing text and you're not using embeddings, you're working too hard.
What embeddings actually are (in 30 seconds)
An embedding turns text into a list of numbers — a vector. Similar text produces similar vectors. "government infrastructure tender" and "public works construction bid" have almost identical embeddings even though they share zero keywords. That's the magic: semantic similarity without keyword matching.
You generate an embedding with one API call:
```python
import openai  # pip install openai; reads OPENAI_API_KEY from the environment

response = openai.embeddings.create(
    model="text-embedding-3-small",
    input="your text here",
)
vector = response.data[0].embedding  # list of 1536 floats
```
That's it. No model training, no data pipeline, no GPU. One API call, one vector, less than $0.00002.
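"Similar text produces similar vectors" is usually measured with cosine similarity: the angle between two vectors, ignoring their length. A minimal pure-Python sketch (pgvector's <=> operator computes cosine *distance*, which is one minus this value):

```python
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score 1.0; orthogonal vectors score 0.0.
```

In production you never write this yourself — the database does the ranking — but it is the entire conceptual core of semantic search.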
The storage question: just use Postgres
The vector database market wants you to believe you need specialised infrastructure. For most applications, you don't.
pgvector is a Postgres extension that adds vector column types and similarity search. If you already run Postgres (and you probably do), you add the extension, create a column, and query with ORDER BY embedding <=> query_vector LIMIT 10. No new database, no new ops burden, no new vendor.
The honest breakdown:
| Option | When to use | When not to |
|---|---|---|
| pgvector in Postgres | Under ~1M vectors, you already use Postgres, you want simplicity | Billions of vectors, need sub-millisecond latency at scale |
| SQLite + extensions | Local-first apps, prototyping, single-user tools | Concurrent writes, production multi-user |
| Dedicated vector DB | 10M+ vectors, need advanced filtering + vector search, performance-critical | Under 1M vectors (overkill), tight budgets (another service to run) |
The threshold is roughly 1 million vectors. Below that, pgvector in your existing database is simpler, cheaper, and fast enough. Above that, or if you need complex filtered vector search at scale, the dedicated vector databases earn their keep:
| Project | Score | Stars | What it does |
|---|---|---|---|
| qdrant | 94/100 | 29,544 | Qdrant - High-performance, massive-scale Vector Database and Vector Search... |
| chroma | 94/100 | 26,607 | Open-source search and retrieval database for AI applications. |
| weaviate | 94/100 | 15,793 | Weaviate is an open-source vector database that stores both objects and... |
| lancedb | 94/100 | 9,425 | Developer-friendly OSS embedded retrieval library for multimodal AI. Search... |
All four score 94/100 — these are well-maintained, well-adopted projects. Qdrant and Weaviate are the production-grade options. Chroma is the developer-friendly choice (great for prototyping, simple API). LanceDB is serverless and embedded — no separate process to run.
But if your dataset is under a million records: pgvector. In the Postgres you already have. Move on.
Which embedding model?
Another area where the decision is simpler than it looks:
If you have an OpenAI API key: Use text-embedding-3-small. It's cheap ($0.02 per million tokens), good quality, and you don't have to run anything. For better quality at 6x the cost, text-embedding-3-large.
If you want to run locally: fastembed (74/100) from Qdrant. Lightweight, fast, runs on CPU. mlx-embeddings (69/100) for Apple Silicon.
If you need the best open-source model: FlagEmbedding (79/100, 11,395 stars) — the BGE embedding models that consistently top the MTEB benchmark (99/100).
If you want a full pipeline, not just embeddings: txtai (91/100) wraps embeddings, indexing, and search into one framework. Good for prototyping end-to-end semantic search without assembling components.
What embeddings replace
This is the part nobody tells you clearly enough. Embeddings don't supplement your existing approach. They replace it, and the replacement is better:
- Keyword search — Embeddings find semantically similar results even when keywords differ. "cheap flights to Paris" matches "budget airfare France." No synonym dictionaries, no stemming rules, no query expansion. Just vectors.
- Regex matching — If you're writing regex to classify or route text, embeddings do it better with less code. Embed your categories, embed the input, take the closest match. Works across languages too.
- Elasticsearch / Solr — For most teams, the operational cost of running an Elasticsearch cluster far exceeds what they get from it. pgvector + embeddings gives better relevance with zero additional infrastructure.
- TF-IDF / BM25 — These are good algorithms that embeddings have surpassed. bm25s (80/100) is an excellent BM25 implementation if you need it as a baseline or hybrid, but pure embedding search outperforms it on most benchmarks.
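The regex-replacement point is worth making concrete: "embed your categories, embed the input, take the closest match" is about ten lines. A sketch with tiny hand-written vectors standing in for real embeddings (in practice you would embed each category description and the input text with the same model):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def classify(input_vec, categories):
    # categories: {label: embedding}; return the label closest to the input.
    return max(categories, key=lambda label: cosine(input_vec, categories[label]))

# Toy 3-d vectors stand in for real 1536-d embeddings.
categories = {
    "billing": [0.9, 0.1, 0.0],
    "support": [0.1, 0.9, 0.0],
    "sales":   [0.0, 0.1, 0.9],
}
print(classify([0.8, 0.2, 0.1], categories))  # billing
```

No patterns to maintain, and adding a category means adding one embedding, not another regex.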
The cost reality
The biggest misconception about embeddings is that they're expensive. They were in 2023. In 2026:
- Embedding 100,000 documents: ~$2 with OpenAI's small model. One-time cost.
- Embedding each search query: ~$0.00002. Essentially free.
- Storage: 1M vectors at 1536 dimensions = ~6GB. Fits in a standard Postgres instance.
- Compute: pgvector similarity search over 1M vectors takes milliseconds on commodity hardware.
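The storage figure is simple arithmetic: pgvector stores each dimension as a 4-byte single-precision float, so the raw vector data (before row and index overhead) works out as:

```python
vectors = 1_000_000
dimensions = 1536
bytes_per_dimension = 4  # pgvector stores single-precision floats

total_bytes = vectors * dimensions * bytes_per_dimension
print(f"{total_bytes / 1e9:.1f} GB")  # 6.1 GB of raw vector data
```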
Compare this to running an Elasticsearch cluster ($50-500/month), maintaining a keyword search pipeline (ongoing engineering time), or the developer hours spent writing and debugging regex patterns. Embeddings are cheaper on every dimension.
Getting started in 15 minutes
The practical path:
- Get an embedding API key. OpenAI, Cohere, or Voyage. Or install fastembed for local.
- Add pgvector to your Postgres. `CREATE EXTENSION vector;` One line.
- Create a column. `ALTER TABLE documents ADD COLUMN embedding vector(1536);`
- Embed your data. Loop through your rows, call the API, store the vector.
- Search. `SELECT * FROM documents ORDER BY embedding <=> $query_vector LIMIT 10;`
That's the entire implementation. No ML pipeline, no model training, no GPU, no Kubernetes. If you've been putting off "adding AI" to your application because it seemed complex, this is the shortcut.
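If you want to see the search step without standing up Postgres first, it's just nearest-neighbour ranking. A brute-force in-memory sketch (toy 2-d vectors in place of real embeddings; pgvector performs the same ranking, just indexed and at scale):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def search(query_vec, documents, k=10):
    # documents: list of (text, embedding) pairs; rank by similarity, keep top k.
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

docs = [
    ("budget airfare France", [0.9, 0.1]),
    ("pgvector release notes", [0.1, 0.9]),
]
print(search([0.8, 0.2], docs, k=1))  # ['budget airfare France']
```

Swap the list for a table and the sort for `ORDER BY embedding <=> $query_vector` and you have the production version.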
Go deeper
Every project mentioned here has a quality-scored page in our directory, updated daily:
- All 113 embedding categories — from sentence transformers to graph embeddings to multimodal
- Vector database categories — Qdrant, Chroma, Weaviate, pgvector ecosystem, and more
- Trending embedding projects — what's moving this week
- RAG categories — where embeddings meet retrieval-augmented generation