Embeddings Are Easier Than Whatever You're Doing Instead

You're writing regex, tuning Elasticsearch, or building keyword indexes. Meanwhile, 20 lines of code and a fraction-of-a-cent API call would give you better results. Here's how to stop overcomplicating it.

Graham Rowe · April 01, 2026
Tags: embeddings · vector-db · rag

Every week, developers build elaborate keyword search systems, tune Elasticsearch clusters, or write increasingly brittle regex pipelines to match text. And every week, they could have replaced the whole thing with an embedding API call and a Postgres query.

Embeddings sound like ML research. They're not. In 2026, generating an embedding is an API call that costs fractions of a cent. Storing and searching embeddings is a Postgres extension. The entire workflow is 20 lines of code and the results are dramatically better than keyword matching for almost any text similarity, search, or classification task.

If you're building anything that involves matching, searching, or comparing text and you're not using embeddings, you're working too hard.

What embeddings actually are (in 30 seconds)

An embedding turns text into a list of numbers — a vector. Similar text produces similar vectors. "government infrastructure tender" and "public works construction bid" have almost identical embeddings even though they share zero keywords. That's the magic: semantic similarity without keyword matching.
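The "similar vectors" idea reduces to cosine similarity. A minimal sketch, using toy 3-dimensional vectors as stand-ins for real 1536-dimensional embeddings (the numbers here are invented for illustration, not real model output):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy vectors standing in for real embeddings of each phrase.
tender = [0.90, 0.10, 0.20]  # "government infrastructure tender"
bid    = [0.85, 0.15, 0.25]  # "public works construction bid"
recipe = [0.10, 0.90, 0.30]  # "chocolate cake recipe"

print(cosine_similarity(tender, bid))     # near 1.0: semantically similar
print(cosine_similarity(tender, recipe))  # much lower: unrelated topics
```

Real embeddings behave the same way, just in 1536 dimensions: phrases with similar meaning land close together even with zero shared keywords.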

You generate an embedding with one API call:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="your text here",
)
vector = response.data[0].embedding  # list of 1536 floats

That's it. No model training, no data pipeline, no GPU. One API call, one vector, less than $0.00002.

The storage question: just use Postgres

The vector database market wants you to believe you need specialised infrastructure. For most applications, you don't.

pgvector is a Postgres extension that adds vector column types and similarity search. If you already run Postgres (and you probably do), you add the extension, create a column, and query with ORDER BY embedding <=> query_vector LIMIT 10. No new database, no new ops burden, no new vendor.

The honest breakdown:

| Option | When to use | When not to |
| --- | --- | --- |
| pgvector in Postgres | Under ~1M vectors, you already use Postgres, you want simplicity | Billions of vectors, need sub-millisecond latency at scale |
| SQLite + extensions | Local-first apps, prototyping, single-user tools | Concurrent writes, production multi-user |
| Dedicated vector DB | 10M+ vectors, need advanced filtering + vector search, performance-critical | Under 1M vectors (overkill), tight budgets (another service to run) |

The threshold is roughly 1 million vectors. Below that, pgvector in your existing database is simpler, cheaper, and fast enough. Above that, or if you need complex filtered vector search at scale, the dedicated vector databases earn their keep:

| Project | Score | Stars | What it does |
| --- | --- | --- | --- |
| qdrant | 94/100 | 29,544 | Qdrant - High-performance, massive-scale Vector Database and Vector Search... |
| chroma | 94/100 | 26,607 | Open-source search and retrieval database for AI applications. |
| weaviate | 94/100 | 15,793 | Weaviate is an open-source vector database that stores both objects and... |
| lancedb | 94/100 | 9,425 | Developer-friendly OSS embedded retrieval library for multimodal AI. Search... |

All four score 94/100 — these are well-maintained, well-adopted projects. Qdrant and Weaviate are the production-grade options. Chroma is the developer-friendly choice (great for prototyping, simple API). LanceDB is serverless and embedded — no separate process to run.

But if your dataset is under a million records: pgvector. In the Postgres you already have. Move on.

Which embedding model?

Another area where the decision is simpler than it looks:

If you have an OpenAI API key: Use text-embedding-3-small. It's cheap ($0.02 per million tokens), good quality, and you don't have to run anything. For better quality at 6x the cost, text-embedding-3-large.

If you want to run locally: fastembed (74/100) from Qdrant. Lightweight, fast, runs on CPU. mlx-embeddings (69/100) for Apple Silicon.

If you need the best open-source model: FlagEmbedding (79/100, 11,395 stars) — the BGE embedding models that consistently top the MTEB benchmark (99/100).

If you want a full pipeline, not just embeddings: txtai (91/100) wraps embeddings, indexing, and search into one framework. Good for prototyping end-to-end semantic search without assembling components.

What embeddings replace

This is the part nobody tells you clearly enough. Embeddings don't supplement your existing approach. They replace it, and the replacement is better:

  • Keyword search — Embeddings find semantically similar results even when keywords differ. "cheap flights to Paris" matches "budget airfare France." No synonym dictionaries, no stemming rules, no query expansion. Just vectors.
  • Regex matching — If you're writing regex to classify or route text, embeddings do it better with less code. Embed your categories, embed the input, take the closest match. Works across languages too.
  • Elasticsearch / Solr — For most teams, the operational cost of running an Elasticsearch cluster far exceeds what they get from it. pgvector + embeddings gives better relevance with zero additional infrastructure.
  • TF-IDF / BM25 — These are good algorithms that embeddings have surpassed. bm25s (80/100) is an excellent BM25 implementation if you need it as a baseline or hybrid, but pure embedding search outperforms it on most benchmarks.
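The regex-replacement point above — embed your categories, embed the input, take the closest match — fits in a few lines. A sketch with toy vectors (in practice each vector would come from the embedding API, and the category names here are invented examples):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# One embedding per category (toy 3-dim vectors; real ones come from the API).
categories = {
    "billing": [0.9, 0.1, 0.0],
    "support": [0.1, 0.9, 0.1],
    "sales":   [0.0, 0.2, 0.9],
}

def classify(query_embedding):
    # Nearest-category classification: highest cosine similarity wins.
    return max(categories, key=lambda name: cosine(categories[name], query_embedding))

print(classify([0.8, 0.2, 0.1]))  # "billing"
```

No patterns to maintain: adding a category is adding one row, and the same code works for any language the embedding model covers.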

The cost reality

The biggest misconception about embeddings is that they're expensive. They were in 2023. In 2026:

  • Embedding 100,000 documents: ~$2 with OpenAI's small model. One-time cost.
  • Embedding each search query: ~$0.00002. Essentially free.
  • Storage: 1M vectors at 1536 dimensions = ~6GB. Fits in a standard Postgres instance.
  • Compute: pgvector similarity search over 1M vectors takes milliseconds on commodity hardware.

Compare this to running an Elasticsearch cluster ($50-500/month), maintaining a keyword search pipeline (ongoing engineering time), or the developer hours spent writing and debugging regex patterns. Embeddings are cheaper on every dimension.

Getting started in 15 minutes

The practical path:

  1. Get an embedding API key. OpenAI, Cohere, or Voyage. Or install fastembed for local.
  2. Add pgvector to your Postgres. CREATE EXTENSION vector; One line.
  3. Create a column. ALTER TABLE documents ADD COLUMN embedding vector(1536);
  4. Embed your data. Loop through your rows, call the API, store the vector.
  5. Search. SELECT * FROM documents ORDER BY embedding <=> $query_vector LIMIT 10;
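What the `<=>` query in step 5 computes can be sketched in plain Python: a brute-force version of the cosine-distance ranking that pgvector accelerates with an index. The toy 3-dim vectors and the `documents` rows here are invented stand-ins:

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator: cosine distance = 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Stand-in for the documents table: (id, embedding) rows.
documents = [
    (1, [0.90, 0.10, 0.10]),
    (2, [0.10, 0.90, 0.20]),
    (3, [0.85, 0.20, 0.10]),
]

def search(query_vector, limit=10):
    # Equivalent of: SELECT id FROM documents ORDER BY embedding <=> $q LIMIT $limit;
    ranked = sorted(documents, key=lambda row: cosine_distance(row[1], query_vector))
    return [doc_id for doc_id, _ in ranked[:limit]]

print(search([0.9, 0.15, 0.1], limit=2))  # ids of the two nearest documents
```

The SQL version does exactly this ranking, except the database handles the scan (or an HNSW/IVFFlat index, once you add one) so you never loop over rows yourself.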

That's the entire implementation. No ML pipeline, no model training, no GPU, no Kubernetes. If you've been putting off "adding AI" to your application because it seemed complex, this is the shortcut.

Go deeper

Every project mentioned here has a quality-scored page in our directory, updated daily.
