dsdlt/mongodb-scalable-document-embeddings

Generate embeddings at scale using MongoDB Atlas Stream Processing and MongoDB Atlas Vector Search

13 / 100 (Experimental)

This project helps data engineers and platform teams process large volumes of unstructured text, such as song lyrics or articles, as it arrives. It takes raw text documents, generates embeddings (numerical vectors that capture their meaning), and stores them in a MongoDB database. This enables semantic search and analysis of the documents.

No commits in the last 6 months.

Use this if you need to continuously process and embed large, streaming volumes of text documents in real time, so that they become searchable by meaning as soon as they are stored.

Not ideal if you only have a small, static set of documents to embed or primarily need simple keyword search rather than semantic understanding.

data-engineering real-time-analytics text-processing semantic-search document-management
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 0 / 25


Stars: 12
Forks:
Language: Python
License:
Last pushed: Apr 12, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/dsdlt/mongodb-scalable-document-embeddings"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
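
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the URL path segments are a category (`embeddings`), a repository owner, and a repository name, and that the endpoint returns JSON; neither assumption is confirmed by this page. The names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL for one project.

    Assumption: the path layout is /<category>/<owner>/<repo>,
    inferred from the single example URL on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for one project without an API key.

    Assumption: the response body is JSON. Its schema is not
    documented on this page, so the parsed value is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "dsdlt", "mongodb-scalable-document-embeddings")` would issue the same request as the curl command. Unauthenticated calls count against the 100 requests/day limit, so a caller polling many projects may want to cache results.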