dgarnitz/vectorflow

VectorFlow is a high-volume vector embedding pipeline that ingests raw data, transforms it into vectors, and writes it to a vector DB of your choice.

42 / 100 (Emerging)

Implements fault-tolerant batch processing with RabbitMQ queuing and PostgreSQL job tracking, and supports pluggable embedding models (OpenAI, HuggingFace Sentence Transformers) and vector databases (Pinecone, Qdrant, Weaviate). Provides flexible document chunking strategies (exact, paragraph, sentence, custom) with configurable overlap, and includes a Python client library for programmatic access. Designed for Kubernetes deployment, with a Docker Compose setup that includes MinIO object storage and automatic database schema initialization.
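
To illustrate what "exact" chunking with configurable overlap means, here is a minimal sketch. It is not VectorFlow's own code: the function name `chunk_exact` and its parameters are hypothetical, and it counts characters for simplicity, whereas the project may measure chunks in other units.

```python
def chunk_exact(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into fixed-size character windows.

    Consecutive windows share `overlap` characters, so context that
    straddles a chunk boundary appears in both neighbouring chunks.
    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so no
        # trailing chunk consists purely of already-covered characters.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    print(chunk_exact("abcdefghij", chunk_size=4, overlap=2))
    # prints ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of producing more chunks, and therefore more embedding calls.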

698 stars. No commits in the last 6 months.

Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 16 / 25
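
The scoring formula is not given here, but the four category scores listed above do add up to the overall score of 42 out of 100. The short check below only confirms that arithmetic and assumes nothing else about how each category is derived.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 16,
}

overall = sum(category_scores.values())
maximum = 25 * len(category_scores)
print(f"{overall} / {maximum}")  # prints "42 / 100"
```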


Stars 698
Forks 51
Language Python
License Apache-2.0
Last pushed May 16, 2024
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/dgarnitz/vectorflow"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
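
The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the path follows a category/owner/repo layout (inferred from the single example URL above) and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path layout is assumed from the curl example."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    data = fetch_quality("embeddings", "dgarnitz", "vectorflow")
    print(json.dumps(data, indent=2))
```

Since the response schema is not documented on this page, the sketch prints the decoded payload as-is rather than reading specific fields.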