AstraBert/ingest-anything
From data to vector database effortlessly
Supports diverse file formats (DOCX, CSV, JSON, XML, code files) and web content through format-specific pipelines: text files are converted with PdfItDown before being chunked with Chonkie, while code files go through a semantic-aware CodeChunker. Integrates with LlamaIndex vector stores (Qdrant, Weaviate) and multiple embedding providers (Sentence Transformers, OpenAI, Cohere), and includes an agentic RAG interface for automated document ingestion and querying workflows.
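The idea of format-specific pipelines can be illustrated with a minimal routing sketch. This is not the ingest-anything API: the function name choose_pipeline, the pipeline labels, and the set of code extensions are all hypothetical and chosen only to show how an input might be dispatched by type before conversion and chunking.

```python
from pathlib import Path

# Hypothetical extension sets for illustration only; the real project
# may classify files differently.
TEXT_EXTENSIONS = {".docx", ".csv", ".json", ".xml"}
CODE_EXTENSIONS = {".py", ".js", ".go", ".rs"}


def choose_pipeline(source: str) -> str:
    """Return the name of the pipeline this sketch would route `source` to."""
    # Web content is recognised by its URL scheme before any suffix check.
    if source.startswith(("http://", "https://")):
        return "web"
    suffix = Path(source).suffix.lower()
    if suffix in CODE_EXTENSIONS:
        return "code"   # would go to a code-aware chunker
    if suffix in TEXT_EXTENSIONS:
        return "text"   # would be converted, then chunked as text
    return "unsupported"
```

For example, choose_pipeline("report.docx") returns "text", choose_pipeline("main.py") returns "code", and any http(s) URL returns "web".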
No commits in the last 6 months. Available on PyPI.
Stars: 89
Forks: 12
Language: Python
License: MIT
Category:
Last pushed: May 17, 2025
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/AstraBert/ingest-anything"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
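The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint path follows the pattern category/owner/repo seen in the single URL above, and that the response body is JSON. The helper names quality_url and fetch_quality are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the path pattern is inferred from it.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository given as 'owner/name'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_quality("vector-db", "AstraBert/ingest-anything") issues the same GET request as the curl command and counts against the daily request limit.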
Related tools
pixeltable/pixeltable
Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.
activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store,...
superlinked/VectorHub
VectorHub is a free, open-source learning website for people (software developers to senior ML...
hhblaze/DBreeze
C# .NET NOSQL ( key value, object store embedded TextSearch SemanticSearch Vector layer ) ACID...
nitaiaharoni1/vector-storage
Vector Storage is a vector database that enables semantic similarity searches on text documents...