lexy-ai/lexy
Data pipelines for AI applications
Provides document ingestion with configurable cloud storage (S3/GCS), task-based processing via Celery workers, and structured data extraction from unstructured content. Built as a containerized REST API with PostgreSQL persistence and optional embedding integration (OpenAI), accessible through a Python SDK or Swagger interface. Enables modular pipeline construction for RAG, agent context management, and document indexing workflows.
Available on PyPI.
Stars: 12
Forks: n/a
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 02, 2026
Monthly downloads: 33
Commits (30d): 0
Dependencies: 20
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/lexy-ai/lexy"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
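The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. It makes two assumptions that the listing does not confirm: that the path segments after `/quality/` are category, owner, and repository (inferred from the single URL shown), and that the endpoint returns a JSON body. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL following the pattern of the curl example.

    Assumption: the three path segments are category/owner/repo.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{parts[0]}/{parts[1]}/{parts[2]}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and decode the body.

    Assumption: the response is JSON; its schema is not documented here,
    so the decoded object is returned as-is.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(quality_url("embeddings", "lexy-ai", "lexy"))
```

Because unauthenticated access is limited to 100 requests/day, it is probably worth caching the result locally rather than calling `fetch_quality` repeatedly.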
Higher-rated alternatives
airweave-ai/airweave
Open-source context retrieval layer for AI agents
lotus-data/lotus
AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings....
similigh/simili-bot
AI-powered GitHub issue intelligence - semantic duplicate detection, cross-repo search, and...
superduper-io/superduper
Superduper: End-to-end framework for building custom AI applications and agents.
supabase/headless-vector-search
Supabase Toolkit to perform vector similarity search on your knowledge base embeddings.