AnswerDotAI/byaldi

Use late-interaction multi-modal models such as ColPali in just a few lines of code.

63 / 100 (Established)

Provides end-to-end multi-modal document retrieval using vision language models that encode both text and images. It indexes PDFs and images and ranks results with late interaction for semantic search. Built on the ColPali/ColQwen2 architectures, it retrieves at both the document and page level, with optional metadata tracking and base64 document storage. Its API follows RAGatouille's patterns, and it uses Flash Attention for GPU-accelerated encoding. Support for additional ColVLM models and HNSW indexing is planned.
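To make "late-interaction ranking" concrete, here is a minimal sketch of MaxSim scoring, the scoring rule generally associated with ColBERT-style models such as ColPali. It is not byaldi's code: the function names (`maxsim_score`, `rank_pages`) and the toy 2-dimensional vectors are assumptions for illustration only. Real models produce many high-dimensional vectors per query token and per page patch.

```python
from math import sqrt


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def maxsim_score(query_vecs, page_vecs):
    """Late-interaction score: for each query vector, take its best
    cosine similarity against any page vector, then sum those maxima."""
    page = [normalize(p) for p in page_vecs]
    total = 0.0
    for q in query_vecs:
        q = normalize(q)
        total += max(sum(a * b for a, b in zip(q, p)) for p in page)
    return total


def rank_pages(query_vecs, pages, k=3):
    """Return the top-k (page_id, score) pairs, highest score first.

    `pages` is a hypothetical mapping of page id -> list of patch vectors.
    """
    scored = [(pid, maxsim_score(query_vecs, vecs)) for pid, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy data: page "a" covers both query directions, page "b" only one.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "a": [[1.0, 0.0], [0.0, 1.0]],
    "b": [[1.0, 0.0], [1.0, 0.0]],
}
ranking = rank_pages(query, pages)
print(ranking)
```

Because each query vector is matched independently against every page vector, a page scores well only if it covers all parts of the query. That is why page "a" outranks page "b" in the toy data.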

844 stars and 3,709 monthly downloads. No commits in the last 6 months. Available on PyPI.

Stale 6m
Maintenance 0 / 25
Adoption 18 / 25
Maturity 25 / 25
Community 20 / 25

Stars: 844
Forks: 94
Language: Python
License: Apache-2.0
Last pushed: Jan 28, 2025
Monthly downloads: 3,709
Commits (30d): 0
Dependencies: 8

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/AnswerDotAI/byaldi"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
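The same request can be sketched in Python using only the standard library. The URL layout is taken from the curl command above. Treating `rag` as a category path segment and the response as JSON are both assumptions, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner, repo, category="rag"):
    """Build the endpoint URL; path segments are percent-encoded.

    Assumption: the path is /<category>/<owner>/<repo>, as in the curl example.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(owner, repo, category="rag", timeout=10):
    """Fetch the quality record and parse it.

    Assumption: the endpoint returns a JSON body; the schema is not
    documented here, so the result is returned as-is.
    """
    with urlopen(quality_url(owner, repo, category), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (performs a network request, counts against the daily limit):
#   data = fetch_quality("AnswerDotAI", "byaldi")
```

Keeping URL construction separate from the network call makes it easy to check the request you are about to send without spending any of the daily quota.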