bukosabino/justicio
Building an assistant for Boletin Oficial del Estado (BOE) using Retrieval Augmented Generation (RAG)
Embeds BOE documents into a vector database using Spanish-tuned sentence transformers, then retrieves semantically similar articles via approximate nearest neighbor search and passes them to an LLM as context. Built on FastAPI, LangChain, and Qdrant, with daily ETL pipelines that chunk documents, generate embeddings, and store metadata for hybrid search (semantic, keyword, or combined). Supports multiple similarity metrics. The project runs a free public service and can also be deployed as a self-hosted instance.
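The chunk, embed, and retrieve flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: `toy_embed` is a bag-of-words stand-in for a real sentence-transformer model, the in-memory list stands in for Qdrant, the search is exhaustive rather than approximate, and the two sample documents, the function names, and the chunk sizes are all invented for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    step = max_words - overlap
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def toy_embed(text):
    """Assumption: sparse bag-of-words vector, NOT a real sentence transformer."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """ETL step: chunk each document, embed each chunk, keep metadata with it."""
    index = []
    for doc in documents:
        for n, chunk in enumerate(chunk_text(doc["text"])):
            index.append({"doc_id": doc["id"], "chunk": n,
                          "text": chunk, "vector": toy_embed(chunk)})
    return index


def search(index, query, top_k=2):
    """Exact nearest-neighbor search; a vector database would use an ANN index."""
    q = toy_embed(query)
    scored = [(cosine(q, entry["vector"]), entry) for entry in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def build_prompt(question, hits):
    """Assemble retrieved chunks into context text for an LLM prompt."""
    context = "\n".join(f"[{h['doc_id']}] {h['text']}" for h in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Invented sample documents, for illustration only.
documents = [
    {"id": "doc-a", "text": "The decree regulates grants for renewable energy installations."},
    {"id": "doc-b", "text": "The order sets the calendar of public holidays for next year."},
]
index = build_index(documents)
hits = search(index, "grants for renewable energy", top_k=1)
prompt = build_prompt("Which grants exist for renewable energy?", hits)
```

Because the stand-in embedding only counts shared words, it behaves more like keyword matching than semantic search. A real deployment would swap `toy_embed` for a sentence-transformer model and `search` for a query against the vector database.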
138 stars. No commits in the last 6 months.
Stars: 138
Forks: 43
Language: HTML
License: MIT
Category:
Last pushed: Jul 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/bukosabino/justicio"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
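The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the `nlp` path segment is treated as a category parameter based only on the URL shape, and the response body is returned as raw text because its format is not documented here.

```python
import urllib.request

# Endpoint prefix taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (assumes the path is category/owner/repo)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Perform the GET request and return the raw response body as text."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("nlp", "bukosabino", "justicio")` issues the same request as the curl command and counts against the daily request limit.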
Related tools
MohammadForouhesh/tracking-policy-agendas
This repository contains the implementation for Tracking Legislators’ Expressed Policy Agendas...
RicardoMoya/NLP_with_Python
In this GitHub project you can find part of the material I use to teach the...
josejesusguzman/meetup-github-chatgpt-pln
Class on the fundamentals of natural language processing
teticio/aventuras-con-textos
Notebooks for classes in Spanish and English on cutting edge end-to-end NLP (Natural Language...
lgomezt/Aprendizaje_No_Supervisado
Introduction to Unsupervised Learning, in Spanish