PhantomInsights/mexican-government-report
Text mining on the 2019 Mexican Government Report, covering every step from extracting text from a PDF file to plotting the results.
Implements a complete ETL pipeline using **PyPDF2** for PDF text extraction with character encoding correction, **spaCy's Spanish NLP model** for tokenization and named entity recognition, and outputs structured CSV datasets for downstream analysis. Performs sentence-level sentiment analysis using a Spanish sentiment lexicon from Kaggle, then visualizes patterns through **matplotlib/seaborn** plots and geographic distributions via **geopandas**.
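As a rough illustration of the lexicon-based sentiment step described above, the following is a minimal sketch and not the repository's actual code. The names `TOY_LEXICON`, `split_sentences`, `score_sentence`, and `score_text` are hypothetical, the eight-word lexicon is invented for the example, and a regex splitter stands in for spaCy so that the snippet runs with the standard library alone.

```python
import re

# Hypothetical miniature lexicon (assumption for illustration only).
# A real lexicon would hold thousands of Spanish words.
# Scores: +1 for a positive word, -1 for a negative word.
TOY_LEXICON = {
    "bueno": 1, "mejor": 1, "crecimiento": 1, "paz": 1,
    "malo": -1, "peor": -1, "corrupción": -1, "violencia": -1,
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def score_sentence(sentence, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of every lowercased word in the sentence."""
    tokens = re.findall(r"\w+", sentence.lower())
    return sum(lexicon.get(tok, 0) for tok in tokens)


def score_text(text, lexicon=TOY_LEXICON):
    """Return a list of (sentence, score) pairs for the whole text."""
    return [(s, score_sentence(s, lexicon)) for s in split_sentences(text)]
```

Calling `score_text` on a paragraph yields one score per sentence; those pairs could then be written to CSV or plotted, in the spirit of the pipeline described above.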
476 stars. No commits in the last 6 months.
Stars: 476
Forks: 82
Language: Python
License: MIT
Category:
Last pushed: Jan 22, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/PhantomInsights/mexican-government-report"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
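The same request can be sketched in Python with the standard library. This is an illustration under assumptions: the helper names `build_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred only from the single URL shown above, and the response is assumed (not confirmed) to be JSON.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Assemble the endpoint URL (path layout inferred from one example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Request the record and parse it, assuming the body is JSON."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "PhantomInsights", "mexican-government-report")
```

With the keyless limit of 100 requests/day, caching the parsed result locally rather than re-fetching on every run is probably worthwhile.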
Related tools
AutoViML/featurewiz_polars
New Polars implementation of the classic featurewiz MRMR algorithm. Created by Ram Seshadri....
gyunggyung/National-Petition
Finding out what citizens think by analyzing Blue House national petitions 📈🔬
stdlib-js/datasets-sotu
State of the Union addresses by U.S. Presidents.
AndreCNF/polids
Analysis of electoral manifestos and output of it through apps.
NLP-UMUTeam/Spanish-PoliCorpus-2020
This dataset contains the code of the paper entitled Predicting Political Ideology from...