juliasilge/tidytext
Text mining using tidy tools
Converts text to a one-token-per-row data frame via `unnest_tokens()`, so the result can be analyzed directly with dplyr, tidyr, and ggplot2. Provides built-in sentiment lexicons and stop word lists, plus functions to convert between tidy formats and document-term matrices from the tm package. Tokenization is delegated to the tokenizers package and can split text by word, n-gram, sentence, or a custom regex.
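The workflow described above can be sketched in a few lines of R. This is a minimal illustration, not taken from the package's own documentation: the two-line corpus and the variable names (`docs`, `words`, `content_words`, `bigrams`, `scored`) are invented for the example, and the document-term-matrix step assumes the optional tm package is installed.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-line corpus, invented for illustration
docs <- tibble(
  line = 1:2,
  text = c("The quick brown fox jumps.", "The lazy dog sleeps all day.")
)

# One token per row: new column `word`, taken from input column `text`.
# By default tokens are lowercased and punctuation is stripped.
words <- docs %>% unnest_tokens(word, text)

# Remove common English stop words with the bundled `stop_words` data frame
content_words <- words %>% anti_join(stop_words, by = "word")

# Tokenize into bigrams instead of single words
bigrams <- docs %>% unnest_tokens(bigram, text, token = "ngrams", n = 2)

# Attach sentiment labels from the Bing lexicon; unmatched words are dropped
scored <- words %>% inner_join(get_sentiments("bing"), by = "word")

# Round trip to a tm DocumentTermMatrix and back (only if tm is available)
if (requireNamespace("tm", quietly = TRUE)) {
  dtm <- words %>%
    count(line, word) %>%
    cast_dtm(line, word, n)
  print(tidy(dtm))  # back to one row per document-term pair
}
```

Because every intermediate result is an ordinary data frame, each step can be piped straight into `count()`, `group_by()`, or a ggplot2 call without a conversion layer.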
Stars: 1,200
Forks: 183
Language: R
License: (not listed)
Category: (not listed)
Last pushed: Feb 21, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/juliasilge/tidytext"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
quanteda/quanteda
An R package for the Quantitative Analysis of Textual Data
keyATM/keyATM
An R package for Keyword Assisted Topic Models
gagolews/stringi
Fast and Portable Character String Processing in R (with the Unicode ICU)
ropensci/gutenbergr
Search, download, and process public domain texts from Project Gutenberg
irudnyts/openai
An R package-wrapper around OpenAI API