juliasilge/tidytext

Text mining using tidy tools

/ 100

Established

Converts text to one-token-per-row data frames via `unnest_tokens()`, so the output can be passed directly to dplyr, tidyr, and ggplot2 for analysis. Provides built-in sentiment lexicons and stopword lists, plus functions to convert between tidy formats and document-term matrices from the tm package. Uses the tokenizers package for tokenization at the word, n-gram, sentence, or custom regex level.
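As a rough illustration of the one-token-per-row workflow described above, here is a minimal sketch in R. The two-document corpus and the column names `doc_id` and `text` are invented for the example and are not taken from the package documentation. It assumes dplyr and tidytext are installed, and the conversion step runs only if tm is also available.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-document corpus, invented for illustration only
docs <- tibble(
  doc_id = c("a", "b"),
  text   = c("The quick brown fox jumps.",
             "The lazy dog sleeps; the fox runs.")
)

# One token (word) per row; by default tokens are lowercased
# and punctuation is stripped
tokens <- docs %>%
  unnest_tokens(output = word, input = text)

# Remove common English stopwords using the bundled stop_words data frame
content_words <- tokens %>%
  anti_join(stop_words, by = "word")

# Ordinary dplyr verbs now apply: count each word per document
counts <- content_words %>%
  count(doc_id, word, sort = TRUE)

# Same input tokenized as bigrams instead of single words
bigrams <- docs %>%
  unnest_tokens(output = bigram, input = text, token = "ngrams", n = 2)

# Convert the tidy counts to a tm DocumentTermMatrix, if tm is installed
if (requireNamespace("tm", quietly = TRUE)) {
  dtm <- counts %>%
    cast_dtm(document = doc_id, term = word, value = n)
}
```

Because every intermediate result is a plain data frame, each step can be inspected with `print()` or piped on to ggplot2 without a special corpus object.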

1,200 stars.

No Package · No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25


Stars: 1,200

Forks: 183

Language:

License: —

Category: text-analysis-frameworks

Last pushed: Feb 21, 2026

Commits (30d):


Text Analysis Frameworks · 80 tools

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/juliasilge/tidytext"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.

Related tools

quanteda/quanteda

An R package for the Quantitative Analysis of Textual Data

keyATM/keyATM

An R package for Keyword Assisted Topic Models

gagolews/stringi

Fast and Portable Character String Processing in R (with the Unicode ICU)

ropensci/gutenbergr

Search, download, and process public domain texts from Project Gutenberg

irudnyts/openai

An R package-wrapper around OpenAI API
