CornellNLP/ConvoKit

ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use of the toolkit on these datasets.

Score: 84 / 100 (Verified)

Built on a scikit-learn-compatible pipeline architecture, ConvoKit provides specialized modules for linguistic coordination analysis, politeness strategy detection, conversational hypergraph extraction, and neural forecasting of conversation outcomes (e.g., derailment prediction). The toolkit implements research-backed NLP features including utterance-level likelihood modeling, speaker linguistic diversity measurement, and pivotal moment identification, all operating on a unified Corpus data structure that standardizes metadata across diverse conversation sources.
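To make the fit/transform pipeline idea concrete, here is a minimal pure-Python sketch of feature extractors that annotate a shared corpus object in sequence. It does not use ConvoKit itself: `ToyUtterance`, `ToyCorpus`, `WordCountTransformer`, `GratitudeMarkerTransformer`, and `run_pipeline` are hypothetical names invented for illustration, and the keyword heuristic is a stand-in, not the toolkit's politeness model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyUtterance:
    """Hypothetical stand-in for a single conversational turn."""
    id: str
    speaker: str
    text: str
    meta: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyCorpus:
    """Hypothetical stand-in for a unified corpus of utterances."""
    utterances: List[ToyUtterance]


class WordCountTransformer:
    """Annotates each utterance with a whitespace-token count."""

    def fit(self, corpus: ToyCorpus) -> "WordCountTransformer":
        # Nothing to learn here; fit exists only for interface parity.
        return self

    def transform(self, corpus: ToyCorpus) -> ToyCorpus:
        for utt in corpus.utterances:
            utt.meta["word_count"] = len(utt.text.split())
        return corpus


class GratitudeMarkerTransformer:
    """Flags utterances containing a gratitude keyword (toy heuristic)."""

    KEYWORDS = ("thanks", "thank you")

    def fit(self, corpus: ToyCorpus) -> "GratitudeMarkerTransformer":
        return self

    def transform(self, corpus: ToyCorpus) -> ToyCorpus:
        for utt in corpus.utterances:
            lowered = utt.text.lower()
            utt.meta["has_gratitude"] = any(k in lowered for k in self.KEYWORDS)
        return corpus


def run_pipeline(corpus: ToyCorpus, steps: list) -> ToyCorpus:
    """Applies each step's fit and transform in order on the same corpus."""
    for step in steps:
        corpus = step.fit(corpus).transform(corpus)
    return corpus


corpus = ToyCorpus([
    ToyUtterance("u1", "alice", "Could you review my patch?"),
    ToyUtterance("u2", "bob", "Sure, thanks for sending it over."),
])
annotated = run_pipeline(
    corpus, [WordCountTransformer(), GratitudeMarkerTransformer()]
)
```

Each step writes its output into the utterance metadata rather than returning a separate feature matrix, so later steps can read what earlier steps produced from the same object.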

The project has 625 stars and 2,984 monthly downloads. It is actively maintained, with 5 commits in the last 30 days, and is available on PyPI.

Maintenance 16 / 25
Adoption 18 / 25
Maturity 25 / 25
Community 25 / 25


Stars: 625
Forks: 136
Language: Jupyter Notebook
License: MIT
Last pushed: Mar 13, 2026
Monthly downloads: 2,984
Commits (30d): 5
Dependencies: 24

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/CornellNLP/ConvoKit"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
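The curl request above can be mirrored in Python with only the standard library. This is a sketch under two assumptions not confirmed by the page: that the path segments after `/quality/` are category, owner, and repository (inferred from the example URL), and that the endpoint returns a JSON body. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Builds the endpoint URL; the segment order is inferred from the example."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetches the record for one repository (assumes a JSON response body)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "CornellNLP", "ConvoKit")` issues the same request as the curl command. No response field names are assumed here, so inspect the returned object before relying on any particular key.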