piskvorky/gensim
Topic Modelling for Humans
Implements memory-efficient, out-of-core processing of large corpora using Python generators and NumPy/BLAS backends, enabling algorithms like LDA, LSA, and word2vec to scale beyond available RAM. Provides a streaming architecture with pluggable transformation pipelines and distributed computing support for LSA/LDA across clusters. Built on vector space models for document indexing, similarity retrieval, and unsupervised text analysis in NLP and information retrieval workflows.
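The streaming idea in the description above can be illustrated with a small, library-free sketch. It does not call Gensim itself. The names `tokenize`, `build_vocab`, and `StreamingCorpus` are hypothetical and chosen only for illustration. The point is that a corpus can be a re-iterable object that yields one sparse bag-of-words vector at a time, so only the current document and the vocabulary are held in memory.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def tokenize(line: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return line.lower().split()


def build_vocab(lines: Iterable[str]) -> Dict[str, int]:
    """First streaming pass: give each distinct token an integer id."""
    vocab: Dict[str, int] = {}
    for line in lines:
        for token in tokenize(line):
            vocab.setdefault(token, len(vocab))
    return vocab


class StreamingCorpus:
    """Re-iterable corpus yielding one sparse (token_id, count) vector per document."""

    def __init__(self, source: Callable[[], Iterable[str]], vocab: Dict[str, int]):
        # `source` is a zero-argument callable returning a fresh iterable of
        # documents, so the corpus can be traversed several times (multi-pass
        # training) without keeping the documents in memory.
        self.source = source
        self.vocab = vocab

    def __iter__(self) -> Iterator[List[Tuple[int, int]]]:
        for line in self.source():
            counts = Counter(self.vocab[t] for t in tokenize(line) if t in self.vocab)
            yield sorted(counts.items())


# Toy data; in practice `source` would open a large file and yield its lines.
docs = ["human machine interface", "graph of trees", "graph minors trees graph"]
source = lambda: iter(docs)

vocab = build_vocab(source())
corpus = StreamingCorpus(source, vocab)
vectors = list(corpus)  # materialized here only to inspect the toy result
```

An iterable of this shape, one list of `(token_id, count)` pairs per document, is the kind of input a streaming model can consume document by document instead of requiring the whole matrix in RAM.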
Stars: 16,375
Forks: 4,410
Language: Python
License: LGPL-2.1
Category
Last pushed: Nov 01, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/piskvorky/gensim"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
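The same request can be sketched in Python with the standard library. Several things here are assumptions, not documented behavior: that the path follows a `category/owner/repo` pattern (inferred from the single URL shown above), that the response body is JSON, and the helper names `build_quality_url` and `fetch_quality`.

```python
import json
import urllib.request

# Base path copied from the curl example; treating it as a reusable prefix is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL, assuming a category/owner/repo path layout."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("ml-frameworks", "piskvorky", "gensim")
```

Because keyless access is limited to 100 requests/day, it is worth caching the result locally instead of calling `fetch_quality` repeatedly.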
Related frameworks
bigartm/bigartm
Fast topic modeling platform
vi3k6i5/GuidedLDA
Semi-supervised guided topic model with custom GuidedLDA
gregversteeg/corex_topic
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
microsoft/knowledge-extraction-recipes-forms
Knowledge Extraction For Forms Accelerators & Examples
centre-for-humanities-computing/tweetopic
Blazing fast topic modelling for short texts.