TatevKaren/data-science-popular-algorithms
Data science algorithms and topics you must know: (newly designed) Recommender Systems, Decision Trees, K-Means, LDA, RFM-Segmentation, and XGBoost, in Python, R, and Scala.
Implements collaborative filtering for movie recommendations using item-based nearest-neighbor matching on the MovieLens 20M dataset, alongside foundational algorithms like LDA for classification, K-means for unsupervised clustering, and decision trees for interpretable predictions. The repository pairs theoretical papers with multi-language implementations (Python, R, Scala) and includes a novel Cluster Dynamics algorithm that predicts customer migration between segments based on probabilistic class distributions. Each module combines mathematical foundations with practical case studies and step-by-step evaluation methodologies across recommendation systems, dimensionality reduction, and segmentation tasks.
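To make the item-based nearest-neighbor idea concrete, here is a minimal sketch in Python with NumPy. It is not the repository's implementation: the use of cosine similarity, the function names `item_similarity` and `recommend`, the parameters `k` and `top_n`, and the toy rating matrix (a stand-in for MovieLens data) are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are movies, 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 5, 1, 0],
    [1, 1, 0, 5, 4],
    [0, 1, 1, 4, 5],
], dtype=float)


def item_similarity(ratings):
    """Cosine similarity between item (column) rating vectors."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0          # guard against items nobody rated
    normalized = ratings / norms
    sim = normalized.T @ normalized
    np.fill_diagonal(sim, 0.0)       # an item is not its own neighbor
    return sim


def recommend(ratings, user, k=2, top_n=2):
    """Rank the user's unrated items by a similarity-weighted average
    of that user's ratings on the k most similar rated items."""
    sim = item_similarity(ratings)
    user_ratings = ratings[user]
    rated = np.flatnonzero(user_ratings)
    scores = {}
    for item in np.flatnonzero(user_ratings == 0):
        # The k items this user rated that are closest to the candidate item.
        neighbors = rated[np.argsort(sim[item, rated])[::-1][:k]]
        weights = sim[item, neighbors]
        if weights.sum() > 0:
            scores[int(item)] = float(
                weights @ user_ratings[neighbors] / weights.sum()
            )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend(ratings, user=0))
```

On a dataset the size of MovieLens 20M, a dense similarity matrix like this one would likely be replaced by sparse matrices and an approximate or truncated neighbor search; the sketch only shows the scoring logic.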
134 stars. No commits in the last 6 months.
Stars: 134
Forks: 39
Language: Jupyter Notebook
License: not specified
Category:
Last pushed: Dec 21, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/TatevKaren/data-science-popular-algorithms"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
SENATOROVAI/Normal-equation-solver-multiple-linear-regression-course
Multiple Linear Regression (MLR) models the linear relationship between a continuous dependent...
SENATOROVAI/Normal-equations-scalar-form-solver-simple-linear-regression-course
The normal equations for simple linear regression are a system of two linear equations used to...
SENATOROVAI/underfitting-overfitting-polynomial-regression-course
Underfitting and overfitting are critical concepts in machine learning, particularly when using...
stabgan/Multiple-Linear-Regression
Implementation of Multiple Linear Regression both in Python and R
andrescorrada/IntroductionToAlgebraicEvaluation
A collection of essays and code on algebraic methods to evaluate noisy judges on unlabeled test data.