endymecy/spark-ml-source-analysis
An analysis of the principles behind Spark ML algorithms, with walkthroughs of their source code implementations
Analyzes the algorithms and distributed implementations behind 30+ Spark ML algorithms across classification, clustering, dimensionality reduction, and feature engineering, pairing the mathematical foundations with walkthroughs of the Scala source code. Targets Spark 1.6.1 and 2.x, and documents optimization techniques (gradient descent, L-BFGS, NNLS) and tree-based ensemble methods. Organized as an educational reference that covers statistical foundations, collaborative filtering via ALS, and feature transformation pipelines through code commentary and formula derivations.
1,962 stars. No commits in the last 6 months.
Stars: 1,962
Forks: 821
Language: —
License: Apache-2.0
Category:
Last pushed: Mar 25, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/endymecy/spark-ml-source-analysis"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
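The same request can be made from a script. The following is a minimal Python sketch, not an official client. It assumes the endpoint follows the pattern `<base>/<category>/<owner>/<repo>`, which is inferred only from the single curl example above, and that the response body is JSON, which this page does not state. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the request URL for one repository.

    Assumption: the path layout is <category>/<owner>/<repo>, inferred
    from the one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository without an API key.

    Assumption: the endpoint returns a JSON body; the response schema
    is not documented here, so the parsed value is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "endymecy", "spark-ml-source-analysis")` would issue the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit, so a script that loops over many repositories may need a key.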
Related frameworks
lensacom/sparkit-learn
PySpark + Scikit-learn = Sparkit-learn
Angel-ML/angel
A Flexible and Powerful Parameter Server for large-scale machine learning
databricks/spark-sklearn
(Deprecated) Scikit-learn integration package for Apache Spark
kaiwaehner/kafka-streams-machine-learning-examples
This project contains examples which demonstrate how to deploy analytic models to...
alibaba/Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of...