Scikit-learn Pipelines ML Frameworks

End-to-end ML pipeline implementations using scikit-learn, focusing on workflow orchestration, preprocessing integration, and production-ready pipeline patterns. Does NOT include general ML tutorials, dataset collections, or frameworks that don't emphasize pipeline construction.

34 scikit-learn pipeline frameworks are tracked. 2 score 70 or above (verified tier). The highest-rated is scverse/anndata at 88/100 with 720 stars and 2,610,474 monthly downloads. 1 of the top 10 is actively maintained.

Get all 34 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=scikit-learn-pipelines&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
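The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the shape of the JSON response is not documented on this page, so the sketch returns the parsed payload without assuming any field names. The default `limit=20` is copied from the curl command; whether a larger value returns all 34 projects is an assumption to verify against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="ml-frameworks",
                      subcategory="scikit-learn-pipelines",
                      limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=10):
    """Fetch the URL and return the decoded JSON payload (schema not assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality(build_quality_url())` performs the unauthenticated request, which counts against the 100 requests/day allowance mentioned above.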

| # | Framework | Description | Score | Tier |
|---|-----------|-------------|-------|------|
| 1 | scverse/anndata | Annotated data. | 88 | Verified |
| 2 | koaning/scikit-lego | Extra blocks for scikit-learn pipelines. | 70 | Verified |
| 3 | googleapis/python-bigquery-dataframes | BigQuery DataFrames (also known as BigFrames) | 55 | Established |
| 4 | bigmlcom/python | Python bindings for BigML.io | 46 | Emerging |
| 5 | posit-dev/orbital | Turn SciKitLearn pipelines into SQL | 44 | Emerging |
| 6 | mindsdb/type_infer | Type inference for Machine Learning pipelines | 39 | Emerging |
| 7 | getyourguide/DDataFlow | A tool to help you test and develop pyspark code with sampled and local data | 39 | Emerging |
| 8 | ibis-project/ibis-ml | IbisML is a library for building scalable ML pipelines using Ibis. | 35 | Emerging |
| 9 | maximtrp/tmplot | Visualization of Topic Modeling Results | 35 | Emerging |
| 10 | data-science-lab-amsterdam/skippa | SciKIt-learn Pipeline in PAndas | 33 | Emerging |
| 11 | m-nanda/End-to-End-ML | An "End-to-End Machine Learning" project focuses on building a machine... | 32 | Emerging |
| 12 | iMaatin/AutoStats | A library for automatically cleaning, imputing and analyzing datasets with... | 27 | Experimental |
| 13 | layerai-archive/dbt-layer | Layer DBT Adapters | 23 | Experimental |
| 14 | galafis/distributed-data-processing-pipeline | Enterprise-grade distributed data processing pipeline with Apache Spark... | 23 | Experimental |
| 15 | galafis/feature-store-engineering | Feature Store Engineering - Professional Python project | 23 | Experimental |
| 16 | galafis/python-ml-pipeline-complete | Data Science project - python-ml-pipeline-complete | 23 | Experimental |
| 17 | joekakone/db-analytics-tools | Databases Analytics Tools - Data Integration - Data Visualization - Machine Learning | 22 | Experimental |
| 18 | zluvsand/ml_pipeline | ⚡ Sample code for machine Learning Pipeline with Scikit-learn ⚡ | 22 | Experimental |
| 19 | miheo-al2/sklearn-selector-pipeline | 🔧 Combine feature selectors with classifiers and regressors in a seamless... | 22 | Experimental |
| 20 | spen-c/ml-portfolio | Machine learning projects built on a modular, config-driven framework... | 22 | Experimental |
| 21 | SathyaPrakashD/ml-pipeline-fundamentals | End-to-end scikit-learn ML pipelines across 6 datasets — classification,... | 22 | Experimental |
| 22 | hiazevedo/databricks-portfolio | Portfolio of hands-on Data Engineering and ML projects with Databricks —... | 22 | Experimental |
| 23 | elisim/hydra-sklearn-pipelines | Code accompanying the blogpost: "Creating Configurable Data Pre-Processing... | 20 | Experimental |
| 24 | galafis/Machine-Learning-Pipeline | Professional project by Gabriel Demetrios Lafis | 20 | Experimental |
| 25 | adamduval/ml_snowflake_end_to_end | ❄️ End to End ML workflow in Snowflake. | 16 | Experimental |
| 26 | harshraithatha/MLproject | An end-to-end machine learning pipeline for wholesale customer channel... | 14 | Experimental |
| 27 | oumaimabnz/python-data-processing-pipeline | End-to-end Python data processing pipeline for cleaning, analyzing, and... | 14 | Experimental |
| 28 | Montasir00/Ml_final_project | End-to-End process of building machine learning models | 14 | Experimental |
| 29 | chrislemke/sk-transformers | A collection of pandas & scikit-learn compatible transformers for... | 14 | Experimental |
| 30 | Fugant1/ml-model-factory | Automated ML pipeline | 13 | Experimental |
| 31 | Vidhi1290/Machine-learning-Pipeline | Explore a collection of Jupyter notebooks that guide you through various... | 12 | Experimental |
| 32 | muhammadhussain-2009/Machine-Learning-Pipeline- | Pipeline Designed to Simplify Complexities of Building ML Models | 11 | Experimental |
| 33 | Shashank911/-End-to-End-Machine-Learning-Pipeline | The objective of this task is to build an end-to-end machine learning... | 11 | Experimental |
| 34 | aabouzaid/modern-data-platform-research-paper | The resources for the peer-reviewed paper Building A Modern Data Platform... | 10 | Experimental |