DataTalksClub/data-engineering-zoomcamp

Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026.

Score: 59 / 100 (Established)

The curriculum covers the full data engineering stack: containerization and infrastructure-as-code (Docker, Terraform, GCP), workflow orchestration (Kestra), data warehousing (BigQuery), analytics engineering (dbt), and streaming systems (Kafka, KSQL). Hands-on modules also use industry tools such as Apache Spark, dlt for data ingestion, and Bruin for end-to-end pipelines. Students build a real-world, peer-reviewed final project that reinforces batch processing, partitioning strategies, schema management, and deployment to cloud platforms. The course assumes only basic coding and SQL knowledge while working throughout with modern production data platforms.

39,193 stars. Actively maintained with 4 commits in the last 30 days.

No license, no package, no dependents.
Maintenance 16 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 25 / 25
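
The four category scores listed here add up to the overall score shown on this page. The sketch below checks that arithmetic, assuming the total is a plain sum of the categories (the page does not state the formula, so treat that as an assumption rather than the documented method).

```python
# Category scores as listed on this page, each out of 25.
scores = {"Maintenance": 16, "Adoption": 10, "Maturity": 8, "Community": 25}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the simple sum of the four categories.
total = sum(scores.values())
maximum = MAX_PER_CATEGORY * len(scores)

print(f"{total} / {maximum}")  # prints "59 / 100"
```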


Stars: 39,193
Forks: 7,884
Language: Jupyter Notebook
License: none
Last pushed: Mar 19, 2026
Commits (30d): 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/DataTalksClub/data-engineering-zoomcamp"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
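
For scripted use, the same request can be made from Python with only the standard library. This is a minimal sketch, not the service's documented client: the helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repository path layout is inferred from the single URL in the curl example, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, as in the curl example.
    """
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and decode the body, assuming it is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("data-engineering", "DataTalksClub", "data-engineering-zoomcamp")` performs a real network request, so each call counts against the daily request limit. Because the response fields are not documented on this page, inspect the returned object before relying on any particular key.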