All Data Engineering Tools
517 tools ranked by quality score · Page 5 of 6
| # | Tool | Score | Tier |
|---|---|---|---|
| 401 |
calbergs/spotify-api
Pipeline that extracts data from the Spotify API to build a more detailed... |
|
Emerging |
| 402 |
turbot/steampipe-plugin-virustotal
Use SQL to instantly query file, domain, URL and IP scanning results from VirusTotal. |
|
Emerging |
| 403 |
prefeitura-rio/pipelines_rj_smtr
Code for capturing and processing SMTR data |
|
Emerging |
| 404 |
tenzir/library
Packages for the Tenzir ecosystem. |
|
Emerging |
| 405 |
SentryPeer/SentryPeerHQ
Fraud Detection for VoIP. Use SentryPeer® HQ to help prevent VoIP... |
|
Emerging |
| 406 |
mlr-org/mlr3db
Data backends to let mlr3 work transparently with (remote) databases |
|
Emerging |
| 407 |
bytehub-ai/bytehub
ByteHub: making feature stores simple |
|
Emerging |
| 408 |
tushar2704/SQL-Portfolio
Collection of personal SQL projects and queries I've worked on, showcasing... |
|
Emerging |
| 409 |
mbari-org/aidata
(ETL) Extract, transform, load/download and augment images and annotations... |
|
Emerging |
| 410 |
BigData-Ananlysiser/UGC-Analysiser
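An open-source full-stack big data project, mainly covering real-time data collection, machine learning, big data processing, and front-end visualization |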
|
Emerging |
| 411 |
Paulescu/bytewax-hopsworks-example
Compute and store real-time features for crypto trading using Bytewax (stream... |
|
Emerging |
| 412 |
thinkall/featcopilot
Next-generation LLM-powered auto feature engineering framework |
|
Emerging |
| 413 |
pkochanowicz/n8n-setup-docker
Fast, safe and smart setup for self-hosted n8n placed in a Docker container,... |
|
Emerging |
| 414 |
moj-analytical-services/iam_builder
Little helper to write IAM policies |
|
Emerging |
| 415 |
jtakish/airflow-provider-sap-hana
Airflow provider package for SAP HANA |
|
Emerging |
| 416 |
GSA/coe-hud-acquisitions
A repository that contains links and information for acquisitions and... |
|
Experimental |
| 417 |
AmirhosseinHonardoust/Data-Storytelling-Dashboard
A fully interactive data storytelling dashboard for e-commerce analytics.... |
|
Experimental |
| 418 |
IgorNatann/project_e_commerce_dw
E-commerce data warehouse (Kimball/star schema) on SQL Server, with scripts, data... |
|
Experimental |
| 419 |
runprism/prism
Prism is the easiest way to develop, orchestrate, and execute data pipelines... |
|
Experimental |
| 420 |
apache/seatunnel-tools
SeaTunnel is a multimodal, high-performance, distributed, massive data... |
|
Experimental |
| 421 |
bruin-data/setup-bruin
Official action to install Bruin CLI in GitHub Actions. |
|
Experimental |
| 422 |
cderickson/Mox-Data.com
Mox-Data.com is a cloud-based data ingestion tool used to process raw data... |
|
Experimental |
| 423 |
TJAdryan/astro_blog
This site uses the amazing Astro.build project. I added **Google Docs** ... |
|
Experimental |
| 424 |
peter115342/soccer-tracker-DE-project
End-To-End Data Engineering Project. Made to learn some common data... |
|
Experimental |
| 425 |
richban/opendata-stack-platform
Open Data Stack Platform: a collection of projects and pipelines built with... |
|
Experimental |
| 426 |
vnvo/deltaforge
A versatile, high-performance Change Data Capture (CDC) engine built in... |
|
Experimental |
| 427 |
turbot/steampipe-plugin-imap
Use SQL to instantly query mailboxes, messages and more using IMAP. Open... |
|
Experimental |
| 428 |
eventvisor/eventvisor
Fine-grained control over analytics events and logs via remote configuration |
|
Experimental |
| 429 |
lezwon/CatalystOps
Semantic cost-linting and performance warnings extension for Databricks in VS Code |
|
Experimental |
| 430 |
turbot/steampipe-plugin-openapi
Use SQL to instantly query resources from OpenAPI. Open source CLI. No DB required. |
|
Experimental |
| 431 |
Hyperwindmill/morphql
Transform data with queries |
|
Experimental |
| 432 |
Mindbaz/python-gpostmaster-domains-datas
Downloads and flattens data from Google Postmaster Tools (GPT) |
|
Experimental |
| 433 |
turbot/steampipe-plugin-digitalocean
Use SQL to instantly query droplets, VPCs, users and more from DigitalOcean.... |
|
Experimental |
| 434 |
SourceWatcher/source-watcher-core
PHP ETL engine with pluggable steps: extractors, transformers, loaders |
|
Experimental |
| 435 |
TheCocoTeam/source-watcher-core
PHP ETL engine for building extract–transform–load pipelines with pluggable... |
|
Experimental |
| 436 |
tvs-sde/oxford-omop-data-mapper
A documentation-centric DuckDB based ETL tool, implementing transformations... |
|
Experimental |
| 437 |
sopho-tech/sopho
Open Source Business Intelligence |
|
Experimental |
| 438 |
MTSWebServices/horizon
Simple HWM Store backend |
|
Experimental |
| 439 |
turbot/steampipe-plugin-supabase
Use SQL to instantly query Supabase resources. Open source CLI. No DB required. |
|
Experimental |
| 440 |
turbot/steampipe-plugin-docker
Use SQL to instantly query Dockerfile commands and more from Docker. Open... |
|
Experimental |
| 441 |
turbot/steampipe-plugin-namecheap
Use SQL to instantly query Namecheap for domains, DNS host records & more.... |
|
Experimental |
| 442 |
turbot/steampipe-plugin-ibm
Use SQL to instantly query instances, networks, users and more from IBM... |
|
Experimental |
| 443 |
turbot/steampipe-plugin-jumpcloud
Use SQL to instantly query resources from JumpCloud. Open source CLI. No DB required. |
|
Experimental |
| 444 |
turbot/steampipe-plugin-linode
Use SQL to instantly query instances, domains and more from Linode. Open... |
|
Experimental |
| 445 |
turbot/steampipe-plugin-onepassword
Use SQL to instantly query 1Password vaults, items, files & more. Open... |
|
Experimental |
| 446 |
everycure-org/kedro-argo
argo-kedro is a kedro-plugin for executing Kedro pipelines on Argo Workflows. |
|
Experimental |
| 447 |
MTSWebServices/etl-entities
Basic ETL Entity classes for onETL |
|
Experimental |
| 448 |
sul-dlss/libsys-airflow
Airflow DAGs for migrating and managing ILS data into FOLIO along with other... |
|
Experimental |
| 449 |
lyrasis/kiba-extend
Extensions to Kiba ETL |
|
Experimental |
| 450 |
illuin-tech/data-pipeline
Library for describing data transformation pipelines by compositing simple... |
|
Experimental |
| 451 |
tracebloc/data-ingestors
tracebloc data pipeline for training/test dataset setup |
|
Experimental |
| 452 |
tarek-clarke/resilient-rap-framework
A resilient, fault‑tolerant telemetry analytics pipeline designed to... |
|
Experimental |
| 453 |
edwinweber/dbt_duckdb_demo_public
Data engineering demo project for Danish Parliament (Folketing) open data —... |
|
Experimental |
| 454 |
neo-technology-field/python-etl-lib
simple lib of ETL building blocks |
|
Experimental |
| 455 |
chayansraj/Python-ETL-pipeline-using-Airflow-on-AWS
This project demonstrates how to build and automate an ETL pipeline written... |
|
Experimental |
| 456 |
nvisycom/runtime
Enterprise-grade multimodal redaction runtime that detects and removes... |
|
Experimental |
| 457 |
zovchik0v/task-management
🛠️ Streamline task management with this full-stack solution featuring... |
|
Experimental |
| 458 |
KasperOmsK/pipefn
pipefn is a Go library for building lazy, functional, and composable... |
|
Experimental |
| 459 |
turbot/steampipe-plugin-aiven
Use SQL to instantly query Aiven accounts, projects, teams, users & more.... |
|
Experimental |
| 460 |
turbot/steampipe-plugin-trello
Use SQL to instantly query Trello organizations, boards, members,... |
|
Experimental |
| 461 |
turbot/steampipe-plugin-env0
Use SQL to instantly query env0 resources. Open source CLI. No DB required. |
|
Experimental |
| 462 |
turbot/steampipe-plugin-heroku
Use SQL to instantly query apps, dynos and more from Heroku. Open source... |
|
Experimental |
| 463 |
turbot/steampipe-plugin-fly
Use SQL to instantly query fly.io resources. Open source CLI. No DB required. |
|
Experimental |
| 464 |
turbot/steampipe-plugin-fastly
Use SQL to instantly query services, ACLs and more from Fastly. Open source... |
|
Experimental |
| 465 |
turbot/steampipe-plugin-urlscan
Use SQL to instantly query urlscan.io. Open source CLI. No DB required. |
|
Experimental |
| 466 |
turbot/steampipe-plugin-updown
Use SQL to instantly query status (e.g. checks, downtimes) from updown.io.... |
|
Experimental |
| 467 |
turbot/steampipe-plugin-awscfn
Use SQL to instantly query resources, data sources and more from AWS... |
|
Experimental |
| 468 |
tbrus/smartjoin
Deterministic key and join discovery for structured datasets |
|
Experimental |
| 469 |
qweliant/ankaa
POC for real-time monitoring and alert system for home hemodialysis,... |
|
Experimental |
| 470 |
turbot/steampipe-plugin-panos
Use SQL to instantly query PAN-OS firewalls, security policies & more. Open... |
|
Experimental |
| 471 |
turbot/steampipe-plugin-newrelic
Use SQL to instantly query alerts, events, and more from New Relic. Open... |
|
Experimental |
| 472 |
turbot/steampipe-plugin-planetscale
Use SQL to instantly query PlanetScale databases, branches and more. Open... |
|
Experimental |
| 473 |
turbot/steampipe-plugin-mailchimp
Use SQL to instantly query Mailchimp marketing data. Open source CLI. No DB required. |
|
Experimental |
| 474 |
turbot/steampipe-plugin-vercel
Use SQL to instantly query projects, teams, domains and more from Vercel.... |
|
Experimental |
| 475 |
turbot/steampipe-plugin-splunk
Use SQL to instantly query logs, indexes, apps and more from Splunk. Open source... |
|
Experimental |
| 476 |
turbot/steampipe-plugin-pipes
Use SQL to instantly query Turbot Pipes resources across workspaces. Open... |
|
Experimental |
| 477 |
nicopon/dtpipe
A simple, self-contained CLI for performance-focused data streaming & anonymization. |
|
Experimental |
| 478 |
faltz009/Closure-SDK
A hash you can do algebra on — composable verification for ordered data over... |
|
Experimental |
| 479 |
vishnuvardhanaan/equity-fundamental-engine
Production-style financial data engineering pipeline that standardizes NSE... |
|
Experimental |
| 480 |
alireza-heidarii/Real-Time-Data-Cleaning-Pipeline-for-Medical-and-Healthcare-Data
A real-time data cleaning pipeline for medical and healthcare data using... |
|
Experimental |
| 481 |
vishnuvardhanaan/equity-fundamental-analytics
Macro-aware, explainable equity analytics system using Bronze–Silver–Gold... |
|
Experimental |
| 482 |
RaySatish/Market-Surveillance-System
Big-data pipeline detecting wash trading, pump & dump, and spoofing in trade... |
|
Experimental |
| 483 |
pablo-reyes8/colombia-tourism-ml-forecasting
ML project forecasting monthly foreign tourist arrivals in Colombian cities... |
|
Experimental |
| 484 |
elevata-labs/elevata
elevata is an Architecture Runtime for modern data platforms —... |
|
Experimental |
| 485 |
adhamhaithameid/Classroom-Quick-Downloader
A sophisticated cross-browser extension for bulk Google Classroom downloads,... |
|
Experimental |
| 486 |
Galaticos-API/API-3
API project for the first semester of 2026 |
|
Experimental |
| 487 |
MTSWebServices/horizon-hwm-store
Horizon HWM Store for onETL |
|
Experimental |
| 488 |
idlab-discover/RustiFlow
Flow feature extraction tool built in Rust using eBPF |
|
Experimental |
| 489 |
tosh2230/stairlight
A data lineage tool that detects table dependencies from rendered SQL statements. |
|
Experimental |
| 490 |
fishstormX/fishmaple
Personal website https://www.fishmaple.cn |
|
Experimental |
| 491 |
BirdiD/BirdiDQ
BirdiDQ leverages the power of the Python Great Expectations open-source... |
|
Experimental |
| 492 |
Wazzabeee/pyspark-etl-twitter
Implementation of an ETL process for real-time sentiment analysis of tweets... |
|
Experimental |
| 493 |
AmirhosseinHonardoust/Market-IQ
MarketIQ is a full-stack Streamlit + SQL + Prophet dashboard for real-time... |
|
Experimental |
| 494 |
NileDB/com.niledb.core
Open-source Data Backend written in Java and based on PostgreSQL & GraphQL. |
|
Experimental |
| 495 |
pmutua/drf_csv_xlsx_file_upload
Demo Django (Django REST Framework) API that uploads .csv/.xlsx for bulk data,... |
|
Experimental |
| 496 |
AmirhosseinHonardoust/Beyond-Charts-Interactive-Storytelling
A comprehensive guide and codebase for building interactive storytelling... |
|
Experimental |
| 497 |
contriboss/no_fly_list
A flexible, high-performance tagging system for Rails applications with... |
|
Experimental |
| 498 |
MaxHalford/tuna
:fish: A streaming ETL for fish |
|
Experimental |
| 499 |
ThinkThinkAI/ThinkDB
ThinkDB is an easy-to-use SQL client that makes working with your databases... |
|
Experimental |
| 500 |
aymane-maghouti/Big-Data-Project
This project aims to predict smartphone prices using a combination of batch... |
|
Experimental |