All Data Engineering Tools

1,297 tools ranked by quality score · Page 7 of 13

Showing 601–700 of 1,297
# Tool Score Tier
601 vishnuvardhanaan/equity-fundamental-engine

Production-style financial data engineering pipeline that standardizes NSE...

29
Experimental
602 jack-tol/usda-food-data-pipeline

Code for the USDA Branded Food Dataset pipeline and the USDA Food Assistant....

29
Experimental
603 JulieGibbs/greenback-java

Java library to build modern applications with high-def itemized financial...

29
Experimental
604 lezwon/CatalystOps

Semantic cost-linting and performance warnings extension for Databricks in VS Code

29
Experimental
605 Hyperwindmill/morphql

Transform data with queries

29
Experimental
606 Wazzabeee/pyspark-etl-twitter

Implementation of an ETL process for real-time sentiment analysis of tweets...

29
Experimental
607 BirdiD/BirdiDQ

BirdiDQ leverages the power of the Python Great Expectations open-source...

29
Experimental
608 dhchenx/Catla-HS

Catla for Hadoop and Spark (Catla-HS): An open-source system to support...

29
Experimental
609 galafis/distributed-data-processing-pipeline

Enterprise-grade distributed data processing pipeline with Apache Spark...

29
Experimental
610 Xenios91/Byte-Chomp

A Golang tool for obtaining data on Golang binaries in csv format

29
Experimental
611 aduranil/personal-finance-frontend

personal finance mint.com-like site

29
Experimental
612 victorlopes2000/retail-intelligence-platform

🛍️ Analyze retail data with our platform, scraping insights from major...

29
Experimental
613 dannyb2018/CPTADataProviderAPI

This is core library for financial data access components in Apache Nifi

29
Experimental
614 andrejanesic/Spark-News-Stock-Market-Prediction

Data science and Spark applied to 7 hypotheses regarding the DJIA stock...

29
Experimental
615 18F/bpa-disaster-data-portal-pilot

The scope of this task is to build a working pilot of a portal that collects...

29
Experimental
616 betoalien/PardoX

PardoX: The Hyper-Fast Data Engine

29
Experimental
617 FurkAlb/Global-Power-Plant-Analysis

Global Power Plant Database Analysis is a Streamlit-based interactive web...

29
Experimental
618 calbergs/spotify-api

Pipeline that extracts data from the Spotify API to build a more detailed...

29
Experimental
619 mzafram2001/football-database-fver

⚽ Football database. Ideal for machine learning, betting and analytics. 📂...

29
Experimental
620 adhamhaithameid/Classroom-Quick-Downloader

A sophisticated cross-browser extension for bulk Google Classroom downloads,...

29
Experimental
621 nodef/extra-pg-english

Converts English query to Informal/Format SQL SELECT.

28
Experimental
622 theBlackfish01/FiberWatchCLI

CLI based interface for Optical Fiber Fault Detection, Diagnosis, and...

28
Experimental
623 ProjectXero/dbds

DBDataSource (dbds) is primarily a lightweight PostgreSQL-backed dataSource...

28
Experimental
624 maengsanha/bigdata

KMU CS Hot Topics in Big Data

28
Experimental
625 contriboss/no_fly_list

A flexible, high-performance tagging system for Rails applications with...

28
Experimental
626 Rakshan-kulkarni/Rakshan-Finance-Tracker

Rakshan/Finance Tracker

28
Experimental
627 NileDB/com.niledb.core

Open-source Data Backend written in Java and based on PostgreSQL & GraphQL.

28
Experimental
628 pmutua/drf_csv_xlsx_file_upload

Demo Django (Django Rest Framework) API uploads .csv/.xlsx for bulk data,...

28
Experimental
629 Surya-Hariharan/ESG-Sustainability-Analysis

Full-stack ESG analytics dashboard for S&P 500 companies with FastAPI,...

28
Experimental
630 abhiram-ar/humane-backend

Event-driven microservices backend for Humane, a behavior-rewarding social...

28
Experimental
631 AlvaroCavalcante/airflow-calendar-plugin

A Google Calendar-style plugin to improve your DAG management with a visual schedule

28
Experimental
632 elevata-labs/elevata

elevata is an Architecture Runtime for modern data platforms -...

28
Experimental
633 faltz009/Closure-SDK

A hash you can do algebra on: composable verification for ordered data over...

27
Experimental
634 nvisycom/runtime

Enterprise-grade multimodal redaction runtime that detects and removes...

27
Experimental
635 1712n/dedup-service

A high-performance service designed to eliminate duplicate and...

27
Experimental
636 nicopon/dtpipe

A simple, self-contained CLI for performance-focused data streaming & anonymization.

27
Experimental
637 Aniket-16-S/Product-Scraper

Scraping products from well-known e-commerce sites like Amazon, Flipkart and...

27
Experimental
638 MaxHalford/tuna

🐟 A streaming ETL for fish

27
Experimental
639 Galaticos-API/API-3

API project for the first semester of 2026

27
Experimental
640 RealAlexandreAI/io-sankey

🧶 Framework for IO mapping and validation across heterogeneous data.

27
Experimental
641 ThinkThinkAI/ThinkDB

ThinkDB is an easy-to-use SQL client that makes working with your databases...

27
Experimental
642 hummer-team/vault77

LLM, DuckDB, Excel, CSV, Data Analysis

27
Experimental
643 aymane-maghouti/Big-Data-Project

This project aims to predict smartphone prices using a combination of batch...

27
Experimental
644 hariketsheth/BlockChain_FinTech

Cash flow is one of the most critical aspects of the supply chain, and it...

27
Experimental
645 vishnuvardhanaan/equity-fundamental-analytics

Macro-aware, explainable equity analytics system using Bronze–Silver–Gold...

27
Experimental
646 tbrus/smartjoin

Deterministic key and join discovery for structured datasets

27
Experimental
647 KasperOmsK/pipefn

pipefn is a Go library for building lazy, functional, and composable...

27
Experimental
648 joaopn/social-data-pipeline

Pipeline for processing, classifying, and ingesting large-scale social data

27
Experimental
649 liuweizhenhaoa/summer

Java

27
Experimental
650 TheCocoTeam/source-watcher-core

PHP ETL engine for building extract–transform–load pipelines with pluggable...

27
Experimental
651 ReinerCPrecillas/Peek

📊 Monitor your macOS network in real-time with Peek: get instant insights on...

27
Experimental
652 nitish9413/open_auto_loader

OpenAutoLoader: A lightweight, open-source alternative to Databricks Auto...

27
Experimental
653 ankman007/cricket-statsguru

Streamlit-based Nepali cricket visualization dashboard that utilizes python...

27
Experimental
654 feitasIoT/CRose

CRose (China...

27
Experimental
655 edwinweber/dbt_duckdb_demo_public

Data engineering demo project for Danish Parliament (Folketing) open data -...

26
Experimental
656 raphaelberly/journal

A movie journal coupled with open IMDb data, and a Flask web-app for easy...

26
Experimental
657 RaySatish/Market-Surveillance-System

Big-data pipeline detecting wash trading, pump & dump, and spoofing in trade...

26
Experimental
658 AhmedMaghawry/SPOFI

Spotfire is a crowd-sourcing tool that can support real-time detection and...

26
Experimental
659 cobluestars/dataherd-raika

"Dataherd-Raika is a library designed to simulate large-scale user behavior...

26
Experimental
660 COS301-SE-2021/Integrated-Data-Intelligence-Suite

The Integrated Data Intelligence Suite is a data-collection and data-mining...

26
Experimental
661 SermetPekin/evdschat

evdschat is an open-source Python package designed to enhance the evdspy...

26
Experimental
662 0xjgv/inconnu

Data privacy tool, for fast & thorough anonymization/pseudonymization, easy...

26
Experimental
663 zatarain/crm-dupkiller

CRM DupKiller - Hack Night @ Cloudflare 2025 ft. Fiberplane, Claude, Elevenlabs

26
Experimental
664 galafis/data-mesh-implementation-framework

Data Mesh concepts in Python - Data Products with schema validation, CRUD,...

26
Experimental
665 yamtimor/BirdLane

Kotlin DSL for expressive, code-first data pipelines, inspired by jazz.

26
Experimental
666 DonkeyKing01/EV-PM-DSS

Prototype decision-support dashboard built on the SCSI-SLM EV design insight...

26
Experimental
667 ps982182/AI-Business-Insights-Dashboard

AI-powered sales analytics dashboard built with Streamlit that generates...

25
Experimental
668 kholdrex/code_to_query

Ask for data in plain English; get validated, parameterized SQL with guardrails.

25
Experimental
669 AmirhosseinHonardoust/Market-IQ

MarketIQ is a full-stack Streamlit + SQL + Prophet dashboard for real-time...

25
Experimental
670 anjanicoder/Lok-Sabha-Election-Analysis

This project focuses on analyzing the Lok Sabha Election data of India. The...

25
Experimental
671 zsoltmester/anomaly-detector

Detect anomalies in call detail records.

25
Experimental
672 Codex56799/dataengineering

🚀 Build a containerized data engineering workflow for NYC Yellow Taxi Trip...

25
Experimental
673 salimt/Transfermarkt-ETL-and-LIVE-Scores

asyncIO, Github Actions, GCP, dbt, Terraform, Docker

25
Experimental
674 AmirhosseinHonardoust/Beyond-Charts-Interactive-Storytelling

A comprehensive guide and codebase for building interactive storytelling...

25
Experimental
675 HatiOS-AI/HatiData-SDKs

Local-first data warehouse for AI agents. Write Snowflake-compatible SQL,...

25
Experimental
676 MostafaSensei106/FP-Growth

A high-performance Dart library for FP-Growth algorithm and association rule...

25
Experimental
677 TheoV823/cannabis-price-index

Open-source methodology, SQL, and sample data for a Cannabis Price Index....

25
Experimental
678 pandabear-neil/microsoft_fabric_mods

Code Snippets, Designs, and other things about building a Data Analytics...

24
Experimental
679 abdullahqaisar/sehatchain

SehatChain, an AI and Blockchain powered tool for researchers and healthcare...

24
Experimental
680 supaglue-labs/typescript-syncer

Quickly sync your customers' CRM data to various destinations

24
Experimental
681 tanmaytanmay47/brazilian-ecommerce-data-warehouse

📊 Analyze Brazilian e-commerce data with this complete Business Intelligence...

24
Experimental
682 developmentseed/skynet-scrub-server

Backing store for developmentseed/skynet-scrub

24
Experimental
683 eduardocornelsen/full-funnel-ai-analytics

Full-Funnel AI Marketing Analytics. A modern data stack powered by dbt...

24
Experimental
684 BlackRoad-Forge/RoadHailoVision

BlackRoad Forge - hailo vision - BlackRoad Forge. Enhanced developer tools...

24
Experimental
685 erangi/podcasts

The list of podcasts I listen to

24
Experimental
686 Skeyelab/Zendesk-Data-Collector

Rails ETL for Zendesk: collects and syncs Zendesk ticket data into PostgreSQL

24
Experimental
687 cypherpunk-symposium/blockchain-data-engineering-toolkit

👾 blockchain infrastructure projects and resources (e.g., ethereum event...

24
Experimental
688 cyclonite69/shadowcheck-web

ShadowCheck SIGINT Forensics Platform - Real-time wireless network analysis

24
Experimental
689 formeo/igaming-platform

iGaming Platform Core: Wallet Service, Bonus Engine & ML-powered Fraud...

24
Experimental
690 xxxsleepygamerxxx/directly

🚀 Accelerate your browsing with Directly, a Chromium extension for quick...

24
Experimental
691 benzsevern/goldenflow

Data transformation toolkit: 43+ transforms, 5 domain packs. 10 MCP tools...

24
Experimental
692 RafiQamar/IMDb-Movie-Analysis

This project involves web scraping, data preprocessing, database storage and...

24
Experimental
693 SoftwareTree/gilhari_ecommerce_example

A RESTful Gilhari microservice demonstrating ORM for JSON objects with an...

24
Experimental
694 GSA/coe-hud-acq-advanced-analytics

A repository for information related to the Data Analytics team's Advanced...

24
Experimental
695 GSA/coe-hud-acq-data-visualization

A repository for information related to the Data Analytics team's Data...

24
Experimental
696 shrutikar/DisasterRecord

DisasterRecord- Disaster Response and Relief Coordination pipeline.

24
Experimental
697 tosh2230/stairlight

A data lineage tool detects table dependencies from rendered SQL statements.

23
Experimental
698 wapplewhite4/fastdedup

Fast, memory-efficient dataset deduplication for ML workloads

23
Experimental
699 FranusCode/credit-risk-scoring-sas

Credit risk classification for bank customers. The project covers engineering...

23
Experimental
700 redzeptech/ASENA-ANALYSIS

ASENA-ANALYSIS: A hybrid Intrusion Detection System (IDS) that combines...

23
Experimental