Build data pipelines, the easy way
StreamPark: make stream processing easier! An easy-to-use streaming application development framework and operations platform.
Example project implementing best practices for PySpark ETL jobs and applications.
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
A scalable, general-purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
A simplified, lightweight ETL Framework based on Apache Spark
A scalable, general-purpose micro-framework for defining dataflows. You can use it to build dataframes, NumPy matrices, Python objects, ML models, and more. Embed Hamilton anywhere Python runs, e.g. Spark, Airflow, Jupyter, FastAPI, or plain Python scripts (a minimal sketch appears after this list).
A high-performance data processing system written in Clojure.
A lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.
A simple Spark-powered ETL framework that just works
Pipebird is open source infrastructure for securely sharing data with customers.
Watchmen Platform is a low-code data platform for data pipelines, metadata management, analysis, and quality management.
Data pipelines from reusable components.
Download DIG to run on your laptop or server.
A web engine for your business, powered by a top-shelf, modern Ruby & JS stack. Out-of-the-box support for automation, CMS, blog, forum, and email. Developer-friendly and easily extendable for your SaaS/XaaS project. Built with familiar tooling including Devise, Sidekiq, Ember.js, and PostgreSQL.
The goal of this project is to track Uber Rides and Uber Eats expenses through data engineering processes, using technologies such as Apache Airflow, AWS Redshift, and Power BI.
This repository will help you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-life work, using PySpark and Spark SQL for development. At the end of the course, it also walks through a few case studies.
csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices, and joins.
Ethereum Analytical Database: an Ethereum data access solution for analytics and application development, backed by ClickHouse, a fast analytical database.
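Several entries above (the PySpark ETL best-practices example and the Spark-powered ETL frameworks) center on the same extract-transform-load shape. The sketch below illustrates that shape with plain PySpark; the file paths and column names are placeholders, not taken from any of the listed repositories.

```python
# Minimal PySpark ETL sketch (paths and column names are illustrative placeholders).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl_example").getOrCreate()

# Extract: read raw CSV data.
raw = spark.read.csv("data/raw_orders.csv", header=True, inferSchema=True)

# Transform: drop bad rows, parse dates, aggregate to daily revenue.
transformed = (
    raw.dropna(subset=["order_id"])
       .withColumn("order_date", F.to_date("order_date"))
       .groupBy("order_date")
       .agg(F.sum("amount").alias("daily_revenue"))
)

# Load: write the result as Parquet, overwriting any previous run.
transformed.write.mode("overwrite").parquet("data/warehouse/daily_revenue")

spark.stop()
```

Using overwrite mode keeps the job idempotent, so rerunning it for the same input produces the same warehouse state.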
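The Hamilton entries above describe a dataflow style in which each Python function is a node and its parameter names declare its upstream dependencies. Here is a minimal sketch of that pattern, assuming the hamilton package's driver.Driver and execute APIs; the module name, column names, and sample data are illustrative, not taken from the project's documentation.

```python
# my_functions.py -- each function defines a node; its parameters name its dependencies.
import pandas as pd

def spend(raw_df: pd.DataFrame) -> pd.Series:
    """Marketing spend column pulled from the input frame."""
    return raw_df["spend"]

def signups(raw_df: pd.DataFrame) -> pd.Series:
    """Signups column pulled from the input frame."""
    return raw_df["signups"]

def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """Cost per signup, derived from the two nodes above."""
    return spend / signups
```

```python
# run.py -- wire the module into a Driver and request the outputs you want.
import pandas as pd
from hamilton import driver
import my_functions  # the module sketched above

raw_df = pd.DataFrame({"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 3]})
dr = driver.Driver({}, my_functions)  # config dict, then one or more modules
result = dr.execute(["spend_per_signup"], inputs={"raw_df": raw_df})
print(result)
```

Because the dependency graph is just functions in a module, the same code can be executed from a script, a notebook, an Airflow task, or a FastAPI handler, which is the portability the description refers to.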