Here are 26 public repositories matching this topic...
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Updated
Apr 13, 2022
Python
Cloud-based SQL engine using Spark, where data is accessible as a JDBC/ODBC data source via the Spark Thrift Server.
Updated
Jul 12, 2017
Java
Toy Hadoop cluster combining various SQL-on-Hadoop variants
Updated
Nov 16, 2017
Shell
This repository contains a simple Hadoop-like (MapReduce) distributed computing platform implemented in Java. It extends a UIUC course project that was awarded best Java implementation, and it is open-sourced for reference.
Updated
Apr 17, 2021
Java
Updated
Apr 12, 2022
Java
A reference repository for a comprehensive guide to installing Hadoop on Windows
Updated
Jun 11, 2018
Shell
Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark
Updated
Aug 17, 2018
Java
The goal of this project is to identify flood-prone areas and estimate the probability of flooding in counties on a future date, using Spark MLlib.
Updated
Jan 20, 2020
Scala
Single-node EMR 5.25.0 cluster Hadoop Docker image, with Amazon Linux, Hadoop 2.8.5, and Hive 2.3.5
Updated
Jan 6, 2020
Shell
This project focuses on creating a KNN MapReduce program for the Hadoop framework
Set up a Hadoop cluster manually and automatically
Updated
Jul 17, 2017
Python
PageRank algorithm written in Java using the MapReduce framework
Updated
Jul 20, 2019
Java
This repo contains the steps for setting up a single-node Hadoop 3.2.1 cluster on Ubuntu 20.04 LTS
Product recommendation system built on the Amazon product dataset using the Apache Spark framework
Updated
Jun 15, 2018
Jupyter Notebook
Twitter data analysis using Hadoop (HDFS), Flume, MapReduce, and Hive. Sentiment analysis is also performed on tweets related to the Indian election using the AFINN dictionary.
I installed Hadoop on a virtual machine, and all assignments were performed on Ubuntu. Refer to this repo when completing the Hadoop assignments. A stable internet connection is recommended while working through them.
Updated
Jun 30, 2021
Rebol
Updated
Jul 11, 2018
Python
An Ansible role to configure and set up a Hive data warehouse on a client node.
WQD7008 Parallel and Distributed Computing Project
Updated
Sep 23, 2017
Python
Distributed Hadoop- and Spark-based framework for in-memory GIS queries
Updated
Apr 22, 2019
Java
Basic Spark examples to cover some ground
Python scripts for working with big data files
Updated
Apr 6, 2018
Python
Titanic data analysis with Hadoop
Updated
Nov 13, 2018
Java