Make Flink|Spark easier!!! The original intention of StreamX is to make the development of Flink easier. StreamX focuses on the management of development phases and tasks. Our ultimate goal is to build a one-stop big data solution integrating stream processing, batch processing, data warehouse and data laker.
*Decompile All the Things* - IDA Batch Decompile plugin and script for Hex-Ray's IDA Pro that adds the ability to batch decompile multiple files and their imports with additional annotations (xref, stack var size) to the pseudocode .c file
Automate Excel downloads in seconds. Simply does the tedious, repetitive operations for all rows of excel files step by step and reports after the job is done. It can download files from URL(s) in a column of Excel files. If a new filename is provided at column B it will rename the file before saving. It will even create sub folders if column C is filled with a valid folder name.
Save optimal bins and/or splits points and transformation depending on the target type.
This is handy to implement binning transformation with other languages (e.g., SAS). Only a JSON parser is required to produce a set of if-else to transform original data.
Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent programming model which can express both batch and stream transformations.
Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses shared-distributed memory model to keep GPUs updated fast while using same kernel on all devices(for simplicity).
Shell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a python client that communicates with the OpenRefine API.
For example, given a simple pipeline such as:
I'd like
aggregator
to be something requiring a non-serialisable dependency to do its work.I know I can do this: