A pipeline that consumes Twitter data to extract meaningful insights on a variety of topics, using the following technologies: Twitter API, Kafka, MongoDB, and Tableau.
This project implements an end-to-end pipeline for batch-to-stream processing using Kafka running in Docker. It also includes exercises for a full Kafka course.
Stream real-time tweets about current affairs such as COVID-19 into Elasticsearch using a high-throughput Kafka 2.0.0 producer and consumer, configured for safety, idempotence, and compression. Aggregate the data and use it for further analytics.
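
As a rough illustration of what "safe, idempotent and compression configurations" can look like, the sketch below collects the relevant producer settings in a plain Python dictionary. The property names are the standard Kafka producer configuration keys; the helper name `build_producer_config`, the broker address, and the specific values (snappy, 20 ms linger, 32 KB batches) are assumptions chosen for illustration, not settings taken from this project.

```python
def build_producer_config(bootstrap_servers="localhost:9092"):
    """Return producer properties for a safe, high-throughput producer.

    Hypothetical helper: the values below are illustrative assumptions.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        # Safety: avoid duplicates on retry and wait for all in-sync replicas.
        "enable.idempotence": "true",
        "acks": "all",
        "retries": str(2**31 - 1),
        # Idempotence preserves ordering with up to 5 in-flight requests.
        "max.in.flight.requests.per.connection": "5",
        # Throughput: compress batches and give them a little time to fill.
        "compression.type": "snappy",
        "linger.ms": "20",
        "batch.size": str(32 * 1024),
    }


if __name__ == "__main__":
    for key, value in build_producer_config().items():
        print(f"{key}={value}")
```

The dictionary is kept independent of any client library so it can be inspected or written out as a properties file. Whether a given client accepts every key unchanged depends on the client and its version, so check the keys against its documentation before use.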