-
agile-lab-dev/wasp 3.0.1
WASP is a framework for building complex real-time big data applications. It relies on a Kappa/Lambda-style architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze them through complex pipelines, this is the framework for you.
Scala versions: 2.12
-
sansa-stack/archived-sansa-inference 0.7.1
A general Inference API based on two of the most popular Big Data processing engines: Apache Spark and Apache Flink
Scala versions: 2.11
-
fsanaulla/chronicler 0.7.2
Scala toolchain for InfluxDB
Scala versions: 2.13 2.12 2.11
-
sansa-stack/archived-sansa-owl 0.7.1
SANSA Stack OWL (Web Ontology Language) API
Scala versions: 2.11
-
locationtech/rasterframes 0.11.1
Geospatial Raster support for Spark DataFrames
Scala versions: 2.12
-
absaoss/pramen 1.0.1
Resilient data pipeline framework running on Apache Spark
Scala versions: 2.13 2.12 2.11
-
whylabs/whylogs-java 0.1.3
Profile and monitor your ML data pipeline end-to-end
Scala versions: 2.12
-
isarn/isarn-sketches-spark 0.6.0-sp3.2
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Scala versions: 2.12
-
arcizon/spark-filetransfer 0.3.0
API for reading and writing data via various file transfer protocols from Apache Spark.
Scala versions: 2.12 2.11
-
romans-weapon/spear-framework 3.1.1-3.0
Rapid development of ETL/ELT connectors and pipelines on top of Apache Spark
Scala versions: 2.12
-
florentf9/sparkml-som 0.2
Spark ML implementation of the SOM algorithm (Kohonen self-organizing map)
Scala versions: 2.11
-
s22s/pre-lt-raster-frames 0.6.1
Spark DataFrames for earth observation data
Scala versions: 2.11
-
qubole/streaminglens 0.5.3
Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines
Scala versions: 2.11