-
microsoft/synapseml
Simple and Distributed Machine Learning
Scala versions: 2.12 2.11 -
johnsnowlabs/spark-nlp
State of the Art Natural Language Processing
Scala versions: 2.12 2.11 -
salesforce/transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Scala versions: 2.11 -
almond-sh/almond
A Scala kernel for Jupyter
Scala versions: 3.x 2.13 2.12 2.11 -
combust/mleap
MLeap: Deploy ML Pipelines to Production
Scala versions: 2.12 2.11 2.10 -
tibcosoftware/snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
Scala versions: 2.11 2.10 -
h2oai/sparkling-water
Sparkling Water provides H2O functionality inside Spark cluster
Scala versions: 2.12 2.11 2.10 -
frees-io/freestyle
A cohesive & pragmatic framework of FP centric Scala libraries
Scala versions: 2.12 2.11
Scala.js versions: 0.6 -
lucacanali/sparkmeasure
This is the development repository for sparkMeasure, a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task and stage metrics data.
Scala versions: 2.13 2.12 2.11 -
delta-io/delta-sharing
An open protocol for secure data sharing
Scala versions: 2.13 2.12