-
combust/mleap 0.23.1
MLeap: Deploy ML Pipelines to Production
Scala versions: 2.12
-
apache/sedona 1.6.1
A cluster computing framework for processing large-scale geospatial data
Scala versions: 2.13 2.12
-
lucacanali/sparkmeasure 0.24
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
Scala versions: 2.13 2.12
-
scalapy/scalapy 0.5.2
Use the world of Python from the comfort of Scala!
Scala versions: 3.x 2.13 2.12
Scala Native versions: 0.4
-
locationtech-labs/geopyspark 0.3.0
GeoTrellis for PySpark
Scala versions: 2.11
-
isarn/isarn-sketches-spark 0.6.0-sp3.2
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Scala versions: 2.12
-
ozancicek/artan 0.5.1
Online latent state estimation with Spark
Scala versions: 2.12
-
timvw/adobe-analytics-datafeed-datasource 0.1.0
Apache Spark data source for Adobe Analytics Data Feed
Scala versions: 2.12
-
liquidsvm/liquidsvm 0.6.0
Support vector machines (SVMs) and related kernel-based learning algorithms are a well-known class of machine learning algorithms for non-parametric classification and regression. liquidSVM is an implementation of SVMs whose key features are fully integrated hyper-parameter selection, extreme speed on both small and large data sets, full flexibility for experts, and support for a variety of learning scenarios: multi-class classification; ROC and Neyman-Pearson learning; and least-squares, quantile, and expectile regression.
Scala versions: 2.11
-
catboost/catboost 1.2.7
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Scala versions: 2.13 2.12 2.11