- locationtech-labs/geopyspark 0.3.0
  GeoTrellis for PySpark
  Scala versions: 2.11
- grouzen/zio-apache-parquet 0.1.6
  Scala ZIO-powered Apache Parquet library
  Scala versions: 3.x 2.13
- timgent/data-flare 3.2.0_0.1.14
  Data quality control tool built on spark and deequ
  Scala versions: 2.12
- absaoss/pramen 1.0.1
  Resilient data pipeline framework running on Apache Spark
  Scala versions: 2.13 2.12 2.11
- grouzen/zio-apache-arrow 0.1.3
  Scala ZIO-powered Apache Arrow library
  Scala versions: 3.x 2.13 2.12
- catboost/catboost 1.2.7
  A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
  Scala versions: 2.13 2.12
- h2oai/h2o-3 3.30.0.3
  H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
  Scala versions: 2.11