-
setl-framework/setl
A simple Spark-powered ETL framework that just works 🍺
Scala versions: 2.12 2.11 -
azure/azure-cosmosdb-spark
Apache Spark Connector for Azure Cosmos DB
Scala versions: 2.11 2.10 -
leobenkel/zparkio
Boilerplate framework for using Spark and ZIO together.
Scala versions: 2.11 -
sparkling-graph/sparkling-graph
SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.
Scala versions: 2.11 2.10 -
housepower/spark-clickhouse-connector
Spark ClickHouse Connector built on the DataSourceV2 API
Scala versions: 2.13 2.12 -
clustering4ever/clustering4ever
C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
Scala versions: 2.11 -
zouzias/spark-lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Scala versions: 2.12 2.11 2.10 -
streamnative/pulsar-spark
Spark Connector to read and write with Pulsar
Scala versions: 2.13 2.12 2.11 -
aliyun/aliyun-emapreduce-datasources
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
Scala versions: 2.11 2.10 -
indix/schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schemas. Supports schema inference and a GraphQL API.
Scala versions: 2.11 -
microsoft/mobius
C# and F# language binding and extensions to Apache Spark
Scala versions: 2.11 2.10 -
chermenin/spark-states
Custom state store providers for Apache Spark
Scala versions: 2.12 2.11 -
smart-data-lake/smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Scala versions: 2.13 2.12 2.11 -
galliaproject/gallia-core
A schema-aware Scala library for data transformation
Scala versions: 3.x 2.13 2.12