Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
A simple Spark-powered ETL framework that just works 🍺
lakeFS - Data version control for your data lake | Git for data
A schema-aware Scala library for data transformation
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
A re-implementation of Hadoop DistCp in Apache Spark
Worker components of the hyppo data ingestion system