Big Data Types
A type-safe library to transform case classes into database schemas and to convert between the supported types
Documentation
Check the Documentation website to learn more about how to use this library
Available conversions:
| From / To | Scala Types | BigQuery | Spark | Cassandra | Circe (JSON) |
|---|---|---|---|---|---|
| Scala | - | ✅ | ✅ | ✅ | |
| BigQuery | | - | ✅ | ✅ | |
| Spark | | ✅ | - | ✅ | |
| Cassandra | | ✅ | ✅ | - | |
| Circe (JSON) | | ✅ | ✅ | ✅ | |
Artifacts for several Scala versions are available in Maven
Quick Start
The library has different modules that can be imported separately
- BigQuery
libraryDependencies += "io.github.data-tools" %% "big-data-types-bigquery" % "{version}"
- Spark
libraryDependencies += "io.github.data-tools" %% "big-data-types-spark" % "{version}"
- Cassandra
libraryDependencies += "io.github.data-tools" %% "big-data-types-cassandra" % "{version}"
- Circe (JSON)
libraryDependencies += "io.github.data-tools" %% "big-data-types-circe" % "{version}"
- Core
- Provides the abstract SqlTypes. It is already included in the other modules, so you only need to add it explicitly when you use none of them
libraryDependencies += "io.github.data-tools" %% "big-data-types-core" % "{version}"
In order to transform one type into another, both modules have to be imported.
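As an illustration, converting between Spark and BigQuery schemas would need both of those modules in `build.sbt`. The snippet below is a minimal sketch that only combines the dependency lines listed above; `{version}` is left as a placeholder, as in the rest of this document.

```scala
// build.sbt (sketch): one entry per side of the conversion.
// Replace {version} with the release you want to use.
libraryDependencies ++= Seq(
  "io.github.data-tools" %% "big-data-types-spark"    % "{version}",
  "io.github.data-tools" %% "big-data-types-bigquery" % "{version}"
)
```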
How it works
The library internally uses a generic ADT (SqlType) that can store any schema representation and, from there, be converted into any other representation. Transformations are done through two different type classes.
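The idea of going through an intermediate ADT with two type classes can be sketched in plain Scala. This is a simplified illustration, not the library's actual code: `SqlInt`, `SqlString`, `SqlStruct`, `ToSqlType`, `FromSqlType` and `schemaOf` are hypothetical names, and the instance for `User` is written by hand here for brevity.

```scala
object SqlTypeSketch {
  // Hypothetical, heavily simplified stand-in for a generic schema ADT.
  sealed trait SqlType
  case object SqlInt extends SqlType
  case object SqlString extends SqlType
  final case class SqlStruct(fields: List[(String, SqlType)]) extends SqlType

  // Type class 1: describes a source type A as the generic ADT.
  trait ToSqlType[A] {
    def sqlType: SqlType
  }

  // Type class 2: renders the generic ADT as a target representation Out.
  trait FromSqlType[Out] {
    def convert(sqlType: SqlType): Out
  }

  final case class User(id: Int, name: String)

  // Hand-written instance for the example case class.
  implicit val userToSql: ToSqlType[User] = new ToSqlType[User] {
    val sqlType: SqlType = SqlStruct(List("id" -> SqlInt, "name" -> SqlString))
  }

  // Toy target: a SQL-like column listing as a String.
  implicit val toDdl: FromSqlType[String] = new FromSqlType[String] {
    def convert(sqlType: SqlType): String = sqlType match {
      case SqlInt    => "INT"
      case SqlString => "TEXT"
      case SqlStruct(fields) =>
        fields.map { case (name, tpe) => s"$name ${convert(tpe)}" }.mkString(", ")
    }
  }

  // Source -> ADT -> target, resolved entirely through the two type classes.
  def schemaOf[A, Out](implicit in: ToSqlType[A], out: FromSqlType[Out]): Out =
    out.convert(in.sqlType)

  val userDdl: String = schemaOf[User, String] // "id INT, name TEXT"
}
```

With this shape, supporting a new target only requires one new `FromSqlType` instance, and every source that already maps to the ADT can be converted to it.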
Quick examples
Case Classes to other types
//Spark
val s: StructType = SparkSchemas.schema[MyCaseClass]
//BigQuery
val bq: List[Field] = SqlTypeToBigQuery[MyCaseClass].bigQueryFields // just the schema
BigQueryTable.createTable[MyCaseClass]("myDataset", "myTable") // creates the table in a real BigQuery environment
//Cassandra
val c: CreateTable = CassandraTables.table[MyCaseClass]
There are also extension methods
that make it easier to transform between types when an instance is available
//from Case Class instance
val foo: MyCaseClass = ???
foo.asBigQuery // List[Field]
foo.asSparkSchema // StructType
foo.asCassandra("TableName", "primaryKey") // CreateTable
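For readers unfamiliar with the pattern, extension methods like these are commonly built from an implicit class that requires a type class instance. The sketch below shows the mechanism only; `ColumnNames`, `ColumnSyntax`, `asColumns` and `Point` are hypothetical names and are not part of this library.

```scala
object ExtensionSketch {
  // Hypothetical type class: knows the column names for a type A.
  trait ColumnNames[A] {
    def columns: List[String]
  }

  final case class Point(x: Int, y: Int)

  // Hand-written instance for the example case class.
  implicit val pointColumns: ColumnNames[Point] = new ColumnNames[Point] {
    val columns: List[String] = List("x", "y")
  }

  // Adds `.asColumns` to any value whose type has a ColumnNames instance.
  // The value itself is not inspected: the result depends only on its type.
  implicit class ColumnSyntax[A](value: A) {
    def asColumns(implicit ev: ColumnNames[A]): List[String] = ev.columns
  }

  val cols: List[String] = Point(1, 2).asColumns // List("x", "y")
}
```

The call compiles only when an instance exists for the value's type, which is what keeps this style of conversion type-safe.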
Conversion between types works in the same way
// From Spark to others
val foo: StructType = myDataFrame.schema
foo.asBigQuery // List[Field]
foo.asCassandra("TableName", "primaryKey") // CreateTable
//From BigQuery to others
val foo: Schema = ???
foo.asSparkFields // List[StructField]
foo.asSparkSchema // StructType
foo.asCassandra("TableName", "primaryKey") // CreateTable
//From Cassandra to others
val foo: CreateTable = ???
foo.asSparkFields // List[StructField]
foo.asSparkSchema // StructType
foo.asBigQuery // List[Field]
foo.asBigQuery.schema // Schema