redislabs / spark-redis   3.1.0

BSD 3-clause "New" or "Revised" License GitHub

A connector for Spark that allows reading and writing to/from Redis cluster

Scala versions: 2.12 2.11

license Release CircleCI Maven Central Javadocs Discord Codecov

Spark-Redis

A library for reading and writing data in Redis using Apache Spark.

Spark-Redis provides access to all of Redis' data structures - String, Hash, List, Set and Sorted Set - from Spark as RDDs. It also supports reading and writing with DataFrames and Spark SQL syntax.

The library can be used both with Redis stand-alone as well as clustered databases. When used with Redis cluster, Spark-Redis is aware of its partitioning scheme and adjusts in response to resharding and node failure events.

Spark-Redis also supports Spark Streaming (DStreams) and Structured Streaming.

Version compatibility and branching

The library has several branches, each corresponds to a different supported Spark version. For example, 'branch-2.3' works with any Spark 2.3.x version. The master branch contains the recent development for the next release.

Spark-Redis Spark Redis Supported Scala Versions
master 3.2.x >=2.9.0 2.12
3.0 3.0.x >=2.9.0 2.12
2.4, 2.5, 2.6 2.4.x >=2.9.0 2.11, 2.12
2.3 2.3.x >=2.9.0 2.11
1.4 1.4.x 2.10

Known limitations

  • Java, Python and R API bindings are not provided at this time

Additional considerations

This library is a work in progress so the API may change before the official release.

Documentation

Please make sure you use documentation from the correct branch (2.4, 2.3, etc).

Contributing

You're encouraged to contribute to the Spark-Redis project.

There are two ways you can do so:

Submit Issues

If you encounter an issue while using the library, please report it via the project's issues tracker.

Author Pull Requests

Code contributions to the Spark-Redis project can be made using pull requests. To submit a pull request:

  1. Fork this project.
  2. Make and commit your changes.
  3. Submit your changes as a pull request.