zuinnote / hadoopoffice   1.5.0

Apache License 2.0 GitHub

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

Scala versions: 2.12 2.11

HadoopOffice is not maintained anymore

I recommend using the Open Document Format (ODF) to process office documents, and doing so outside a Big Data platform. ODF is a vendor-independent standard format that can be processed by many different technologies.

hadoopoffice

HadoopOffice - Analyze and write Office documents, such as MS Excel, using the Hadoop ecosystem including Apache Hive/Apache Flink/Apache Spark.

You can find more information about the project in the Wiki.