HadoopOffice is no longer maintained.
I recommend using the Open Document Format (ODF) for processing office documents, and doing this processing outside a Big Data platform. ODF is a vendor-independent standard format that can be processed by many different technologies.
HadoopOffice - Analyze and write Office documents, such as MS Excel, using the Hadoop ecosystem, including Apache Hive, Apache Flink, and Apache Spark.
You can find more information about the project in the Wiki.