SciMATE: A novel MapReduce-like framework for multiple scientific data formats. Despite the popularity of MapReduce, there are several obstacles to using it to develop scientific data analysis applications. Current MapReduce implementations require that data be loaded into specialized file systems, like the Hadoop Distributed File System (HDFS), whereas, with the rapidly growing size of scientific datasets, reloading data into another file system or format is not feasible. We present a framework that allows scientific data in different formats to be processed with a MapReduce-like API. Our system is referred to as SciMATE, and is based on the MATE system developed at Ohio State. SciMATE is developed as a customizable system, which can be adapted to support processing of any scientific data format. We have demonstrated the functionality of our system by creating instances that can process the NetCDF and HDF5 formats as well as flat files. We have also implemented three popular data mining applications and have evaluated their execution with each of the three instances of our system.
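The idea of one MapReduce-like API over several data formats can be illustrated with a minimal sketch. This is not SciMATE's actual interface: the names `DataAdapter`, `FlatFileAdapter`, `InMemoryAdapter`, `run_job`, `bucket_map`, and `count_reduce` are all hypothetical, and the in-memory adapter merely stands in for an array-format reader such as one for NetCDF or HDF5. The sketch only shows how a format-specific adapter might hand chunks to format-agnostic map and reduce functions, so the data is read in place rather than reloaded into another file system.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


class DataAdapter(ABC):
    """Hypothetical format adapter: yields chunks of numeric values."""

    @abstractmethod
    def chunks(self, chunk_size: int) -> Iterator[List[float]]:
        ...


class FlatFileAdapter(DataAdapter):
    """Reads one number per line from a text file, without reloading it elsewhere."""

    def __init__(self, path: str) -> None:
        self.path = path

    def chunks(self, chunk_size: int) -> Iterator[List[float]]:
        buf: List[float] = []
        with open(self.path) as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                buf.append(float(line))
                if len(buf) == chunk_size:
                    yield buf
                    buf = []
        if buf:
            yield buf  # final, possibly shorter, chunk


class InMemoryAdapter(DataAdapter):
    """Stand-in (assumption) for an array-format adapter such as NetCDF or HDF5."""

    def __init__(self, values: Iterable[float]) -> None:
        self.values = list(values)

    def chunks(self, chunk_size: int) -> Iterator[List[float]]:
        for i in range(0, len(self.values), chunk_size):
            yield self.values[i:i + chunk_size]


def run_job(
    adapter: DataAdapter,
    map_fn: Callable[[float], Iterable[Tuple[int, int]]],
    reduce_fn: Callable[[int, List[int]], int],
    chunk_size: int = 4,
) -> Dict[int, int]:
    """Apply map_fn to every value, group the outputs by key, then reduce each group."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for chunk in adapter.chunks(chunk_size):
        for value in chunk:
            for key, out in map_fn(value):
                groups[key].append(out)
    return {key: reduce_fn(key, vals) for key, vals in groups.items()}


def bucket_map(value: float) -> Iterable[Tuple[int, int]]:
    """Emit (bucket index, 1) for a histogram with bucket width 10."""
    yield int(value // 10), 1


def count_reduce(key: int, values: List[int]) -> int:
    """Sum the counts emitted for one bucket."""
    return sum(values)
```

In this sketch the same `bucket_map` and `count_reduce` pair runs unchanged whichever adapter is passed to `run_job`; supporting a further format would mean writing one more `DataAdapter` subclass.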

References in zbMATH (referenced in 1 article)

  1. Choi, Woohyuk; Hong, Sumin; Jeong, Won-Ki: Vispark: GPU-accelerated distributed visual computing using Spark (2016)