Distributed processing of very large datasets with DataCutter. We describe a framework, called DataCutter, that is designed to provide support for subsetting and processing of datasets in a distributed and heterogeneous environment. We illustrate the use of DataCutter with several data-intensive applications from diverse fields, and present experimental results.

References in zbMATH (referenced in 20 articles, 1 standard article)

  1. Benoit, Anne; Gallet, Matthieu; Gaujal, Bruno; Robert, Yves: Computing the throughput of probabilistic and replicated streaming applications (2014)
  2. Agrawal, Kunal; Benoit, Anne; Dufossé, Fanny; Robert, Yves: Mapping filtering streaming applications (2012)
  3. Hartley, Timothy D.R.; Saule, Erik; Çatalyürek, Ümit V.: Improving performance of adaptive component-based dataflow middleware (2012)
  4. Saule, Erik; Baş, Erdeniz Ö.; Çatalyürek, Ümit V.: Load-balancing spatially located computations using rectangular partitions (2012)
  5. Andrade, H.; Gedik, B.; Wu, K.-L.; Yu, P.S.: Processing high data rate streams in System S (2011)
  6. Han, Liangxiu; Liew, Chee Sun; van Hemert, Jano; Atkinson, Malcolm: A generic parallel processing model for facilitating data mining and integration (2011)
  7. Benoit, Anne; Robert, Yves: Complexity results for throughput and latency optimization of replicated and data-parallel workflows (2010)
  8. Kumar, Vijay S.; Kurc, Tahsin; Ratnakar, Varun; Kim, Jihie; Mehta, Gaurang; Vahi, Karan; Nelson, Yoonju Lee; Sadayappan, P.; Deelman, Ewa; Gil, Yolanda: Parameterized specification, configuration and execution of data-intensive scientific workflows (2010)
  9. Briquet, Cyril; Dalem, Xavier; Jodogne, Sébastien; de Marneffe, Pierre-Arnoul: P2P file sharing for P2P computing (2009)
  10. Benoit, Anne; Robert, Yves: Mapping pipeline skeletons onto heterogeneous platforms (2008)
  11. Braga Araújo, Renata; Trielli Ferreira, Guilherme Henrique; Orair, Gustavo Henrique; Meira, Wagner; Celso Ferreira, Renato Antônio; Olavo Guedes Neto, Dorgival; Zaki, Mohammed Javeed: The ParTriCluster algorithm for gene expression analysis (2008)
  12. Klie, H.; Bangerth, W.; Gai, X.; Wheeler, M.F.; Stoffa, P.L.; Sen, M.; Parashar, M.; Catalyurek, U.; Saltz, J.; Kurc, T.: Models, methods and middleware for grid-enabled multiphysics oil reservoir management (2006)
  13. Oldfield, Ron; Kotz, David: Improving data access for computational grid applications (2006)
  14. Beynon, Michael D.; Kurc, Tahsin; Sussman, Alan; Saltz, Joel: Optimizing execution of component-based applications using group instances (2002)
  15. Czajkowski, Karl; Foster, Ian; Kesselman, Carl; Sander, Volker; Tuecke, Steven: SNAP: A protocol for negotiating service level agreements and coordinating resource management in distributed systems (2002)
  16. Diamessis, Peter; Kerney, William; Baden, Scott B.; Nomura, Keiko: Automated tracking of 3-D overturn patches in direct numerical simulation of stratified homogeneous turbulence (2002)
  17. Skillicorn, David; Talia, Domenico: Mining large data sets on grids: Issues and prospects (2002)
  18. Beynon, M.D.; Kurc, T.; Catalyurek, U.; Chang, C.; Sussman, A.: Distributed processing of very large datasets with DataCutter (2001)
  19. Foster, Ian: The anatomy of the Grid: Enabling scalable virtual organizations (2001)
  20. Nikolow, Darin; Słota, Renata; Kitowski, Jacek; Nyczyk, Piotr; Otfinowski, Janusz: Tertiary storage system for index-based retrieving of video sequences (2001)