Dryad

Dryad: distributed data-parallel programs from sequential building blocks. Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad application combines computational "vertices" with communication "channels" to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of available computers, communicating as appropriate through files, TCP pipes, and shared-memory FIFOs. The vertices provided by the application developer are quite simple and are usually written as sequential programs with no thread creation or locking. Concurrency arises from Dryad scheduling vertices to run simultaneously on multiple computers, or on multiple CPU cores within a computer. The application can discover the size and placement of data at run time, and modify the graph as the computation progresses to make efficient use of the available resources. Dryad is designed to scale from powerful multi-core single computers, through small clusters of computers, to data centers with thousands of computers. The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
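The dataflow model described above can be illustrated with a minimal sketch. This is not Dryad's API: the names `run_graph`, `count_words`, and `merge_counts`, and the sample word-count graph, are hypothetical and chosen only for illustration. The sketch assumes each vertex is a plain sequential function, each channel is a (source, destination) pair, and a small scheduler supplies the concurrency by running every vertex whose inputs are ready on a thread pool. Real Dryad additionally distributes vertices across machines, moves data over files, TCP pipes, or shared-memory FIFOs, and recovers from failures, none of which is modeled here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_graph(vertices, channels, max_workers=4):
    """Run a dataflow graph (hypothetical helper, not Dryad's API).

    vertices: dict mapping a vertex name to a function that takes a list
              of input values (one per incoming channel).
    channels: list of (source, destination) name pairs.
    Returns a dict mapping each vertex name to its output.
    """
    # Record, for every vertex, which vertices feed it.
    inputs = {name: [] for name in vertices}
    for src, dst in channels:
        inputs[dst].append(src)

    results = {}
    pending = set(vertices)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            # A vertex is ready once all of its inputs have been produced.
            ready = [v for v in pending
                     if all(s in results for s in inputs[v])]
            if not ready:
                raise ValueError("graph contains a cycle")
            # Ready vertices run concurrently; each one is sequential code.
            futures = {
                v: pool.submit(vertices[v], [results[s] for s in inputs[v]])
                for v in ready
            }
            for v, future in futures.items():
                results[v] = future.result()
            pending -= set(ready)
    return results


def count_words(ins):
    # Sequential vertex: count the words in one partition of text.
    return Counter(ins[0].split())


def merge_counts(ins):
    # Sequential vertex: combine the partial counts from all inputs.
    return sum(ins, Counter())


# Sample graph: two readers, two counters, one merge vertex.
vertices = {
    "read0": lambda ins: "to be or not",
    "read1": lambda ins: "to be",
    "count0": count_words,
    "count1": count_words,
    "merge": merge_counts,
}
channels = [
    ("read0", "count0"),
    ("read1", "count1"),
    ("count0", "merge"),
    ("count1", "merge"),
]

totals = run_graph(vertices, channels)["merge"]
```

None of the vertex functions creates threads or takes locks; all parallelism comes from the scheduler, which is the separation of concerns the description attributes to Dryad.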

This software has also been peer reviewed by the journal TOMS.


References in zbMATH (referenced in 15 articles)

  1. Li, Xiaoyan; Fan, Jianxi; Lin, Cheng-Kuan; Cheng, Baolei; Jia, Xiaohua: The extra connectivity, extra conditional diagnosability and (t/k)-diagnosability of the data center network DCell (2019)
  2. Pericini, Matheus H. M.; Leite, Lucas G. M.; De Carvalho-Junior, Francisco H.; Machado, Javam C.; Rezende, Cenez A.: MAPSkew: metaheuristic approaches for partitioning skew in MapReduce (2019)
  3. Rompf, Tiark; Amin, Nada: A SQL to C compiler in 500 lines of code (2019)
  4. Convolbo, Moïse W.; Chou, Jerry; Hsu, Ching-Hsien; Chung, Yeh Ching: GEODIS: towards the optimization of data locality-aware job scheduling in geo-distributed data centers (2018)
  5. Haller, Philipp; Miller, Heather; Müller, Normen: A programming model and foundation for lineage-based distributed computation (2018)
  6. Mishra, Deepa; Gunasekaran, Angappa; Papadopoulos, Thanos; Childe, Stephen J.: Big data and supply chain management: a review and bibliometric analysis (2018)
  7. Moritz, Philipp; Nishihara, Robert; Wang, Stephanie; Tumanov, Alexey; Liaw, Richard; Liang, Eric; Elibol, Melih; Yang, Zongheng; Paul, William; Jordan, Michael I.; Stoica, Ion: Ray: a distributed framework for emerging AI applications (2017)
  8. Wang, Xi; Fan, Jianxi; Jia, Xiaohua; Lin, Cheng-Kuan: An efficient algorithm to construct disjoint path covers of DCell networks (2016)
  9. Philip Chen, C. L.; Zhang, Chun-Yang: Data-intensive applications, challenges, techniques and technologies: a survey on big data (2014)
  10. Ahmad, Faraz; Lee, Seyong; Thottethodi, Mithuna; Vijaykumar, T. N.: MapReduce with communication overlap (MaRCO) (2013)
  11. Han, Liangxiu; Liew, Chee Sun; van Hemert, Jano; Atkinson, Malcolm: A generic parallel processing model for facilitating data mining and integration (2011)
  12. Nicolae, Bogdan; Antoniu, Gabriel; Bougé, Luc; Moise, Diana; Carpen-Amarie, Alexandra: BlobSeer: Next-generation data management for large scale infrastructures (2011)
  13. Wilde, Michael; Hategan, Mihael; Wozniak, Justin M.; Clifford, Ben; Katz, Daniel S.; Foster, Ian: Swift: A language for distributed parallel scripting (2011)
  14. Raicu, Ioan; Foster, Ian; Wilde, Mike; Zhang, Zhao; Iskra, Kamil; Beckman, Peter; Zhao, Yong; Szalay, Alex; Choudhary, Alok; Little, Philip: Middleware support for many-task computing (2010)
  15. Burrows, Eva; Haveraaen, Magne: A hardware independent parallel programming model (2009)