The Apache Hadoop software library is a framework for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale from a single server to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.
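As a rough illustration of what a "simple programming model" can look like, the following sketch runs a MapReduce-style word count entirely in memory. It does not use Hadoop or any of its APIs; the names `map_words`, `reduce_counts`, and `run_job` are hypothetical and chosen only for this example. On a real cluster the map and reduce steps would run on many machines, with the framework handling data placement and task failures.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce step: add up all counts emitted for a single word."""
    return word, sum(counts)


def run_job(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], tuple[str, int]],
) -> dict[str, int]:
    """Single-process stand-in for a distributed job (illustrative only).

    Groups the mapper output by key (the "shuffle" phase), then applies
    the reducer once per key.
    """
    groups: dict[str, list[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    lines = ["large data sets", "large clusters of computers"]
    print(run_job(lines, map_words, reduce_counts))
```

The user-supplied logic is limited to the two small functions; everything else (grouping, scheduling, recovery) is the part a framework would take over.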

References in zbMATH (referenced in 120 articles)

Showing results 1 to 20 of 120.

  1. Ahmadi, Saba; Khuller, Samir; Purohit, Manish; Yang, Sheng: On scheduling coflows (2020)
  2. Czumaj, Artur; Łącki, Jakub; Mądry, Aleksander; Mitrović, Slobodan; Onak, Krzysztof; Sankowski, Piotr: Round compression for parallel matching algorithms (2020)
  3. Feldman, Dan; Schmidt, Melanie; Sohler, Christian: Turning big data into tiny data: constant-size coresets for k-means, PCA, and projective clustering (2020)
  4. Montealegre, P.; Perez-Salazar, S.; Rapaport, I.; Todinca, I.: Graph reconstruction in the congested clique (2020)
  5. Zhang, Longxin; Zhou, Liqian; Salah, Ahmad: Efficient scientific workflow scheduling for deadline-constrained parallel tasks in cloud computing environments (2020)
  6. Dhaenens, Clarisse; Jourdan, Laetitia: Metaheuristics for data mining (2019)
  7. Farruggia, Andrea; Ferragina, Paolo; Frangioni, Antonio; Venturini, Rossano: Bicriteria data compression (2019)
  8. Ramasubramanian, Karthik; Singh, Abhishek: Machine learning using R. With time series and industry-based use cases in R (2019)
  9. Roy, Asim; Qureshi, Shiban; Pande, Kartikeya; Nair, Divitha; Gairola, Kartik; Jain, Pooja; Singh, Suraj; Sharma, Kirti; Jagadale, Akshay; Lin, Yi-Yang; Sharma, Shashank; Gotety, Ramya; Zhang, Yuexin; Tang, Ji; Mehta, Tejas; Sindhanuru, Hemanth; Okafor, Nonso; Das, Santak; Gopal, Chidambara N.; Rudraraju, Srinivasa B.; Kakarlapudi, Avinash V.: Performance comparison of machine learning platforms (2019)
  10. Babuji, Yadu; Woodard, Anna; Li, Zhuozhao; Katz, Daniel S.; Clifford, Ben; Kumar, Rohan; Lacinski, Lukasz; Chard, Ryan; Wozniak, Justin M.; Foster, Ian; Wilde, Michael; Chard, Kyle: Parsl: Pervasive Parallel Programming in Python (2019) arXiv
  11. Yin, Chao; Lv, Haitao; Li, Tongfang; Qu, Xiaoping; Wang, Jianzong; Gao, Guangyong: A new minimize matrix computation coding method for distributed storage systems (2019)
  12. Foss, Alexander; Markatou, Marianthi: kamila: Clustering Mixed-Type Data in R and Hadoop (2018) not zbMATH
  13. Alnasir, Jamie J.; Shanahan, Hugh P.: Transcriptomics: quantifying non-uniform read distribution using MapReduce (2018)
  14. Arleo, Alessio; Didimo, Walter; Liotta, Giuseppe; Montecchiani, Fabrizio: GiVip: a visual profiler for distributed graph processing systems (2018)
  15. Bateni, Mohammadhossein; Behnezhad, Soheil; Derakhshan, Mahsa; Hajiaghayi, Mohammadtaghi; Mirrokni, Vahab: Brief announcement: MapReduce algorithms for massive trees (2018)
  16. Gonen, Yaron; Gudes, Ehud; Kandalov, Kirill: New and efficient algorithms for producing frequent itemsets with the Map-Reduce framework (2018)
  17. Haller, Philipp; Miller, Heather; Müller, Normen: A programming model and foundation for lineage-based distributed computation (2018)
  18. Kocsis, Zoltan A.; Swan, Jerry: Genetic programming + proof search = automatic improvement (2018)
  19. Lau, F. Din-Houn; Adams, Niall M.; Girolami, Mark A.; Butler, Liam J.; Elshafie, Mohammed Z. E. B.: The role of statistics in data-centric engineering (2018)
  20. Mohammed, Assem H.; Gadallah, Ahmed M.; Hefny, Hesham A.; Hazman, M.: Fuzzy based approach for discovering crops plantation knowledge from huge agro-climatic data respecting climate changes (2018)
