Apache Spark
Apache Spark is a fast, general-purpose cluster computing system for big data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools, including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing.
References in zbMATH (referenced in 59 articles)
Showing results 1 to 20 of 59, sorted by year.
- Iwasaki, Hideya; Emoto, Kento; Morihata, Akimasa; Matsuzaki, Kiminori; Hu, Zhenjiang: Fregel: a functional domain-specific language for vertex-centric large-scale graph processing (2022)
- Nanongkai, Danupon; Scquizzato, Michele: Equivalence classes and conditional hardness in massively parallel computations (2022)
- Ahlawat, Khyati; Chug, Anuradha; Singh, Amit Prakash: A novel hybrid sampling algorithm for solving class imbalance problem in big data (2021)
- Avalos, Omar: GSA for machine learning problems: a comprehensive overview (2021)
- Azhari, Mourad; Abarda, Abdallah; Ettaki, Badia; Zerouaoui, Jamal; Dakkon, Mohamed: Using machine learning with PySpark and MLib for solving a binary classification problem: case of searching for exotic particles (2021)
- Berthold, Michael R.; Fillbrunn, Alexander; Siebes, Arno: Widening: using parallel resources to improve model quality (2021)
- Chelly, Dagdia Zaineb; Zarges, Christine: A detailed study of the distributed rough set based locality sensitive hashing feature selection technique (2021)
- Soshnikov, Dmitry; Valieva, Yana: mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing (2021) arXiv
- Dong, Bin; Wu, Kesheng; Byna, Suren: User-defined tensor data analysis (2021)
- Dutta, R.; Schoengens, M.; Pacchiardi, L.; Ummadisingu, A.; Widmer, N.; Künzli, P.; Onnela, J.-P.; Mira, A.: ABCpy: A High-Performance Computing Perspective to Approximate Bayesian Computation (2021) not zbMATH
- Fernandez-Basso, Carlos; Ruiz, M. Dolores; Martin-Bautista, Maria J.: Spark solutions for discovering fuzzy association rules in big data (2021)
- Gong, Chaoyu; Su, Zhi-gang; Wang, Pei-hong; Wang, Qian; You, Yang: Evidential instance selection for K-nearest neighbor classification of big data (2021)
- Kappelman, Ashton Conrad; Sinha, Ashesh Kumar: Optimal control in dynamic food supply chains using big data (2021)
- Maté, Carlos G.: Combining interval time series forecasts. A first step in a long way (research agenda) (2021)
- Młodak, Andrzej: k-means, Ward and probabilistic distance-based clustering methods with contiguity constraint (2021)
- Tayarani N., Mohammad-H.: Applications of artificial intelligence in battling against COVID-19: a literature review (2021)
- Zhu, Xuening; Li, Feng; Wang, Hansheng: Least-square approximation for a distributed system (2021)
- Anil, Robin; Capan, Gokhan; Drost-Fromm, Isabel; Dunning, Ted; Friedman, Ellen; Grant, Trevor; Quinn, Shannon; Ranjan, Paritosh; Schelter, Sebastian; Yılmazel, Özgür: Apache Mahout: machine learning on distributed dataflow systems (2020)
- Feng, Jun; Yang, Laurence T.; Gati, Nicholaus J.; Xie, Xia; Gavuna, Benard S.: Privacy-preserving computation in cyber-physical-social systems: a survey of the state-of-the-art and perspectives (2020)
- Kalina, Jan; Vidnerová, Petra: Regression neural networks with a highly robust loss function (2020)