Apache Spark

Apache Spark is a fast, general-purpose cluster computing system for big data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools, including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing.


References in zbMATH (referenced in 44 articles)

Showing results 1 to 20 of 44.
Sorted by year.

  1. Soshnikov, Dmitry; Valieva, Yana: mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing (2021) arXiv
  2. Dong, Bin; Wu, Kesheng; Byna, Suren: User-defined tensor data analysis (to appear) (2021)
  3. Kappelman, Ashton Conrad; Sinha, Ashesh Kumar: Optimal control in dynamic food supply chains using big data (2021)
  4. Maté, Carlos G.: Combining interval time series forecasts. A first step in a long way (research agenda) (2021)
  5. Ahmadi, Mohsen; Abadi, Moein Qaisari Hasan: A review of using object-orientation properties of C++ for designing expert system in strategic planning (2020)
  6. Anil, Robin; Capan, Gokhan; Drost-Fromm, Isabel; Dunning, Ted; Friedman, Ellen; Grant, Trevor; Quinn, Shannon; Ranjan, Paritosh; Schelter, Sebastian; Yılmazel, Özgür: Apache Mahout: machine learning on distributed dataflow systems (2020)
  7. Feng, Jun; Yang, Laurence T.; Gati, Nicholaus J.; Xie, Xia; Gavuna, Benard S.: Privacy-preserving computation in cyber-physical-social systems: a survey of the state-of-the-art and perspectives (2020)
  8. Kalina, Jan; Vidnerová, Petra: Regression neural networks with a highly robust loss function (2020)
  9. Ketsman, Bas; Albarghouthi, Aws; Koutris, Paraschos: Distribution policies for Datalog (2020)
  10. Lu, Haihao; Mazumder, Rahul: Randomized gradient boosting machine (2020)
  11. Ren, Pengfei; Dai, Hao; Chen, Weisheng: Distributed cooperative learning over time-varying random networks using a gossip-based communication protocol (2020)
  12. Salehi, Abbas; Masoumi, Behrooz: KATZ centrality with biogeography-based optimization for influence maximization problem (2020)
  13. Sambasivan, Rajiv; Das, Sourish; Sahu, Sujit K.: A Bayesian perspective of statistical machine learning for big data (2020)
  14. Yuan, Xiao-Tong; Li, Ping: On convergence of distributed approximate Newton methods: globalization, sharper bounds and beyond (2020)
  15. Liu, Heng; Ditzler, Gregory: A semi-parallel framework for greedy information-theoretic feature selection (2019)
  16. Pan, Xianli; Xu, Yitian: A safe reinforced feature screening strategy for Lasso based on feasible solutions (2019)
  17. Raissi, Maziar; Babaee, Hessam; Karniadakis, George Em: Parametric Gaussian process regression for big data (2019)
  18. Rodrigo, Enrique G.; Aledo, Juan A.; Gámez, José A.: spark-crowd: a spark package for learning from crowdsourced big data (2019)
  19. Rompf, Tiark; Amin, Nada: A SQL to C compiler in 500 lines of code (2019)
  20. Roy, Asim; Qureshi, Shiban; Pande, Kartikeya; Nair, Divitha; Gairola, Kartik; Jain, Pooja; Singh, Suraj; Sharma, Kirti; Jagadale, Akshay; Lin, Yi-Yang; Sharma, Shashank; Gotety, Ramya; Zhang, Yuexin; Tang, Ji; Mehta, Tejas; Sindhanuru, Hemanth; Okafor, Nonso; Das, Santak; Gopal, Chidambara N.; Rudraraju, Srinivasa B.; Kakarlapudi, Avinash V.: Performance comparison of machine learning platforms (2019)