SPRINT: a scalable parallel classifier for data mining. Classification is an important data mining problem. Although classification is a well-studied problem, most of the current classification algorithms require that all or a portion of the entire dataset remain permanently in memory. This limits their suitability for mining over large databases. We present a new decision-tree-based classification algorithm, called SPRINT, that removes all of the memory restrictions and is fast and scalable. The algorithm has also been designed to be easily parallelized, allowing many processors to work together to build a single consistent model. This parallelization, also presented here, exhibits excellent scalability as well. The combination of these characteristics makes the proposed algorithm an ideal tool for data mining.

References in zbMATH (referenced in 38 articles)


  1. Hassani, Hossein; Huang, Xu; Silva, Emmanuel S.; Ghodsi, Mansi: A review of data mining applications in crime (2016)
  2. Wang, Ran; He, Yu-Lin; Chow, Chi-Yin; Ou, Fang-Fang; Zhang, Jian: Learning ELM-tree from big data based on uncertainty reduction (2015)
  3. De Stefano, C.; Folino, G.; Fontanella, F.; Scotto di Freca, A.: Using Bayesian networks for selecting classifiers in GP ensembles (2014)
  4. Gama, João; Žliobaitė, Indrė; Bifet, Albert; Pechenizkiy, Mykola; Bouchachia, Abdelhamid: A survey on concept drift adaptation (2014)
  5. Khoshgoftaar, Taghi M.; Xiao, Yudong; Gao, Kehan: Software quality assessment using a multi-strategy classifier (2014)
  6. Nasridinov, Aziz; Lee, Yangsun; Park, Young-Ho: Decision tree construction on GPU: ubiquitous parallel computing approach (2014)
  7. Bifet, Albert: Adaptive stream mining: Pattern learning and mining from evolving data streams (2010)
  8. Kwiatkowski, Piotr; Nguyen, Sinh Hoa; Nguyen, Hung Son: On scalability of rough set methods (2010)
  9. Shih, Wen-Chung; Yang, Chao-Tung; Tseng, Shian-Shyong: Performance-based data distribution for data mining applications on grid computing environments (2010)
  10. Chandra, B.; Varghese, P. Paul: Moving towards efficient decision tree construction (2009)
  11. Popova, E. A.: Method for parallel construction of a committee of decision tree for processing the electroencephalography signals (2009)
  12. Elnaffar, Said; Martin, Pat; Schiefer, Berni; Lightstone, Sam: Is it DSS or OLTP: Automatically identifying DBMS workloads (2008)
  13. Glimcher, Leonid; Jin, Ruoming; Agrawal, Gagan: Middleware for data mining applications on clusters and grids (2008)
  14. Hu, Hui-Ling; Chen, Yen-Liang: Mining typical patterns from databases (2008)
  15. Castro, José; Secretan, Jimmy; Georgiopoulos, Michael; DeMara, Ronald; Anagnostopoulos, Georgios; Gonzalez, Avelino: Pipelining of Fuzzy ARTMAP without matchtracking: Correctness, performance bound, and Beowulf evaluation (2007)
  16. Osei-Bryson, Kweku-Muata: Post-pruning in decision tree induction using multiple performance measures (2007)
  17. Yen, Ester; Chu, I-Wen Mike: Relaxing instance boundaries for the search of splitting points of numerical attributes in classification trees (2007)
  18. Nguyen, Hung Son: Approximate Boolean reasoning: Foundations and applications in data mining (2006)
  19. Wu, Xintao: Incorporating large unlabeled data to enhance EM classification (2006)
