SPRINT: a scalable parallel classifier for data mining. Classification is an important data mining problem. Although classification is a well-studied problem, most current classification algorithms require that all or a portion of the entire dataset remain permanently in memory. This limits their suitability for mining over large databases. We present a new decision-tree-based classification algorithm, called SPRINT, that removes all of the memory restrictions and is fast and scalable. The algorithm has also been designed to be easily parallelized, allowing many processors to work together to build a single consistent model. This parallelization, also presented here, exhibits excellent scalability as well. The combination of these characteristics makes the proposed algorithm an ideal tool for data mining.

References in zbMATH (referenced in 36 articles)


  1. Wang, Ran; He, Yu-Lin; Chow, Chi-Yin; Ou, Fang-Fang; Zhang, Jian: Learning ELM-tree from big data based on uncertainty reduction (2015)
  2. De Stefano, C.; Folino, G.; Fontanella, F.; Scotto di Freca, A.: Using Bayesian networks for selecting classifiers in GP ensembles (2014)
  3. Gama, João; Žliobaitė, Indrė; Bifet, Albert; Pechenizkiy, Mykola; Bouchachia, Abdelhamid: A survey on concept drift adaptation (2014)
  4. Khoshgoftaar, Taghi M.; Xiao, Yudong; Gao, Kehan: Software quality assessment using a multi-strategy classifier (2014)
  5. Nasridinov, Aziz; Lee, Yangsun; Park, Young-Ho: Decision tree construction on GPU: ubiquitous parallel computing approach (2014)
  6. Bifet, Albert: Adaptive stream mining: Pattern learning and mining from evolving data streams (2010)
  7. Kwiatkowski, Piotr; Nguyen, Sinh Hoa; Nguyen, Hung Son: On scalability of rough set methods (2010)
  8. Shih, Wen-Chung; Yang, Chao-Tung; Tseng, Shian-Shyong: Performance-based data distribution for data mining applications on grid computing environments (2010)
  9. Chandra, B.; Varghese, P. Paul: Moving towards efficient decision tree construction (2009)
  10. Elnaffar, Said; Martin, Pat; Schiefer, Berni; Lightstone, Sam: Is it DSS or OLTP: Automatically identifying DBMS workloads (2008)
  11. Glimcher, Leonid; Jin, Ruoming; Agrawal, Gagan: Middleware for data mining applications on clusters and grids (2008)
  12. Hu, Hui-Ling; Chen, Yen-Liang: Mining typical patterns from databases (2008)
  13. Castro, José; Secretan, Jimmy; Georgiopoulos, Michael; DeMara, Ronald; Anagnostopoulos, Georgios; Gonzalez, Avelino: Pipelining of Fuzzy ARTMAP without matchtracking: Correctness, performance bound, and Beowulf evaluation (2007)
  14. Osei-Bryson, Kweku-Muata: Post-pruning in decision tree induction using multiple performance measures (2007)
  15. Yen, Ester; Chu, I-Wen Mike: Relaxing instance boundaries for the search of splitting points of numerical attributes in classification trees (2007)
  16. Nguyen, Hung Son: Approximate Boolean reasoning: Foundations and applications in data mining (2006)
  17. Wu, Xintao: Incorporating large unlabeled data to enhance EM classification (2006)
  18. Zaki, Mohammed J.; Aggarwal, Charu C.: XRules: An effective algorithm for structural classification of XML data (2006)
  19. Agrawal, Rakesh; Gehrke, Johannes; Gunopulos, Dimitrios; Raghavan, Prabhakar: Automatic subspace clustering of high dimensional data (2005)
