SLIQ: A fast scalable classifier for data mining.

Classification is an important problem in the emerging field of data mining. Although classification has been studied extensively in the past, most classification algorithms are designed only for memory-resident data, which limits their suitability for mining large data sets. This paper discusses issues in building a scalable classifier and presents the design of SLIQ, a new classifier. SLIQ is a decision tree classifier that can handle both numeric and categorical attributes. It uses a novel pre-sorting technique in the tree-growth phase. This sorting procedure is integrated with a breadth-first tree-growing strategy to enable classification of disk-resident data sets. SLIQ also uses a new tree-pruning algorithm that is inexpensive and results in compact and accurate trees. The combination of these techniques enables SLIQ to scale to large data sets and to classify data sets irrespective of the number of classes, attributes, and examples (records), thus making it an attractive tool for data mining.
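
The pre-sorting idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`presort`, `best_split`, `gini`), the use of the Gini index as the split criterion, and the in-memory lists are assumptions made for illustration. The sketch sorts one numeric attribute a single time and then finds the best binary split for a single node in one linear scan; the breadth-first growth over all leaves and the disk-resident data handling described in the abstract are not modeled.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity of a class histogram (assumed split criterion)."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def presort(values):
    """Sort one numeric attribute once, keeping each record's index."""
    return sorted((v, i) for i, v in enumerate(values))


def best_split(sorted_attr, labels):
    """Scan a pre-sorted attribute once; return (threshold, weighted_gini).

    Class histograms for the two sides of the split are updated
    incrementally, so no re-sorting is needed during the scan.
    """
    n = len(sorted_attr)
    below = Counter()          # classes of records with value <= candidate
    above = Counter(labels)    # classes of the remaining records
    best = (None, float("inf"))
    for k in range(n - 1):
        value, idx = sorted_attr[k]
        below[labels[idx]] += 1
        above[labels[idx]] -= 1
        next_value = sorted_attr[k + 1][0]
        if value == next_value:
            continue           # no threshold fits between equal values
        n_below = k + 1
        n_above = n - n_below
        score = (n_below * gini(below, n_below)
                 + n_above * gini(above, n_above)) / n
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best
```

In this sketch, `presort` would be called once per numeric attribute before tree growth starts, and `best_split` would then reuse the sorted list for split evaluation instead of sorting again.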

References in zbMATH (referenced in 42 articles)

  1. Rokach, Lior; Maimon, Oded: Data mining with decision trees. Theory and applications. (2015)
  2. Baralis, Elena; Cagliero, Luca; Cerquitelli, Tania; D’Elia, Vincenzo; Garza, Paolo: Expressive generalized itemsets (2014)
  3. Khoshgoftaar, Taghi M.; Xiao, Yudong; Gao, Kehan: Software quality assessment using a multi-strategy classifier (2014)
  4. Nasridinov, Aziz; Lee, Yangsun; Park, Young-Ho: Decision tree construction on GPU: ubiquitous parallel computing approach (2014)
  5. Stojanova, Daniela; Ceci, Michelangelo; Appice, Annalisa; Džeroski, Sašo: Network regression with predictive clustering trees (2012)
  6. Vreeken, Jilles; Van Leeuwen, Matthijs; Siebes, Arno: Krimp: mining itemsets that compress (2011)
  7. Bifet, Albert: Adaptive stream mining: Pattern learning and mining from evolving data streams. (2010)
  8. Chandra, B.; Kothari, Ravi; Paul, Pallath: A new node splitting measure for decision tree construction (2010)
  9. Chandra, B.; Varghese, P. Paul: Moving towards efficient decision tree construction (2009)
  10. Glimcher, Leonid; Jin, Ruoming; Agrawal, Gagan: Middleware for data mining applications on clusters and grids (2008)
  11. Gonzáles-Aranda, P.; Menasalvas, E.; Millán, S.; Ruiz, Carlos; Segovia, J.: Towards a methodology for data mining project development: the importance of abstraction (2008)
  12. Hu, Hui-Ling; Chen, Yen-Liang: Mining typical patterns from databases (2008)
  13. Castro, José; Secretan, Jimmy; Georgiopoulos, Michael; DeMara, Ronald; Anagnostopoulos, Georgios; Gonzalez, Avelino: Pipelining of Fuzzy ARTMAP without matchtracking: Correctness, performance bound, and Beowulf evaluation (2007)
  14. Faloutsos, Christos; Megalooikonomou, Vasileios: On data mining, compression, and Kolmogorov complexity (2007)
  15. Yen, Ester; Chu, I-Wen Mike: Relaxing instance boundaries for the search of splitting points of numerical attributes in classification trees (2007)
  16. Kooptiwoot, S.; Salam, M. A.: IUI mining: human expert guidance of information theoretic network approach (2006)
  17. Nguyen, Hung Son: Approximate Boolean reasoning: Foundations and applications in data mining (2006)
  18. Wu, Xintao: Incorporating large unlabeled data to enhance EM classification (2006)
  19. Zaki, Mohammed J.; Aggarwal, Charu C.: XRules: An effective algorithm for structural classification of XML data (2006)
