SLIQ: A fast scalable classifier for data mining.

Classification is an important problem in the emerging field of data mining. Although classification has been studied extensively in the past, most classification algorithms are designed only for memory-resident data, which limits their suitability for mining large data sets. This paper discusses issues in building a scalable classifier and presents the design of SLIQ, a new classifier. SLIQ is a decision tree classifier that can handle both numeric and categorical attributes. It uses a novel pre-sorting technique in the tree-growth phase. This sorting procedure is integrated with a breadth-first tree-growing strategy to enable classification of disk-resident data sets. SLIQ also uses a new tree-pruning algorithm that is inexpensive and results in compact and accurate trees. The combination of these techniques enables SLIQ to scale to large data sets and to classify data sets irrespective of the number of classes, attributes, and examples (records), thus making it an attractive tool for data mining.
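The pre-sorting idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a gini-based split criterion and an in-memory layout of one sorted attribute list plus a shared class list; the abstract itself does not specify either, and the names `presort`, `best_split`, and `gini` are hypothetical. It is not SLIQ's actual implementation, only an illustration of why sorting each numeric attribute once allows every candidate threshold to be scored in a single scan.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity of a class histogram holding `total` records."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def presort(values):
    """Sort one numeric attribute once, keeping each record's index."""
    return sorted((v, i) for i, v in enumerate(values))


def best_split(attribute_list, class_list):
    """Scan a pre-sorted attribute list once.

    Returns (threshold, weighted_gini) for the best binary split,
    or (None, inf) if no threshold separates the values.
    """
    n = len(attribute_list)
    below = Counter()            # classes of records at or below the candidate
    above = Counter(class_list)  # classes of the remaining records
    best = (None, float("inf"))
    for pos, (value, idx) in enumerate(attribute_list[:-1]):
        label = class_list[idx]
        below[label] += 1
        above[label] -= 1
        next_value = attribute_list[pos + 1][0]
        if value == next_value:
            continue             # equal values cannot be separated
        n_below = pos + 1
        n_above = n - n_below
        score = (n_below * gini(below, n_below)
                 + n_above * gini(above, n_above)) / n
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best


# Hypothetical toy data: one numeric attribute and a class label per record.
ages = [23, 17, 43, 68, 32, 20]
labels = ["high", "high", "high", "low", "low", "high"]
threshold, score = best_split(presort(ages), labels)
```

Because the sort happens once per attribute rather than once per tree node, the same sorted list can in principle be reused while the tree grows level by level, which is the connection to the breadth-first strategy the abstract describes.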

References in zbMATH (referenced in 46 articles)

Showing results 1 to 20 of 46.

  1. Altay, Ayca; Cinar, Didem: Fuzzy decision trees (2016)
  2. Rokach, Lior; Maimon, Oded: Data mining with decision trees. Theory and applications (2015)
  3. Baralis, Elena; Cagliero, Luca; Cerquitelli, Tania; D’Elia, Vincenzo; Garza, Paolo: Expressive generalized itemsets (2014)
  4. Gama, João; Žliobaitė, Indrė; Bifet, Albert; Pechenizkiy, Mykola; Bouchachia, Abdelhamid: A survey on concept drift adaptation (2014)
  5. Khoshgoftaar, Taghi M.; Xiao, Yudong; Gao, Kehan: Software quality assessment using a multi-strategy classifier (2014)
  6. Nasridinov, Aziz; Lee, Yangsun; Park, Young-Ho: Decision tree construction on GPU: ubiquitous parallel computing approach (2014)
  7. Stojanova, Daniela; Ceci, Michelangelo; Appice, Annalisa; Džeroski, Sašo: Network regression with predictive clustering trees (2012)
  8. Salehi-Moghaddami, Nima; Yazdi, Hadi Sadoghi; Poostchi, Hanieh: Correlation based splitting criterion in multi branch decision tree (2011)
  9. Vreeken, Jilles; Van Leeuwen, Matthijs; Siebes, Arno: Krimp: mining itemsets that compress (2011)
  10. Bifet, Albert: Adaptive stream mining: Pattern learning and mining from evolving data streams (2010)
  11. Chandra, B.; Kothari, Ravi; Paul, Pallath: A new node splitting measure for decision tree construction (2010)
  12. Chandra, B.; Varghese, P.Paul: Moving towards efficient decision tree construction (2009)
  13. Glimcher, Leonid; Jin, Ruoming; Agrawal, Gagan: Middleware for data mining applications on clusters and grids (2008)
  14. Gonzáles-Aranda, P.; Menasalvas, E.; Millán, S.; Ruiz, Carlos; Segovia, J.: Towards a methodology for data mining project development: the importance of abstraction (2008)
  15. Hu, Hui-Ling; Chen, Yen-Liang: Mining typical patterns from databases (2008)
  16. Castro, José; Secretan, Jimmy; Georgiopoulos, Michael; DeMara, Ronald; Anagnostopoulos, Georgios; Gonzalez, Avelino: Pipelining of Fuzzy ARTMAP without matchtracking: Correctness, performance bound, and Beowulf evaluation (2007)
  17. Faloutsos, Christos; Megalooikonomou, Vasileios: On data mining, compression, and Kolmogorov complexity (2007)
  18. Yen, Ester; Chu, I-Wen Mike: Relaxing instance boundaries for the search of splitting points of numerical attributes in classification trees (2007)
  19. Kooptiwoot, S.; Salam, M.A.: IUI mining: human expert guidance of information theoretic network approach (2006)
  20. Nguyen, Hung Son: Approximate Boolean reasoning: Foundations and applications in data mining (2006)
