MODL: A Bayes optimal discretization method for continuous attributes. While real data often comes in mixed format, discrete and continuous, many supervised induction algorithms require discrete data. Efficient discretization of continuous attributes is an important problem that affects the speed, accuracy and understandability of the induction models. In this paper, we propose a new discretization method, MODL, founded on a Bayesian approach. We introduce a space of discretization models and a prior distribution defined on this model space. This results in the definition of a Bayes optimal evaluation criterion for discretizations. We then propose a new super-linear optimization algorithm that finds near-optimal discretizations. Extensive comparative experiments on both real and synthetic data demonstrate the high inductive performance obtained by the new discretization method.
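The abstract does not state the evaluation criterion itself. As a rough illustration only, the sketch below assumes the form in which the MODL criterion is usually quoted: a cost made of a prior over the number of intervals, the interval bounds and the per-interval class distributions, plus the likelihood of the observed labels. The function names (`modl_cost`, `greedy_merge`) are hypothetical, and the naive greedy merge is a simplified stand-in, not the super-linear optimization algorithm of the paper.

```python
from math import lgamma, log


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def modl_cost(intervals):
    """Cost of a discretization (lower is better), under the assumed criterion.

    `intervals` is a list of per-interval class counts, e.g. [[5, 0], [1, 4]]
    for two intervals and two classes.
    """
    n = sum(sum(counts) for counts in intervals)  # number of instances
    num_intervals = len(intervals)
    num_classes = len(intervals[0])
    # Prior terms: choice of the number of intervals and of their bounds.
    cost = log(n) + log_binom(n + num_intervals - 1, num_intervals - 1)
    for counts in intervals:
        n_i = sum(counts)
        # Prior term: choice of the class distribution in this interval.
        cost += log_binom(n_i + num_classes - 1, num_classes - 1)
        # Likelihood term: log multinomial coefficient of the class counts.
        cost += lgamma(n_i + 1) - sum(lgamma(c + 1) for c in counts)
    return cost


def greedy_merge(intervals):
    """Naive bottom-up search: apply the best adjacent merge while it lowers the cost."""
    intervals = [list(counts) for counts in intervals]
    best = modl_cost(intervals)
    while len(intervals) > 1:
        candidates = []
        for i in range(len(intervals) - 1):
            merged = [a + b for a, b in zip(intervals[i], intervals[i + 1])]
            trial = intervals[:i] + [merged] + intervals[i + 2:]
            candidates.append((modl_cost(trial), trial))
        cost, trial = min(candidates, key=lambda pair: pair[0])
        if cost >= best:  # no merge improves the criterion, so stop
            break
        best, intervals = cost, trial
    return intervals, best
```

Under these assumptions, ten instances of one class followed by ten of another, each starting in its own elementary interval, are merged down to two pure intervals, and the search stops there because merging them into one mixed interval would raise the cost.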
References in zbMATH (referenced in 10 articles)
- Boullé, Marc; Charnay, Clément; Lachiche, Nicolas: A scalable robust and automatic propositionalization approach for Bayesian classification of large mixed numerical and categorical data (2019)
- Franc, Vojtech; Fikar, Ondrej; Bartos, Karel; Sofka, Michal: Learning data discretization via convex optimization (2018)
- Sang, Yu; Qi, Heng; Li, Keqiu; Jin, Yingwei; Yan, Deqin; Gao, Shusheng: An effective discretization method for disposing high-dimensional data (2014)
- Zeinalkhani, Mohsen; Eftekhari, Mahdi: Fuzzy partitioning of continuous attributes through discretization methods to construct fuzzy decision tree classifiers (2014)
- Wu, ChienHsing; Kao, Shu-Chen; Okuhara, Koji: Examination and comparison of conflicting data in granulated datasets: equal width interval vs. equal frequency interval (2013)
- Bondu, Alexis; Boullé, Marc; Lemaire, Vincent: A non-parametric semi-supervised discretization method (2010)
- Boullé, Marc: Optimum simultaneous discretization with data grid models in supervised classification: a Bayesian model selection approach (2009)
- Jin, Ruoming; Breitbart, Yuri; Muoh, Chibuike: Data discretization unification (2009)
- Boullé, Marc: MODL: A Bayes optimal discretization method for continuous attributes (2006)