MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. For document classification, MALLET provides efficient routines for converting text to "features", a wide variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance using several commonly used metrics. In addition to classification, MALLET includes tools for sequence tagging, for applications such as named-entity extraction from text. The sequence-tagging algorithms include Hidden Markov Models, Maximum Entropy Markov Models, and Conditional Random Fields. These methods are implemented in an extensible system for finite state transducers.
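
As a rough illustration of the classification workflow described above (text converted to features, then a Naïve Bayes classifier trained on them), here is a minimal self-contained Java sketch. It deliberately does not use MALLET's own classes: the class name `NaiveBayesSketch`, its methods, and the toy training sentences are all hypothetical, and the bag-of-words tokenization and add-one smoothing are assumptions chosen for brevity rather than a description of how MALLET implements these steps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; this is NOT MALLET's API. */
public class NaiveBayesSketch {
    // label -> (token -> count of that token in documents with this label)
    private final Map<String, Map<String, Integer>> tokenCounts = new HashMap<>();
    // label -> total number of tokens seen for this label
    private final Map<String, Integer> totalTokens = new HashMap<>();
    // label -> number of training documents with this label
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int numDocs = 0;

    /** Converts raw text to bag-of-words features: lowercase token -> count. */
    public static Map<String, Integer> toFeatures(String text) {
        Map<String, Integer> features = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                features.merge(token, 1, Integer::sum);
            }
        }
        return features;
    }

    /** Adds one labeled document to the training counts. */
    public void train(String label, String text) {
        numDocs++;
        docCounts.merge(label, 1, Integer::sum);
        Map<String, Integer> counts =
                tokenCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (Map.Entry<String, Integer> e : toFeatures(text).entrySet()) {
            counts.merge(e.getKey(), e.getValue(), Integer::sum);
            totalTokens.merge(label, e.getValue(), Integer::sum);
            vocabulary.add(e.getKey());
        }
    }

    /** Returns the label with the highest log-posterior (add-one smoothing). */
    public String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        Map<String, Integer> features = toFeatures(text);
        for (String label : docCounts.keySet()) {
            double score = Math.log(docCounts.get(label) / (double) numDocs);
            Map<String, Integer> counts = tokenCounts.get(label);
            double denom = totalTokens.getOrDefault(label, 0) + vocabulary.size();
            for (Map.Entry<String, Integer> e : features.entrySet()) {
                if (!vocabulary.contains(e.getKey())) {
                    continue; // ignore tokens never seen in training
                }
                int c = counts.getOrDefault(e.getKey(), 0);
                score += e.getValue() * Math.log((c + 1) / denom);
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train("sports", "the team won the match");
        nb.train("sports", "a great goal in the final match");
        nb.train("politics", "the senate passed the bill");
        nb.train("politics", "voters elected a new senate");
        System.out.println(nb.classify("who won the final goal"));
    }
}
```

In practice one would rely on the package's own feature-extraction and training routines; the sketch is only meant to make the "text to features, then classify" idea concrete.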

References in zbMATH (referenced in 15 articles)

  1. Lim, Kar Wai; Buntine, Wray: Bibliographic analysis on research publications using authors, categorical labels and the citation network (2016)
  2. Costa-jussà, Marta R.; Grivolla, Jens; Mellebeek, Bart; Benavent, Francesc; Codina, Joan; Banchs, Rafael E.: Using annotations on Mechanical Turk to perform supervised polarity classification of Spanish customer comments (2014)
  3. Elloumi, Mourad; Zomaya, Albert Y.: Biological knowledge discovery handbook. Preprocessing, mining and postprocessing of biological data (2014)
  4. Tellex, Stefanie; Thaker, Pratiksha; Joseph, Joshua; Roy, Nicholas: Learning perceptually grounded word meanings from unaligned parallel data (2014)
  5. Xu, Kaiquan; Liao, Stephen Shaoyi; Lau, Raymond Y.K.; Zhao, J. Leon: Effective active learning strategies for the use of large-margin classifiers in semantic annotation: an optimal parameter discovery perspective (2014)
  6. Das, Shubhomoy; Moore, Travis; Wong, Weng-Keen; Stumpf, Simone; Oberst, Ian; McIntosh, Kevin; Burnett, Margaret: End-user feature labeling: supervised and semi-supervised approaches based on locally-weighted logistic regression (2013)
  7. Zou, Jie; Le, Daniel; Thoma, George R.: Locating and parsing bibliographic references in HTML medical articles (2010)
  8. Biemann, Chris: Unsupervised part-of-speech tagging in the large (2009)
  9. Smith, M.; Giraud-Carrier, C.; Purser, N.: Implicit affinity networks and social capital (2009)
  10. Cesario, Eugenio; Folino, Francesco; Locane, Antonio; Manco, Giuseppe; Ortale, Riccardo: Boosting text segmentation via progressive classification (2008)
  11. Dietterich, Thomas G.; Hao, Guohua; Ashenfelter, Adam: Gradient tree boosting for training conditional random fields (2008)
  12. Rokach, Lior; Romano, Roni; Maimon, Oded: Negation recognition in medical narrative reports (2008)
  13. Michelson, Matthew; Knoblock, Craig A.: Unsupervised information extraction from unstructured, ungrammatical data sources on the World Wide Web (2007)
  14. Schneider, Norman Karl-Michael: Information extraction from calls for papers with conditional random fields and layout features (2007)
  15. Wei, Xing; Croft, Bruce; McCallum, Andrew: Table extraction for answer retrieval (2006)