TnT -- Statistical Part-of-Speech Tagging

TnT, short for Trigrams’n’Tags, is an efficient statistical part-of-speech tagger that can be trained on different languages and on virtually any tagset. Its parameter-generation component trains on tagged corpora, and the system incorporates several methods for smoothing and for handling unknown words. TnT is not optimized for a particular language; instead, it is optimized for training on a large variety of corpora, so adapting the tagger to a new language, domain, or tagset is easy. TnT is also optimized for speed. The tagger implements the Viterbi algorithm for second-order Markov models. The main smoothing paradigm is linear interpolation, with the respective weights determined by deleted interpolation. Unknown words are handled by a suffix trie and successive abstraction.
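
The smoothing step mentioned above can be illustrated with a small sketch. The code below is not taken from TnT; it shows one common formulation of deleted interpolation for a trigram tag model, in which each observed trigram "votes" for the unigram, bigram, or trigram estimate that predicts it best once that trigram's own occurrence is removed from the counts. All function names (`count_ngrams`, `deleted_interpolation`, `interpolated_prob`) and the toy tag sequences are hypothetical and chosen for illustration only.

```python
from collections import Counter


def count_ngrams(tag_sequences):
    """Count tag unigrams, bigrams and trigrams over a list of tag sequences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for tags in tag_sequences:
        for i, tag in enumerate(tags):
            uni[tag] += 1
            if i >= 1:
                bi[(tags[i - 1], tag)] += 1
            if i >= 2:
                tri[(tags[i - 2], tags[i - 1], tag)] += 1
    return uni, bi, tri


def _ratio(numerator, denominator):
    """Relative frequency, defined as 0 when the denominator is not positive."""
    return numerator / denominator if denominator > 0 else 0.0


def deleted_interpolation(uni, bi, tri):
    """Estimate (lambda1, lambda2, lambda3) for unigram, bigram, trigram terms.

    Assumed formulation: for every observed trigram, subtract one occurrence
    from the relevant counts, compare the three resulting estimates, and add
    the trigram's frequency to the weight of the best one. Ties go to the
    lower-order model. The weights are normalized to sum to 1.
    """
    n = sum(uni.values())
    weights = [0.0, 0.0, 0.0]
    for (t1, t2, t3), freq in tri.items():
        candidates = [
            _ratio(uni[t3] - 1, n - 1),             # unigram estimate
            _ratio(bi[(t2, t3)] - 1, uni[t2] - 1),  # bigram estimate
            _ratio(freq - 1, bi[(t1, t2)] - 1),     # trigram estimate
        ]
        best = max(range(3), key=lambda k: candidates[k])
        weights[best] += freq
    total = sum(weights)
    if total == 0:
        # No trigrams observed: fall back to uniform weights.
        return (1 / 3, 1 / 3, 1 / 3)
    return tuple(w / total for w in weights)


def interpolated_prob(t1, t2, t3, uni, bi, tri, lambdas):
    """Linearly interpolated estimate of P(t3 | t1, t2)."""
    n = sum(uni.values())
    l1, l2, l3 = lambdas
    p_uni = _ratio(uni[t3], n)
    p_bi = _ratio(bi[(t2, t3)], uni[t2])
    p_tri = _ratio(tri[(t1, t2, t3)], bi[(t1, t2)])
    return l1 * p_uni + l2 * p_bi + l3 * p_tri


if __name__ == "__main__":
    # Toy tagged data (tags only); real training would use a tagged corpus.
    sequences = [
        ["DT", "NN", "VB", "DT", "NN"],
        ["DT", "JJ", "NN", "VB"],
    ]
    uni, bi, tri = count_ngrams(sequences)
    lambdas = deleted_interpolation(uni, bi, tri)
    print("lambdas:", lambdas)
    print("P(VB | DT, NN):",
          interpolated_prob("DT", "NN", "VB", uni, bi, tri, lambdas))
```

The weights here are context-independent: a single triple is estimated for the whole model rather than one per tag history. That keeps the estimate robust on small corpora, at the cost of not adapting to histories that are well attested.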

References in zbMATH (referenced in 16 articles)

  1. Silva, Ana Paula; Silva, Arlindo; Rodrigues, Irene: An approach to the POS tagging problem using genetic algorithms (2014)
  2. Kornai, András: Probabilistic grammars and languages (2011)
  3. Ponzetto, Simone Paolo; Strube, Michael: Taxonomy induction based on a collaboratively built knowledge repository (2011)
  4. Rupnik, Jan; Grčar, Miha; Erjavec, Tomaž: Improving morphosyntactic tagging of Slovene language through meta-tagging (2010)
  5. Agić, Željko; Dovedan, Zdravko; Tadić, Marko: Improving part-of-speech tagging accuracy for Croatian by morphological analysis (2009)
  6. Biemann, Chris: Unsupervised part-of-speech tagging in the large (2009)
  7. Carl, Michael; Melero, Maite; Badia, Toni; Vandeghinste, Vincent; Dirix, Peter; Schuurman, Ineke; Markantonatou, Stella; Sofianopoulos, Sokratis; Vassiliou, Marina; Yannoutsou, Olga: METIS-II: Low resource machine translation (2008)
  8. Ramakrishnan, Ganesh; Joshi, Sachindra; Balakrishnan, Sreeram; Srinivasan, Ashwin: Using ILP to construct features for information extraction from semi-structured text (2008)
  9. Saquete, E.; Ferrández, O.; Ferrández, S.; Martínez-Barco, P.; Muñoz, R.: Combining automatic acquisition of knowledge with machine learning approaches for multilingual temporal recognition and normalization (2008)
  10. Filippova, Katja; Strube, Michael: The German Vorfeld and local coherence (2007)
  11. Alba, Enrique; Luque, Gabriel; Araujo, Lourdes: Natural language tagging with genetic algorithms (2006)
  12. Crego, Josep Maria; Mariño, José B.: Improving statistical MT by coupling reordering and decoding (2006)
  13. Shen, Hong; Sarkar, Anoop: Voting between multiple data representations for text chunking (2005)
  14. Cohen, K. Bretonnel; Hunter, Lawrence: Natural language processing and systems biology (2004)
  15. Tufiş, Dan; Barbu, Ana Maria: Revealing translators’ knowledge: Statistical methods in constructing practical translation lexicons for language and speech processing (2002)
  16. Brants, Thorsten: Estimating hidden Markov model topologies (1998)