ProMiner: Scientific publications in abstract databases, full-text journals, and patents are the main and most up-to-date source of information, but in most life science areas the volume of text is overwhelming. Recognition of life science terminology is a key prerequisite for automatic information retrieval and information extraction. Huge and complex terminologies with many synonymous expressions, ambiguous terms, and a steady stream of newly coined names and classes make named entity recognition a real challenge. ProMiner is a tool for terminology recognition that addresses several fundamental issues in named entity recognition in the life sciences: handling of voluminous dictionaries, complex thesauri, and large controlled vocabularies derived from ontologies; regularly updated dictionaries, maintained through automatic curation followed by manual evaluation; mapping of synonyms to reference names and data sources; context-dependent disambiguation of biomedical terms and resolution of acronyms; specific handling of synonyms that are common English words; recognition of spelling variants of expressions in the source dictionary; high-speed tagging and a parallel workflow for multiple dictionaries; incorporation of regular expressions (e.g. for the recognition of SNP rs numbers); full-text annotation in XML, HTML, or PDF format; and patent annotation.
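
To make two of the listed features concrete (mapping of synonyms to reference names, and regular expressions for SNP rs numbers), here is a minimal Python sketch of dictionary-based tagging. It is not ProMiner's implementation or API: the toy dictionary `SYNONYMS`, the functions `build_dictionary_pattern` and `tag`, and the `"SNP"` label are all hypothetical names chosen for illustration, and the sketch performs no disambiguation, acronym resolution, or spelling-variant handling.

```python
import re

# Hypothetical toy dictionary: lower-cased synonym -> reference name.
# Illustrative only; not taken from any ProMiner dictionary.
SYNONYMS = {
    "il-2": "IL2",
    "interleukin 2": "IL2",
    "interleukin-2": "IL2",
    "tp53": "TP53",
    "p53": "TP53",
}

# SNP rs numbers: the prefix "rs" followed by digits.
RS_PATTERN = re.compile(r"\brs\d+\b")


def build_dictionary_pattern(synonyms):
    """Compile one case-insensitive pattern matching any synonym as a whole token."""
    # Longest synonyms first, so longer names win over shorter alternatives.
    ordered = sorted(synonyms, key=len, reverse=True)
    alternation = "|".join(re.escape(s) for s in ordered)
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


def tag(text, synonyms=SYNONYMS):
    """Return sorted (start, end, matched_text, reference_name) tuples."""
    pattern = build_dictionary_pattern(synonyms)
    hits = []
    for m in pattern.finditer(text):
        # Map the matched synonym back to its reference name.
        hits.append((m.start(), m.end(), m.group(0), synonyms[m.group(0).lower()]))
    for m in RS_PATTERN.finditer(text):
        hits.append((m.start(), m.end(), m.group(0), "SNP"))
    return sorted(hits)
```

Calling `tag("Interleukin-2 and p53 were linked to rs12345.")` yields one hit per recognized mention, each carrying the reference name its synonym maps to. A single compiled alternation keeps the sketch short; a dictionary of realistic size would more plausibly call for a trie or similar index.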

References in zbMATH (referenced in 5 articles)

  1. Griebel, Michael (ed.); Schüller, Anton (ed.); Schweitzer, Marc Alexander (ed.): Scientific computing and algorithms in industrial simulations. Projects and products of Fraunhofer SCAI (2017)
  2. Kolárik, Corinna; Klinger, Roman; Hofmann-Apitius, Martin: Identification of histone modifications in biomedical text for supporting epigenomic research (2009)
  3. Wiegers, Thomas C.; Davis, Allan Peter; Cohen, K. Bretonnel; Hirschman, Lynette; Mattingly, Carolyn J.: Text mining and manual curation of chemical-gene-disease networks for the comparative toxicogenomics database (CTD) (2009)
  4. Yeniterzi, Süveyda; Sezerman, Osman Ugur: Enzyminer: automatic identification of protein level mutations and their impact on target enzymes from pubmed abstracts (2009)
  5. Kim, Jin-Dong; Ohta, Tomoko; Tsujii, Jun’ichi: Corpus annotation for mining biomedical events from literature (2008)