Frequently Asked Questions

Which tagger should I use?

A part-of-speech tagger consists of an algorithm (e.g., the Viterbi algorithm) applied on top of a model (e.g., a Hidden Markov Model), which is typically constructed from a corpus (e.g., the Penn Treebank). The choice of algorithm, model, and training corpus therefore determines the applicability and performance of a tagger. Currently, TerMine supports GENIA tagger and TreeTagger. Trained on the GENIA, PennBioIE, and Wall Street Journal corpora, GENIA tagger is suitable for processing biomedical text. TreeTagger is suitable for general English text such as newspaper articles.
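
To make the algorithm/model split concrete, here is a minimal sketch of Viterbi decoding over a toy Hidden Markov Model. It is not the implementation used by GENIA tagger, TreeTagger, or TerMine. The tag set, all probabilities, and the UNSEEN fallback value are hypothetical, hand-picked numbers; a real tagger would estimate them from a training corpus.

```python
import math

# Toy model: every number below is hypothetical and hand-picked,
# standing in for probabilities a real tagger would learn from a corpus.
TAGS = ["DT", "NN", "VB"]
START = {"DT": 0.6, "NN": 0.3, "VB": 0.1}
TRANS = {
    "DT": {"DT": 0.05, "NN": 0.9, "VB": 0.05},
    "NN": {"DT": 0.1, "NN": 0.3, "VB": 0.6},
    "VB": {"DT": 0.5, "NN": 0.4, "VB": 0.1},
}
EMIT = {
    "DT": {"the": 0.9, "a": 0.1},
    "NN": {"dog": 0.5, "cat": 0.4, "barks": 0.1},
    "VB": {"barks": 0.7, "runs": 0.25, "dog": 0.05},
}
UNSEEN = 1e-6  # assumed fallback probability for words a tag never emitted


def viterbi(words):
    """Return the most probable tag sequence for a list of lowercase words."""
    if not words:
        return []

    def emit(tag, word):
        return math.log(EMIT[tag].get(word, UNSEEN))

    # scores[i][tag] is the best log-probability of any path ending in tag at word i.
    scores = [{t: math.log(START[t]) + emit(t, words[0]) for t in TAGS}]
    back = []  # back[i][tag] is the best previous tag for tag at word i + 1

    for word in words[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for t in TAGS:
            best_prev = max(TAGS, key=lambda p: prev[p] + math.log(TRANS[p][t]))
            cur[t] = prev[best_prev] + math.log(TRANS[best_prev][t]) + emit(t, word)
            ptr[t] = best_prev
        scores.append(cur)
        back.append(ptr)

    # Follow the back-pointers from the best final tag to recover the path.
    tag = max(TAGS, key=lambda t: scores[-1][t])
    path = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        path.append(tag)
    path.reverse()
    return path


print(viterbi(["the", "dog", "barks"]))
```

Replacing the three probability tables with ones estimated from a different corpus changes the output without touching the decoding code, which is why the training corpus matters when choosing a tagger.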