Text Mining Resources
Evaluation
Text mining systems are usually evaluated by comparing their performance
on a common task using common data sets. Some of the best-known evaluation events focus
on the biological domain:
- BioCreative
- An evaluation event for the text mining community, focused on applications in biology
- BioNLP Shared Tasks
- The BioNLP Shared Task (BioNLP-ST) series represents a community-wide trend in text mining for biology toward fine-grained information extraction (IE). The two previous events, BioNLP-ST 2009 and 2011, attracted wide attention, with over 30 teams submitting final results. The tasks and their data have since served as the basis of numerous studies, released event extraction systems, and published datasets. The upcoming BioNLP-ST 2013 follows the general outline and goals of the previous tasks. It identifies biologically relevant extraction targets and proposes a linguistically motivated approach to event representation. The tasks in BioNLP-ST 2013 cover many new topics in biology that are close to biologists' needs. BioNLP-ST 2013 broadens the scope of text mining application domains in biology by introducing new tasks on cancer genetics and pathway curation. It also builds on the well-known earlier datasets GENIA, LLL/BI and BB to propose tasks that are more realistic than those considered previously and closer to the actual needs of biological data integration.
- TREC Genomics Track
- TREC is an annual conference focusing
on the evaluation of information retrieval systems. The Genomics track specialises
in systems that help users acquire genomics-related knowledge.