NaCTeM

LREC Workshop on Building & Evaluating Resources for Bio Text Mining - 26 May 2012, Istanbul

2012-05-04

THIRD WORKSHOP ON BUILDING AND EVALUATING RESOURCES FOR BIOMEDICAL TEXT MINING (BioTxtM 2012)

Saturday 26th May 2012
organised in conjunction with LREC2012 (Lütfi Kirdar Istanbul Exhibition and Congress Centre, Turkey)

http://www.nactem.ac.uk/biotxtm2012/

Registration: http://confreg.elda.org/

Over the past decade, biomedical text mining has received a large amount of interest. Faced with the rapidly increasing volume of biomedical literature, domain experts have an ever-increasing need for tools that can help them locate and isolate relevant nuggets of information from this deluge in a timely and efficient manner. The response of the natural language processing community to such issues is clearly evidenced by the biomedical natural language processing workshops that have been held over the past 10 years, in conjunction with ACL or NAACL meetings, to report progress in the field, as well as by the founding of an ACL special interest group.

Biomedical text mining applications rely on high-quality resources. These include databases and ontologies (e.g., Biothesaurus, UMLS Metathesaurus, MeSH and the Gene Ontology) and dictionaries/computational lexicons (e.g., the BioLexicon and the UMLS SPECIALIST lexicon). Recent years have also seen a large increase in the number of freely available corpora (e.g., GENIA, GREC, AIMED, BioInfer, CRAFT, BioDRB) annotated with an expanding range of information types. These now include not only named entities and the simple relations that hold between them, but also more complex event structures and coreference, as well as discourse structure and higher-level information about how events are to be interpreted (e.g., as facts, analyses or speculations). Community shared tasks and challenges (e.g., JNLPBA, LLL05, Biocreative I/II/III, BioNLP’09, BioNLP 2011, i2b2) also normally involve the production of annotated corpora, on which the participating systems are trained and evaluated, and help to steer research efforts towards open research problems.

Following on from the success of two previous workshops, this workshop aims to bring together researchers who make use of biomedical text mining resources such as the above in their applications, or who are working on the development of new resources. The workshop will allow an assessment of the current state of the art of resources, and will provide a forum for the discussion of current problems, questions and open issues, which will be useful in guiding further research in this area. Such topics are highly relevant to META-NET (a Network of Excellence consisting of 54 research centres from 33 countries), which is dedicated to building the technological foundations of a multilingual European information society. META-NET aims to push forward research to allow a rapid expansion of language technologies; such efforts can only be achieved if appropriate resources are available. A further vital consideration in allowing rapid building of new applications is that of interoperability and reuse. As a step towards this, several annotated corpora have been made UIMA-compliant, and are available in the U-Compare system, which allows easy construction of NLP workflows and evaluation against gold standard corpora.

** The programme features an invited talk by Prof. Jun'ichi Tsujii, Microsoft Research Asia **

For further details, including the full programme for the workshop, please see the workshop website.
