ELRA Distribution Agreement signed for BioLexicon
2009-09-28
ELRA, together with the National Centre for Text Mining (NaCTeM, University of Manchester, UK), the European Bioinformatics Institute (EBI, Hinxton, UK), and Istituto di Linguistica Computazionale-Consiglio Nazionale Ricerche (ILC-CNR, Pisa, Italy), has signed a Language Resources distribution agreement for a large-scale English language terminological resource in the biomedical domain: BioLexicon.
Biological terminology is a frequent cause of analysis errors when processing literature written in the biology domain. This is due largely to the high degree of variation in term forms, to the frequent mismatches between the labels of controlled vocabularies and ontologies on the one hand and the forms actually occurring in text on the other, and to the lack of detailed formal information on the linguistic behaviour of domain terms. For example, "retro-regulate" is a terminological verb often used in molecular biology, but it is not included in conventional dictionaries. BioLexicon is a linguistic resource for the biology domain, tailored to cope with these problems. It contains information on:
- terminological nouns, including nominalised verbs and proper names (e.g., gene names)
- terminological adjectives
- terminological adverbs
- terminological verbs
- general English words frequently used in the biology domain
Existing information on terms was integrated, augmented, complemented and linked, through processing of massive amounts of biomedical text, to yield, inter alia, over 2.2M entries, together with information on over 1.8M variants and over 2M synonymy relations. Moreover, extensive information is provided on how verbs and nominalised verbs in the domain behave at both the syntactic and semantic levels, thus supporting applications that aim to discover relations and events involving biological entities in text.
This comprehensive coverage of biological terms makes BioLexicon a unique linguistic resource within the domain. It is primarily intended to support text mining and information retrieval in the biomedical domain; however, its standards-based structure and rich content make it a valuable resource for many other kinds of application.
More information
For information about the BioLexicon, including references and details of how to obtain it, please see our BioLexicon Page.