Thalia (Text mining for Highlighting, Aggregating and Linking Information in Articles) is a semantic search engine that recognises concepts occurring in biomedical abstracts indexed in PubMed. It currently recognises eight types of concepts: chemicals, diseases, drugs, genes, metabolites, proteins, species and anatomical entities.

User interface

Thalia is available through a web-based user interface at the following address: More information on the system can be obtained by reading the manual or watching a short demonstration video:


Thalia can also be queried through a RESTful API. For more information, read the API manual. The queries should be sent to the following address:
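
As a rough illustration of querying a RESTful service of this kind, the following Python sketch builds a GET request URL for a search term. The base address and the parameter names (`query`, `size`) are hypothetical placeholders, not taken from the Thalia API manual; the real endpoint and parameters should be taken from that manual.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: replace with the address given in the API manual.
BASE_URL = "https://example.org/thalia/api/search"


def build_query_url(term, base_url=BASE_URL, **extra_params):
    """Return a GET URL for a search term.

    The parameter name "query" and any extra parameters are assumptions
    made for illustration; they are not confirmed by the Thalia API manual.
    """
    params = {"query": term, **extra_params}
    # urlencode percent-encodes values and joins them as key=value pairs.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Build (but do not send) a request URL for an example search term.
    print(build_query_url("breast cancer", size=10))
```

The resulting URL could then be fetched with any HTTP client. Keeping URL construction separate from the request itself makes it easy to swap in the real address and parameter names once they are known.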

Project Team

Principal Investigator: Sophia Ananiadou
Researchers: Axel Soto, Piotr Przybyła


This work has been supported by the BBSRC project "Enriching Metabolic PATHwaY models with evidence from the literature" (EMPATHY) [Grant ID: BB/M006891/1] and by the Manchester Molecular Pathology Innovation Centre (MMPathIC) [Grant ID: MR/N00583X/1].