Supporting Evidence-based Public Health Interventions using Text Mining


This project aims to conduct novel research in text mining and machine learning to transform the way in which evidence-based public health (EBPH) reviews are conducted. The project is a collaboration between three institutions:

  • The National Centre for Text Mining (NaCTeM), with its proven track record of developing effective text mining tools operating in a variety of domains.
  • The Machine Learning and Data Analytics (MaLDA) group at the University of Liverpool, specialising in the application of machine learning, data mining and general mathematical modelling and optimisation methodologies to complex real-world problems.
  • The National Institute for Health and Care Excellence (NICE), the world's leading centre for the development and application of the principles of evidence-based medicine to technology appraisal, clinical guidelines and public health.

Project goals

  • to develop new unsupervised text mining methods for deriving term similarities, based on distributional semantics, to produce meaningful and high-quality document and label clusters that support screening while searching in EBPH reviews.
  • to develop new seriation algorithms for ranking and visualising meaningful associations of multiple types, dynamically and iteratively.
  • to evaluate these newly developed methods in EBPH reviews, based on implementation of a pilot, to ascertain the level of transformation in EBPH reviewing.
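As a rough illustration of the first goal, the sketch below derives term similarities from distributional information: each term is represented by the counts of the words that occur near it, and two terms are compared by the cosine of their context vectors. It is a minimal example under simplifying assumptions (whitespace tokenisation, a fixed context window, raw counts), not the method developed in the project; the function names `cooccurrence_vectors` and `cosine` and the sample sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(docs, window=2):
    """Map each term to a Counter of the terms seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for doc in docs:
        tokens = doc.lower().split()  # assumption: naive whitespace tokenisation
        for i, term in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[term][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus: "smoking" and "tobacco" appear in the same contexts.
docs = [
    "smoking cessation programme reduces risk",
    "tobacco cessation programme reduces risk",
]
vectors = cooccurrence_vectors(docs)
sim_related = cosine(vectors["smoking"], vectors["tobacco"])
sim_unrelated = cosine(vectors["smoking"], vectors["risk"])
```

In this toy corpus "smoking" and "tobacco" share identical contexts, so they score higher than "smoking" and "risk". Similarities of this kind could then feed a clustering step over documents or labels.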


RobotAnalyst is a new tool that builds upon state-of-the-art text mining technologies, including topic modelling and feedback-based text classification models, to minimise the human workload involved in the study identification phase.
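The idea of feedback-based screening can be sketched as follows: the reviewer's include/exclude decisions are used to re-rank the citations that have not yet been screened, so that likely relevant ones are seen first. The example is a deliberately simple word-count scorer written for illustration only; it is not RobotAnalyst's actual model, and the function name `rank_unscreened` and the sample citations are hypothetical.

```python
from collections import Counter


def tokenise(text):
    """Assumption: lower-cased whitespace tokens, de-duplicated per citation."""
    return set(text.lower().split())


def rank_unscreened(labelled, unscreened):
    """Rank unscreened citations using reviewer feedback.

    labelled: list of (text, included) pairs, where included is a bool.
    Returns the unscreened texts, highest score first.
    """
    included_counts, excluded_counts = Counter(), Counter()
    for text, included in labelled:
        target = included_counts if included else excluded_counts
        target.update(tokenise(text))

    def score(text):
        # Words seen in included citations add to the score,
        # words seen in excluded citations subtract from it.
        return sum(included_counts[t] - excluded_counts[t] for t in tokenise(text))

    return sorted(unscreened, key=score, reverse=True)


# Hypothetical feedback from a reviewer and two citations still to be screened.
labelled = [
    ("smoking cessation trial", True),
    ("cardiac surgery outcomes", False),
]
unscreened = ["surgery outcomes audit", "cessation trial in adults"]
ranked = rank_unscreened(labelled, unscreened)
```

Each new decision can be appended to `labelled` and the ranking recomputed, which is the feedback loop in miniature.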


13th July 2017

NaCTeM is organising two workshops at the Global Evidence Summit, to be held in Cape Town, South Africa from 13th - 16th Sept 2017. The workshops will be entitled "RobotAnalyst: an online system to support citation screening in evidence reviewing" and "Screening evidence for systematic reviews using a text mining system: the RobotAnalyst".

12th July 2017

John McNaught gave a talk at the Text and Data Mining Symposium, held at the University of Cambridge on Wednesday 12th July 2017.

22nd June 2017

Prof. Sophia Ananiadou gave an invited talk at the University of Cambridge on 23rd June 2017, as part of the PublicHealth@Cambridge series of seminars. The talk was entitled "Text mining for public health reviews (The Robot Analyst)".

5th October 2016

Prof. Ananiadou discussed the work carried out on this project during her participation in a panel session entitled "Evidence Synthesis - Current Practices and Future Possibilities", held as part of the IEEE International Conference on Healthcare Informatics (ICHI 2016), in Chicago, IL, USA.

6th June 2016

NaCTeM organised a workshop at the 24th Cochrane Colloquium, held in Seoul, Korea, from 23rd - 27th October 2016. The workshop was entitled "Text mining methods to support the development of sensitive search strategies in public health reviews", and was organised in collaboration with the Public Health and Social Care Centre at the National Institute for Health and Care Excellence (NICE).

20th May 2016

The project is mentioned in a new article about text mining and the work of NaCTeM, published in Pharma Technology Focus, a bi-monthly magazine that brings together the latest insights and innovations from across the pharmaceutical industry.

3rd October 2015

NaCTeM attended the 23rd Cochrane Colloquium, held in Vienna, Austria from 3rd - 7th October 2015, and co-organised a workshop session entitled "The present and future use of text mining for study identification" on 5th October.

Project information

The project is funded by the Medical Research Council for a period of 3 years, starting from 31st March 2014 (Grant No. MR/L01078X/1).

Project team

Principal Investigator: Prof. Sophia Ananiadou (NaCTeM)

Co-Investigators: Mr. John McNaught (NaCTeM), Dr. John Goulermas (MaLDA).

Researchers: Dr. Austin Brockmeier (NaCTeM), Dr. Piotr Przybyla

