The Intute Project
The Intute project, co-funded by JISC (Joint Information Systems Committee) and AHRC (Arts and Humanities Research Council), is a collaboration between NaCTeM, MIMAS and the Intute Repository Search Project. Its aim is to develop an intelligent semantic search service using NaCTeM's text mining tools. The service will let users search an enhanced subset of the Intute repository, a collection of academic and technical reports under the domain headings Biomedical Science and Social Science.
In particular, the Intute project pursues four directions for improving the current search capability of Intute Repository Search:
- Enhancing the metadata using text mining technologies;
- Applying the technique(s) of text clustering/classification in the search system;
- Developing improved technique(s) for query expansion; and
- Incorporating personalisation into the search system.
Duration: May 1st, 2008 to April 30th, 2009
Principal Investigator: Dr. Sophia Ananiadou
Project Team (NaCTeM): Scott Piao and Brian Rea
Project Timetable
Project Flowchart
Project Documentation (Progress Reports & Presentations)
Progress of Project
1) Tools have been developed for indexing documents using both the metadata provided by UKOLN and additional metadata generated by processing full texts. In particular, the Genia POS tagger and the Termine term extractor are integrated into the indexing package to extract terms from abstracts and from PDF full-text documents (where available via the metadata) for indexing. A sample index of over 197,000 documents, including about 3,500 full texts, has been created.
2) A demonstrator semantic document search package has been developed. It implements advanced document search functions such as real-time clustering of retrieved documents using the Carrot2 package, term-based searching for similar and topic-sharing documents, and complex query building. In addition, the Aduna visualisation package has been integrated to show the relationships between topics graphically.
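The idea behind term-based indexing and similar-document search can be illustrated with a minimal Python sketch. It is not the project's implementation: `extract_terms` is a hypothetical stand-in for a real term extractor such as Termine, the `keywords` and `abstract` field names are assumed for illustration, and Jaccard overlap is only one possible similarity measure.

```python
from collections import defaultdict

# Assumed stopword list, for illustration only.
STOPWORDS = frozenset({"the", "of", "a", "and", "in", "for", "to", "on"})


def extract_terms(text):
    """Hypothetical stand-in for a term extractor: lowercase word tokens minus stopwords."""
    tokens = (t.strip(".,;:()").lower() for t in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def build_index(docs):
    """Build an inverted index (term -> document ids) from metadata and extracted terms."""
    index = defaultdict(set)
    doc_terms = {}
    for doc_id, doc in docs.items():
        # Combine supplied metadata keywords with terms extracted from the abstract.
        terms = {k.lower() for k in doc.get("keywords", [])}
        terms |= extract_terms(doc.get("abstract", ""))
        doc_terms[doc_id] = terms
        for term in terms:
            index[term].add(doc_id)
    return index, doc_terms


def similar_documents(doc_id, index, doc_terms, top_n=5):
    """Rank other documents by Jaccard overlap of their term sets with doc_id's."""
    source = doc_terms[doc_id]
    # Only documents sharing at least one term can have a non-zero score.
    candidates = set()
    for term in source:
        candidates |= index[term]
    candidates.discard(doc_id)
    scored = []
    for other in candidates:
        shared = source & doc_terms[other]
        union = source | doc_terms[other]
        scored.append((other, len(shared) / len(union)))
    # Highest score first; ties broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]
```

Calling `build_index` on a dictionary of document records and then `similar_documents` with one document id returns the ids of the documents that share the most terms with it, together with their overlap scores.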
NaCTeM IRS Demo Site
Here is a video clip demonstrating the main functions of the NaCTeM IRS search demo site.
Click any of the screenshots below to access the demo site.
Figure 1: Simple search and cluster page:
Figure 2: Full document information page:
Figure 3: Document cluster visualisation page:
Figure 4: Complex query builder page: