NaCTeM Software Tools
The National Centre for Text Mining bases its service systems on a number of text mining software tools.
- Part-of-speech (POS) taggers
- A part-of-speech tagger for English
- GENIA Tagger — Part-of-speech tagging for biomedical text (Web Service)
- Parsers
- Enju — A deep syntactic parser for English
- CFG Parser — A fast CFG parser for English
- GENIA Tagger — Shallow parsing for biomedical text (Web Service)
- Named entities/terms
- AnatomyTagger — an open-source entity mention tagger for anatomical entities
- Named-entity Recognizer — Part of the GENIA Tagger
- NEMine — Recognizes gene/protein names in text.
- Yeast MetaboliNER — Recognizes yeast metabolite names in text.
- ACELA — Tool for efficient annotation of named entities
- Smart dictionary lookup — machine learning-based gene/protein name lookup
- Smart Dictionary Lookup Tool Web Service — Looks up term variations of a given gene/protein name based on an automatically trained similarity measure
- Term Normalization Tool — Normalizes terms with string rewriting rules automatically generated based on a dictionary.
- DECA — A species disambiguation system for biological named entities
- RF-TermAlign — a bilingual dictionary extraction tool that uses a Random Forest method to learn string similarity of terms between a source and target language.
- Other tools
- APLenty — An annotation tool for creating high-quality sequence labelling datasets using active and proactive learning
- Paladin — A document classification annotation web application which supports active/proactive learning.
- RobotAnalyst — A tool to minimise the human workload involved in the study identification phase of systematic reviews.
- EventMine — A machine learning-based event extraction system.
- brat — A free, open-source, web-based tool for text annotation visualisation and editing.
- Cafetiere — An easy-to-use text mining system for carrying out text mining on your own document collection
- Sentence and paragraph breaker — An accurate sentence and paragraph detector based on heuristic rules
- Clinical Document Classification — automatic document classification demo
- Sentiment Analysis Tool — Analyses sentiment of input text.
Featured News
- Talk at Generative AI Summit
- Talk at Open Data Science Conference (ODSC)
- BioLaySumm 2023 - Shared Task @ BioNLP 2023
- Prof. Ananiadou appointed as Senior Area Chair for ACL 2023
- Recent funding successes for Prof. Sophia Ananiadou
- Junichi Tsujii awarded Order of the Sacred Treasure, Gold Rays with Neck Ribbon
Other News & Events
- Prof. Ananiadou gives talk as part of Women in AI speaker series
- New Knowledge Transfer Partnership with 10BE5
- Keynote Talk at the Festival of AI
- New article on using neural architectures to aggregate sequence labels from multiple annotators
- New article on improving biomedical extractive summarisation using domain knowledge