Seminar — Valia Kordoni
Speaker: | Dr. Valia Kordoni, German Research Centre for Artificial Intelligence (DFKI GmbH) and Dept. of Computational Linguistics, Saarland University |
Title: | Linguistic Knowledge Mining for Improving Performance of Lexicalised Grammars |
Date: | Thursday 8th July 2010 at 12:00 |
Location: | LG.010 in the MIB Building |
Abstract: | In the first part of the talk I focus on the linguistic properties of Multiword Expressions (MWEs), taking a closer look at their lexical, syntactic, and semantic characteristics. In the second part I focus on methods for the automatic acquisition of MWEs for robust grammar engineering. First, I investigate the hypothesis that MWEs, regardless of their type, can be detected by the distinct statistical properties of their component words, comparing various statistical measures, a comparison that yields interesting conclusions. I then investigate the influence of the size and quality of different corpora, using the BNC and the Web search engines Google and Yahoo. I conclude that, in terms of language usage, Web-generated corpora are fairly similar to more carefully built corpora such as the BNC, indicating that their lack of control and balance is probably compensated for by their size. I also present a qualitative evaluation of the results of automatically adding extracted MWEs to existing linguistic resources, and argue that the quality of this process improves if a more compositional approach to the automated extension of the grammar and lexicon is adopted. Finally, I conclude with a brief presentation of an innovative chart mining technique that further enhances the performance of the grammar by supporting a lexical acquisition model developed for a very challenging type of MWE, namely English verb-particle constructions. The technique operates over unlexicalised features mined from a partial parsing chart, and is shown to outperform a state-of-the-art parser on the target task despite being based on relatively simple features. |